Poor man’s lemmatisation for automatic error classification

Popovic, Maja; Arcan, Mihael; Avramidis, Eleftherios; Burchardt, Aljoscha; Lommel, Arle

dc.contributor.author	Popovic, Maja
dc.contributor.author	Arcan, Mihael
dc.contributor.author	Avramidis, Eleftherios
dc.contributor.author	Burchardt, Aljoscha
dc.contributor.author	Lommel, Arle
dc.date.accessioned	2019-02-04T15:33:29Z
dc.date.available	2019-02-04T15:33:29Z
dc.date.issued	2015-04-11
dc.identifier.citation	Popovic, Maja, Arcan, Mihael, Avramidis, Eleftherios, Burchardt, Aljoscha, & Lommel, Arle. (2015). Poor man’s lemmatisation for automatic error classification. Paper presented at the 18th Annual Conference of the European Association for Machine Translation (EAMT2015), Antalya, Turkey, 11-13 May.	en_IE
dc.identifier.uri	http://hdl.handle.net/10379/14902
dc.description.abstract	This paper demonstrates that an existing automatic error classifier for machine translation output can be made independent of lemmatisation. This makes it usable for smaller and under-resourced languages and in situations where no lemmatiser is at hand. It is shown that truncating all words to their first four letters is the best method even for highly inflective languages, preserving the detected distribution of error types both within a translation output and across various translation outputs. The main cost of not using a lemmatiser is lower accuracy in detecting the inflectional error class, due to its confusion with mistranslations: for shorter words, actual inflectional errors are tagged as mistranslations; for longer words, the reverse happens. With that in mind, it is possible to use the error classifier without target-language lemmatisation and to extrapolate inflectional and lexical error rates according to the average word length in the analysed text.	en_IE
dc.description.sponsorship	This publication has emanated from research supported by the QTLEAP project – EC's FP7 (FP7/2007-2013) under grant agreement number 610516: “QTLEAP: Quality Translation by Deep Language Engineering Approaches” and by a research grant from Science Foundation Ireland (SFI) under Grant Number SFI/12/RC/2289. We are grateful to the reviewers for their valuable feedback.	en_IE
dc.language.iso	en	en_IE
dc.publisher	European Association for Machine Translation	en_IE
dc.relation.ispartof	European Association for Machine Translation (EAMT-2015)	en
dc.rights	Attribution-NonCommercial-NoDerivs 3.0 Ireland
dc.rights.uri	https://creativecommons.org/licenses/by-nc-nd/3.0/ie/
dc.subject	Lemmatisation	en_IE
dc.subject	Error classification	en_IE
dc.title	Poor man’s lemmatisation for automatic error classification	en_IE
dc.type	Conference Paper	en_IE
dc.date.updated	2019-01-23T17:53:30Z
dc.local.publishedsource	https://aclanthology.info/papers/W15-4914/w15-4914	en_IE
dc.description.peer-reviewed	non-peer-reviewed
dc.contributor.funder	Seventh Framework Programme	en_IE
dc.contributor.funder	Science Foundation Ireland	en_IE
dc.internal.rssid	13192048
dc.local.contact	Mihael Arcan. Email: mihael.arcan@insight-centre.org
dc.local.copyrightchecked	Yes
dc.local.version	PUBLISHED
dcterms.project	info:eu-repo/grantAgreement/EC/FP7::SP1::ICT/610516/EU/Quality Translation by Deep Language Engineering Approaches/QTLEAP	en_IE
dcterms.project	info:eu-repo/grantAgreement/SFI/SFI Research Centres/12/RC/2289/IE/INSIGHT - Irelands Big Data and Analytics Research Centre/	en_IE
nui.item.downloads	78
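
The abstract's central idea, replacing lemmas with the first four letters of each word, can be illustrated with a minimal Python sketch. This is not the authors' classifier: the function names `truncate` and `label_pair`, and the simplified comparison of one hypothesis word against one already-aligned reference word, are assumptions made purely for illustration.

```python
def truncate(word: str, n: int = 4) -> str:
    """Stand-in for a lemma: the lower-cased first n letters of the word."""
    return word.lower()[:n]


def label_pair(hyp_word: str, ref_word: str, n: int = 4) -> str:
    """Label one aligned hypothesis/reference word pair (hypothetical helper).

    - identical full forms           -> "correct"
    - different forms, same prefix   -> "inflectional"
    - different forms and prefixes   -> "lexical" (mistranslation)
    """
    hyp, ref = hyp_word.lower(), ref_word.lower()
    if hyp == ref:
        return "correct"
    if truncate(hyp, n) == truncate(ref, n):
        return "inflectional"
    return "lexical"


if __name__ == "__main__":
    pairs = [
        ("houses", "house"),               # genuine inflectional difference
        ("went", "go"),                    # short words: inflection missed
        ("international", "interesting"),  # long words: shared prefix only
    ]
    for hyp, ref in pairs:
        print(hyp, ref, label_pair(hyp, ref))
```

The last two pairs show the confusion the abstract describes: a short irregular form is labelled as a mistranslation, while two unrelated long words that happen to share four letters are labelled as an inflectional error.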

Files in this item

Name: license.txt
Size: 5.659Kb
Format: Text file

Name: final-lemmas.pdf
Size: 147.1Kb
Format: PDF

This item appears in the following Collection(s)

Data Science Institute (Conference Papers)

Except where otherwise noted, this item's license is described as Attribution-NonCommercial-NoDerivs 3.0 Ireland