dc.contributor.author: Popovic, Maja
dc.contributor.author: Arcan, Mihael
dc.date.accessioned: 2019-02-05T14:33:18Z
dc.date.available: 2019-02-05T14:33:18Z
dc.date.issued: 2015-05-11
dc.identifier.citation: Popovic, Maja, & Arcan, Mihael. (2015). Identifying main obstacles for statistical machine translation of morphologically rich South Slavic languages. Paper presented at the 18th Annual Conference of the European Association for Machine Translation (EAMT2015), Antalya, Turkey, 11-13 May. [en_IE]
dc.identifier.uri: http://hdl.handle.net/10379/14907
dc.description.abstract: The best way to improve a statistical machine translation system is to identify concrete problems causing translation errors and address them. Many of these problems are related to the characteristics of the involved languages and differences between them. This work explores the main obstacles for statistical machine translation systems involving two morphologically rich and under-resourced languages, namely Serbian and Slovenian. Systems are trained for translations from and into English and German using parallel texts from different domains, including both written and spoken language. It is shown that for all translation directions structural properties concerning multi-noun collocations and exact phrase boundaries are the most difficult for the systems, followed by negation, preposition and local word order differences. For translation into English and German, articles and pronouns are the most problematic, as well as disambiguation of certain frequent functional words. For translation into Serbian and Slovenian, cases and verb inflections are most difficult. In addition, local word order involving verbs is often incorrect and verb parts are often missing, especially when translating from German. [en_IE]
dc.description.sponsorship: This publication has emanated from research supported by the QT21 project – European Union’s Horizon 2020 research and innovation programme under grant number 645452 as well as a research grant from Science Foundation Ireland (SFI) under grant number SFI/12/RC/2289. We are grateful to the anonymous reviewers for their valuable feedback. [en_IE]
dc.format: application/pdf [en_IE]
dc.language.iso: en [en_IE]
dc.publisher: European Association for Machine Translation [en_IE]
dc.relation.ispartof: 18th Annual Conference of the European Association for Machine Translation (EAMT-15) [en]
dc.subject: Statistical machine translation [en_IE]
dc.subject: South Slavic languages [en_IE]
dc.subject: Obstacles [en_IE]
dc.title: Identifying main obstacles for statistical machine translation of morphologically rich South Slavic languages [en_IE]
dc.type: Conference Paper [en_IE]
dc.date.updated: 2019-01-23T17:55:58Z
dc.local.publishedsource: https://aclanthology.info/papers/W15-4913/w15-4913 [en_IE]
dc.description.peer-reviewed: non-peer-reviewed
dc.contributor.funder: Horizon 2020 [en_IE]
dc.contributor.funder: Science Foundation Ireland [en_IE]
dc.internal.rssid: 13192034
dc.local.contact: Mihael Arcan. Email: mihael.arcan@insight-centre.org
dc.local.copyrightchecked: Yes
dc.local.version: PUBLISHED
dcterms.project: info:eu-repo/grantAgreement/EC/H2020::RIA/645452/EU/QT21: Quality Translation 21/QT21 [en_IE]
dcterms.project: info:eu-repo/grantAgreement/SFI/SFI Research Centres/12/RC/2289/IE/INSIGHT - Irelands Big Data and Analytics Research Centre/ [en_IE]
nui.item.downloads: 6


Attribution-NonCommercial-NoDerivs 3.0 Ireland
This item is available under the Attribution-NonCommercial-NoDerivs 3.0 Ireland license. No item may be reproduced for commercial purposes. Please refer to the publisher's URL where this is made available, or to notes contained in the item itself. Other terms may apply.
