dc.contributor.author | Chatterjee, Rajen | |
dc.contributor.author | Arcan, Mihael | |
dc.contributor.author | Negri, Matteo | |
dc.contributor.author | Turchi, Marco | |
dc.date.accessioned | 2019-01-30T15:29:34Z | |
dc.date.available | 2019-01-30T15:29:34Z | |
dc.date.issued | 2016 | |
dc.identifier.citation | Chatterjee, Rajen, Arcan, Mihael, Negri, Matteo, & Turchi, Marco. (2016). Instance selection for online automatic post-editing in a multi-domain scenario. Paper presented at the Twelfth Biennial Conference of the Association for Machine Translation in the Americas (AMTA 2016), Austin, Texas, 28 October - 01 November. | en_IE |
dc.identifier.uri | http://hdl.handle.net/10379/14890 | |
dc.description.abstract | In recent years, several end-to-end online translation systems have been proposed to successfully incorporate human post-editing feedback in the translation workflow. The performance of these systems in a multi-domain translation environment (involving different text genres, post-editing styles, machine translation systems) within the automatic post-editing (APE) task has not yet been thoroughly investigated. In this work, we show that, when used in the APE framework, the existing online systems are not robust to domain changes in the incoming data stream. In particular, these systems lack the capability to learn and use domain-specific post-editing rules from a pool of multi-domain data sets. To cope with this problem, we propose an online learning framework that generates more reliable translations of significantly better quality than the existing online and batch systems. Our framework includes: i) an instance selection technique based on information retrieval that helps to build domain-specific APE systems, and ii) an optimization procedure to tune the feature weights of the log-linear model that allows the decoder to improve the post-editing quality. | en_IE |
dc.description.sponsorship | This work has been partially supported by the EC-funded H2020 project QT21 (grant agreement no. 645452), and by the Science Foundation Ireland research grant (no. SFI/12/RC/2289). | en_IE |
dc.format | application/pdf | en_IE |
dc.language.iso | en | en_IE |
dc.publisher | Association for Machine Translation in the Americas | en_IE |
dc.relation.ispartof | The Twelfth Biennial Conference of the Association for Machine Translation in the Americas (AMTA 2016) | en |
dc.rights | Attribution-NonCommercial-NoDerivs 3.0 Ireland | |
dc.rights.uri | https://creativecommons.org/licenses/by-nc-nd/3.0/ie/ | |
dc.subject | Post-editing | en_IE |
dc.subject | Automatics | en_IE |
dc.subject | Multi-domain | en_IE |
dc.title | Instance selection for online automatic post-editing in a multi-domain scenario | en_IE |
dc.type | Conference Paper | en_IE |
dc.date.updated | 2019-01-23T17:43:13Z | |
dc.local.publishedsource | https://amtaweb.org/wp-content/uploads/2016/10/AMTA2016_Research_Proceedings_v7.pdf | en_IE |
dc.description.peer-reviewed | non-peer-reviewed | |
dc.contributor.funder | Horizon 2020 | en_IE |
dc.contributor.funder | Science Foundation Ireland | en_IE |
dc.internal.rssid | 13192047 | |
dc.local.contact | Mihael Arcan. Email: mihael.arcan@insight-centre.org | |
dc.local.copyrightchecked | Yes | |
dc.local.version | PUBLISHED | |
dcterms.project | info:eu-repo/grantAgreement/EC/H2020::RIA/645452/EU/QT21: Quality Translation 21/QT21 | en_IE |
dcterms.project | info:eu-repo/grantAgreement/SFI/SFI Research Centres/12/RC/2289/IE/INSIGHT - Irelands Big Data and Analytics Research Centre/ | en_IE |
nui.item.downloads | 49 | |