dc.contributor.author | Monti, Johanna | |
dc.contributor.author | Sangati, Federico | |
dc.contributor.author | Arcan, Mihael | |
dc.date.accessioned | 2019-02-04T15:09:56Z | |
dc.date.available | 2019-02-04T15:09:56Z | |
dc.date.issued | 2015-12-03 | |
dc.identifier.citation | Monti, Johanna, Sangati, Federico, & Arcan, Mihael. (2015). TED-MWE: a bilingual parallel corpus with MWE annotation: Towards a methodology for annotating MWEs in parallel multilingual corpora. Paper presented at the Second Italian Conference on Computational Linguistics (CLiC-it 2015), Trento, Italy, 3-4 December. | en_IE |
dc.identifier.isbn | 9788899200008 | |
dc.identifier.uri | http://hdl.handle.net/10379/14901 | |
dc.description.abstract | The translation of multiword expressions (MWEs) by Machine Translation (MT) represents a big challenge, and although MT has considerably improved in recent years, MWE mistranslations still occur very frequently. There is a need to develop large data sets, mainly parallel corpora, annotated with MWEs, since they are useful both for statistical machine translation (SMT) training purposes and for MWE translation quality evaluation. This paper describes a methodology to annotate a parallel spoken corpus with MWEs. The dataset used for this experiment is an English-Italian corpus extracted from the TED spoken corpus and complemented by an SMT output. | en_IE |
dc.description.sponsorship | We gratefully acknowledge the PARSEME IC1207 COST Action for supporting this work. We are particularly grateful to Manuela Cherchi, Erika Ibba, Anna De Santis, Giuseppe Casu, Jessica Ladu, Ilaria Del Rio, Elisa Virdis, and Gino Castangia for their annotation work. | en_IE |
dc.format | application/pdf | en_IE |
dc.language.iso | en | en_IE |
dc.publisher | Accademia University Press | en_IE |
dc.relation.ispartof | Second Italian Conference on Computational Linguistics (CLiC-it 2015) | en |
dc.rights | Attribution-NonCommercial-NoDerivs 3.0 Ireland | |
dc.rights.uri | https://creativecommons.org/licenses/by-nc-nd/3.0/ie/ | |
dc.subject | TED-MWE | en_IE |
dc.subject | Bilingual parallel corpus | en_IE |
dc.subject | Multilingual | en_IE |
dc.title | TED-MWE: a bilingual parallel corpus with MWE annotation: Towards a methodology for annotating MWEs in parallel multilingual corpora | en_IE |
dc.type | Conference Paper | en_IE |
dc.date.updated | 2019-01-23T17:52:49Z | |
dc.identifier.doi | 10.4000/books.aaccademia.1514 | |
dc.local.publishedsource | https://dx.doi.org/10.4000/books.aaccademia.1514 | en_IE |
dc.description.peer-reviewed | non-peer-reviewed | |
dc.internal.rssid | 13192050 | |
dc.local.contact | Mihael Arcan. Email: mihael.arcan@insight-centre.org | |
dc.local.copyrightchecked | Yes | |
dc.local.version | PUBLISHED | |
nui.item.downloads | 86 | |