Multilingual multimodal machine translation for Dravidian languages utilizing phonetic transcription

Chakravarthi, Bharathi Raja; Priyadharshini, Ruba; Stearns, Bernardo; Jayapal, Arun; Sridevy, S.; Arcan, Mihael; Zarrouk, Manel; McCrae, John P.

dc.contributor.author	Chakravarthi, Bharathi Raja
dc.contributor.author	Priyadharshini, Ruba
dc.contributor.author	Stearns, Bernardo
dc.contributor.author	Jayapal, Arun
dc.contributor.author	Sridevy, S.
dc.contributor.author	Arcan, Mihael
dc.contributor.author	Zarrouk, Manel
dc.contributor.author	McCrae, John P.
dc.date.accessioned	2019-09-10T11:12:28Z
dc.date.available	2019-09-10T11:12:28Z
dc.date.issued	2019-08-19
dc.identifier.citation	Chakravarthi, Bharathi Raja, Priyadharshini, Ruba, Stearns, Bernardo, Jayapal, Arun, Sridevy, S., Arcan, Mihael, Zarrouk, Manel, McCrae, John P. (2019). Multilingual multimodal machine translation for Dravidian languages utilizing phonetic transcription. Paper presented at LoResMT 2019: 2nd Workshop on Technologies for MT of Low Resource Languages (LoResMT 2019 at MT Summit XVII), Dublin, Ireland, 19-23 August.	en_IE
dc.identifier.uri	http://hdl.handle.net/10379/15415
dc.description.abstract	Multimodal machine translation is the task of translating from a source text into the target language using information from other modalities. Existing multimodal datasets have been restricted to highly resourced languages. In addition, these datasets were collected by manual translation of English descriptions from the Flickr30K dataset. In this work, we introduce MMDravi, a Multilingual Multimodal dataset for under-resourced Dravidian languages. It comprises 30,000 sentences which were created utilizing several machine translation outputs. Using data from MMDravi and a phonetic transcription of the corpus, we build a Multilingual Multimodal Neural Machine Translation system (MMNMT) for closely related Dravidian languages to take advantage of multilingual corpora and other modalities. We evaluate the translations generated by the proposed approach against a human-annotated evaluation dataset in terms of BLEU, METEOR, and TER metrics. Relying on multilingual corpora, phonetic transcription, and image features, our approach improves the translation quality for the under-resourced languages.	en_IE
dc.description.sponsorship	This work is supported by a research grant from Science Foundation Ireland, co-funded by the European Regional Development Fund, for the Insight Centre under Grant Number SFI/12/RC/2289 and the European Union’s Horizon 2020 research and innovation programme under grant agreement No 731015, ELEXIS - European Lexical Infrastructure and grant agreement No 825182, Prêt-à-LLOD.	en_IE
dc.format	application/pdf	en_IE
dc.language.iso	en	en_IE
dc.publisher	European Association for Machine Translation	en_IE
dc.relation.ispartof	Proceedings of the 2nd Workshop on Technologies for MT of Low Resource Languages	en
dc.rights	Attribution-NonCommercial-NoDerivs 3.0 Ireland
dc.rights.uri	https://creativecommons.org/licenses/by-nc-nd/3.0/ie/
dc.subject	Machine translation	en_IE
dc.subject	Dravidian languages	en_IE
dc.subject	Phonetic transcription	en_IE
dc.title	Multilingual multimodal machine translation for Dravidian languages utilizing phonetic transcription	en_IE
dc.type	Workshop paper	en_IE
dc.date.updated	2019-08-29T08:13:58Z
dc.local.publishedsource	https://www.mtsummit2019.com/workshops	en_IE
dc.description.peer-reviewed	non-peer-reviewed
dc.contributor.funder	Science Foundation Ireland	en_IE
dc.contributor.funder	European Regional Development Fund	en_IE
dc.contributor.funder	Horizon 2020	en_IE
dc.internal.rssid	17436610
dc.local.contact	Bharathi Raja Asoka Chakravarthi, Insight Centre For Data Analytics, National University Of Ireland Galway. Email: b.asokachakravarthi1@nuigalway.ie
dc.local.copyrightchecked	Yes
dc.local.version	ACCEPTED
dcterms.project	info:eu-repo/grantAgreement/SFI/SFI Research Centres/12/RC/2289/IE/INSIGHT - Irelands Big Data and Analytics Research Centre/	en_IE
dcterms.project	info:eu-repo/grantAgreement/EC/H2020::RIA/731015/EU/European Lexicographic Infrastructure/ELEXIS	en_IE
dcterms.project	info:eu-repo/grantAgreement/EC/H2020::RIA/825182/EU/Ready-to-use Multilingual Linked Language Data for Knowledge Services across Sectors/Pret-a-LLOD	en_IE
nui.item.downloads	287

Files in this item

Name: license.txt
Size: 5.659Kb
Format: Text file

Name: LoresMT.pdf
Size: 666.4Kb
Format: PDF

This item appears in the following Collection(s)

Data Science Institute (Workshop Papers)

Except where otherwise noted, this item's license is described as Attribution-NonCommercial-NoDerivs 3.0 Ireland