Multilingual multimodal machine translation for Dravidian languages utilizing phonetic transcription
Date
2019-08-19
Author
Chakravarthi, Bharathi Raja
Priyadharshini, Ruba
Stearns, Bernardo
Jayapal, Arun
Sridevy, S.
Arcan, Mihael
Zarrouk, Manel
McCrae, John P.
Usage
This item's downloads: 287
Recommended Citation
Chakravarthi, Bharathi Raja, Priyadharshini, Ruba, Stearns, Bernardo, Jayapal, Arun, Sridevy, S., Arcan, Mihael, Zarrouk, Manel, McCrae, John P. (2019). Multilingual multimodal machine translation for Dravidian languages utilizing phonetic transcription. Paper presented at LoResMT 2019: the 2nd Workshop on Technologies for MT of Low Resource Languages at MT Summit XVII, Dublin, Ireland, 19-23 August.
Abstract
Multimodal machine translation is the task of translating from a source text into the target language using information from other modalities. Existing multimodal datasets have been restricted to highly resourced languages. In addition, these datasets were collected by manually translating the English descriptions of the Flickr30K dataset. In this work, we introduce MMDravi, a multilingual multimodal dataset for under-resourced Dravidian languages. It comprises 30,000 sentences created utilizing several machine translation outputs. Using data from MMDravi and a phonetic transcription of the corpus, we build a Multilingual Multimodal Neural Machine Translation system (MMNMT) for closely related Dravidian languages to take advantage of the multilingual corpus and other modalities.
We evaluate the translations generated by the proposed approach against a human-annotated evaluation dataset in terms of the BLEU, METEOR, and TER metrics. Relying on multilingual corpora, phonetic transcription, and image features, our approach improves translation quality for the under-resourced languages.
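For the surface-level metrics, a minimal corpus-scoring sketch with the sacreBLEU package is shown below (an assumed tool choice; METEOR needs a separate scorer, such as the one in nltk, and the example strings are invented).

    # Minimal sketch: corpus-level BLEU and TER with sacreBLEU.
    # Requires: pip install sacrebleu
    import sacrebleu

    hypotheses = ["a dog runs across the field"]           # system outputs
    references = [["a dog is running through the field"]]  # one reference stream

    bleu = sacrebleu.corpus_bleu(hypotheses, references)
    ter = sacrebleu.corpus_ter(hypotheses, references)
    print(f"BLEU: {bleu.score:.2f}  TER: {ter.score:.2f}")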