Search

Now showing items 1-10 of 14

A sentiment analysis dataset for code-mixed Malayalam-English

Chakravarthi, Bharathi Raja; Jose, Navya; Suryawanshi, Shardul; Sherly, Elizabeth; McCrae, John P. (European Language Resources Association (ELRA), 2020-05-11)

There is an increasing demand for sentiment analysis of text from social media which are mostly code-mixed. Systems trained on monolingual data fail for code-mixed data due to the complexity of mixing at different levels ...

A comparative study of different state-of-the-art hate speech detection methods in Hindi-English code-mixed data

Rani, Priya; Suryawanshi, Shardul; Goswami, Koustava; Chakravarthi, Bharathi Raja; Fransen, Theodorus; McCrae, John P. (European Language Resources Association (ELRA), 2020-05-11)

Hate speech detection in social media communication has become one of the primary concerns to avoid conflicts and curb undesired activities. In an environment where multilingual speakers switch among multiple languages, ...

A survey of current datasets for code-switching research

Jose, Navya; Chakravarthi, Bharathi Raja; Suryawanshi, Shardul; Sherly, Elizabeth; McCrae, John P. (IEEE, 2020-03-06)

Code switching is a prevalent phenomenon in the multilingual community and social media interaction. In the past ten years, we have witnessed an explosion of code switched data in the social media that brings together ...

A multilingual evaluation dataset for monolingual word sense alignment

Ahmadi, Sina; McCrae, John P.; Nimb, Sanni; Khan, Fahad; Monachini, Monica; Pedersen, Bolette S.; Declerck, Thierry; Wissik, Tanja; Bellandi, Andrea; Pisani, Irene; Troelsgård, Thomas; Olsen, Sussi; Krek, Simon; Lipp, Veronika; Váradi, Tamás; Simon, László; Gyorffy, Andras; Tiberius, Carole; Schoonheim, Tanneke; Moshe, Yifat Ben; Rudich, Maya; Ahmad, Raya Abu; Lonke, Dorielle; Kovalenko, Kira; Langemets, Margit; Kallas, Jelena; Oksana, Dereza; Fransen, Theodorus; Cillessen, David; Lindemann, David; Alonso, Mikel; Salgado, Ana; Sancho, Jose Luis; Urena-Ruiz, Rafael-J.; Zamorano, Jordi Porta; Simov, Kiril; Osenova, Petya; Kancheva, Zara; Radev, Ivaylo; Stankovic, Ranka; Perdih, Andrej; Gabrovsek, Dejan (National University of Ireland Galway, 2020-05-16)

Aligning senses across resources and languages is a challenging task with beneficial applications in the field of natural language processing and electronic lexicography. In this paper, we describe our efforts in manually ...

A dataset for troll classification of Tamil memes

Chakravarthi, Bharathi Raja; Varma, Pranav; Arcan, Mihael; McCrae, John P.; Buitelaar, Paul; Shardul, Suryawanshi (European Language Resources Association (ELRA), 2020-05-11)

Social media are interactive platforms that facilitate the creation or sharing of information, ideas or other forms of expression among people. This exchange is not free from offensive, trolling or malicious contents ...

NUIG at TIAD: Combining unsupervised NLP and graph metrics for translation inference

McCrae, John P.; Arcan, Mihael (European Language Resources Association (ELRA), 2020-05-11)

In this paper, we present the NUIG system at the TIAD shard task. This system includes graph-based metrics calculated using novel algorithms, with an unsupervised document embedding tool called ONETA and an unsupervised ...

NUIG at TIAD 2021: Cross-lingual word embeddings for translation inference

Ahmadi, Sina; Ojha, Atul Kr.; Banerjee, Shubhanker; McCrae, John P. (2021-09-01)

Inducing new translation pairs across dictionaries is an important task that facilitates processing and maintaining lexicographical data. This paper describes our submissions to the Translation Inference Across Dictionaries ...

Towards automatic linking of lexicographic data: the case of a historical and a modern Danish dictionary

Ahmadi, Sina; Nimb, Sanni; McCrae, John P.; Sørensen, Nicolai H. (European Association for Lexicography, 2020)

Given the diversity of lexical-semantic resources, particularly dictionaries, integrating such resources by aligning various types of information is an important task, both in e-lexicography and natural language processing. ...

Challenges of word sense alignment: Portuguese language resources

Salgado, Ana; Ahmadi, Sina; Simões, Alberto; McCrae, John P.; Costa, Rute (National University of Ireland Galway, 2020-05-16)

This paper reports on an ongoing task of monolingual word sense alignment in which a comparative study between the Portuguese Academy of Sciences Dictionary and the Dicionario Aberto ´ is carried out in the context of the ...

Corpus creation for sentiment analysis in code-mixed Tamil-English text

Chakravarthi, Bharathi Raja; Muralidaran, Vigneshwaran; Priyadharshini, Ruba; McCrae, John P. (European Language Resources Association (ELRA), 2020-05-11)

Understanding the sentiment of a comment from a video or an image is an essential task in many applications. Sentiment analysis of a text can be useful for various decision-making processes. One such application is to ...