    • A comparative study of different state-of-the-art hate speech detection methods in Hindi-English code-mixed data 

      Rani, Priya; Suryawanshi, Shardul; Goswami, Koustava; Chakravarthi, Bharathi Raja; Fransen, Theodorus; McCrae, John P. (European Language Resources Association (ELRA), 2020-05-11)
      Hate speech detection in social media communication has become one of the primary concerns to avoid conflicts and curb undesired activities. In an environment where multilingual speakers switch among multiple languages, ...
    • Cross-lingual sentence embedding using multi-task learning 

      Goswami, Koustava; Dutta, Sourav; Assem, Haytham; Fransen, Theodorus; McCrae, John P. (Association for Computational Linguistics, 2021-11-07)
      Multilingual sentence embeddings capture rich semantic information not only for measuring similarity between texts but also for catering to a broad range of downstream cross-lingual NLP tasks. State-of-the-art multilingual ...
    • Towards an integrative approach for making sense distinctions 

      McCrae, John P.; Fransen, Theodorus; Ahmadi, Sina; Buitelaar, Paul; Goswami, Koustava (Frontiers Media, 2022-02-07)
      Word senses are the fundamental unit of description in lexicography, yet it is rarely the case that different dictionaries reach any agreement on the number and definition of senses in a language. With the recent rise in ...
    • ULD@NUIG at SemEval-2020 Task 9: Generative morphemes with an attention model for sentiment analysis in code-mixed text 

      Goswami, Koustava; Rani, Priya; Chakravarthi, Bharathi Raja; Fransen, Theodorus; McCrae, John P. (International Committee for Computational Linguistics, 2020)
      Code mixing is a common phenomenon in multilingual societies where people switch from one language to another for various reasons. Recent advances in public communication over different social media sites have led to an ...
    • Unsupervised deep language and dialect identification for short texts 

      Goswami, Koustava; Sarkar, Rajdeep; Chakravarthi, Bharathi Raja; Fransen, Theodorus; McCrae, John P. (International Committee on Computational Linguistics, 2020-12)
      Automatic Language Identification (LI) or Dialect Identification (DI) of short texts of closely related languages or dialects is one of the primary steps in many natural language processing pipelines. Language identification ...