Named Entity Recognition for Code-Mixed Indian Corpus using Meta Embedding

Authors: 

Ruba Priyadharshini, Bharathi Raja, Mani Vegupatti, John McCrae

Publication Type: 
Refereed Conference Meeting Proceeding
Abstract: 
In this paper, we combine pre-trained embeddings, sub-word embeddings, and embeddings of languages closely related to those in the code-mixed corpus to create a meta-embedding. We then use a Transformer to encode the code-mixed sentence and a Conditional Random Field to predict the named entities in the code-mixed text. In contrast to classical named entity recognition, where the text is monolingual, our approach can predict named entities in a code-mixed corpus written in both the native script and Roman script. Our method is a novel way of combining the embeddings of closely related languages to identify named entities in code-mixed Indian text written in native and Roman scripts on social media.
Conference Name: 
2020 6th International Conference on Advanced Computing and Communication Systems (ICACCS)
Proceedings: 
2020 6th International Conference on Advanced Computing and Communication Systems (ICACCS)
Digital Object Identifier (DOI): 
https://doi.org/10.1109/ICACCS48705.2020.9074379
Publication Date: 
06/03/2020
Pages: 
68-72
Conference Location: 
India
Research Group: 
Institution: 
National University of Ireland, Galway (NUIG)
Open access repository: 
Yes
Publication document: