Identifying Poorly-Defined Concepts in WordNet with Graph Metrics

Publication Type: 
Refereed Conference Meeting Proceeding
Princeton WordNet is the most widely used lexical resource in natural language processing and continues to provide a gold-standard model of semantics. However, the resource still has significant quality issues, and these affect the performance of all NLP systems built on it. One major issue is that many nodes are insufficiently defined, and new links need to be added to improve NLP performance. We combine graph-based metrics with measures of ambiguity to predict which synsets are difficult for word sense disambiguation, a major NLP task that depends on good lexical information. We show that this method allows us to find poorly defined nodes with 89.9% precision, which would help manual annotators focus on improving the parts of the WordNet graph most in need.
Conference Name: 
Proceedings of the First Workshop on Knowledge Extraction and Knowledge Integration (KEKI-2016)
Digital Object Identifier (DOI): 
Publication Date: 
Conference Location: 
Research Group: 
National University of Ireland, Galway (NUIG)
Open access repository: 
Publication document: