dc.contributor.advisor: Hayes, Conor
dc.contributor.author: Hulpus, Ioana
dc.description.abstract: We rely more and more on machines to organise, analyse and summarise the vast amount of digital text that is being produced at an unprecedented rate. At the same time, structured knowledge that is understandable by both humans and machines is becoming increasingly available. Integrating unstructured text with structured knowledge is crucial for availing of the knowledge contained in text. The research questions that we tackle in this thesis are essential for understanding how applications can effectively link text elements to external background knowledge, and how this background knowledge can assist humans in the interpretation of vast text collections. Towards this goal, this thesis deals primarily with two core problems: word-sense disambiguation and topic labelling. Word-sense disambiguation is a fundamental problem for most systems that need to integrate text and background knowledge. In this thesis, we investigate two scenarios for word-sense disambiguation. The first scenario focuses on disambiguation with multiple sense inventories simultaneously, and has not been addressed before. We tackle this problem by proposing a versatile disambiguation approach that requires only a short textual definition of each word sense. The second scenario addresses word-sense disambiguation with a given semantic graph, DBpedia. We propose a new disambiguation algorithm that relies solely on graph proximity to solve this problem. The novelty is that no previous work has taken a semantic graph approach to disambiguation with DBpedia. The second core problem this thesis tackles is topic labelling. Topic labelling is necessary for displaying text mining results in a human-interpretable way. Broadly, its goal is to find a phrase that captures the essence (gist) behind a group of related words (topic). Our approach exploits the structure of the semantic graph of DBpedia in order to solve this problem. The unifying high-level hypothesis behind our research is that structural properties of concepts reveal their semantic properties. All our findings show a substantial correspondence between distributional semantics and the semantics captured in the structure of semantic networks. This opens new opportunities for integrating the knowledge extracted from text through text mining with background knowledge, as well as for leveraging the benefits of this integration. Throughout this thesis we evaluate our proposed methods through user studies, compare their performance to related work and discuss our findings. (en_US)
dc.rights: Attribution-NonCommercial-NoDerivs 3.0 Ireland
dc.subject: Topic labelling (en_US)
dc.subject: Word sense disambiguation (en_US)
dc.subject: Semantic network (en_US)
dc.subject: Graph analysis (en_US)
dc.title: Semantic network analysis for unsupervised topic linking and labelling (en_US)
dc.contributor.funder: Science Foundation Ireland (en_US)
dc.local.note: This thesis researches methods for automatic recognition of the meaning (semantics) of words, as well as of groups of words. The main challenge in this regard is 'polysemy', a property of natural language whereby words can have different meanings in different contexts. Humans have no problem grasping the contextual meaning of a word, but computers must use complex algorithms, in a process called word-sense disambiguation. Parts of this thesis deal with this process. Similarly, different words can refer to the same topic, and humans have no problem identifying this topic. This thesis also researches how this topic identification and labelling can be achieved automatically by machines. (en_US)

Except where otherwise noted, this item's license is described as Attribution-NonCommercial-NoDerivs 3.0 Ireland