SemanTex: semantic text exploration using document links implied by conceptual networks extracted from the texts
Suad Aldarra, Emir Muñoz, Pierre-Yves Vandenbussche, and Vít Nováček. 2014. SemanTex: semantic text exploration using document links implied by conceptual networks extracted from the text. In Proceedings of the 2014 International Conference on Posters & Demonstrations Track - Volume 1272 (ISWC-PD'14), Matthew Horridge, Marco Rospocher, and Jacco Van Ossenbruggen (Eds.), Vol. 1272. CEUR-WS.org, Aachen, Germany, 345-348.
Despite advances in digital document processing, exploring implicit relationships within large amounts of textual resources can still be daunting. This is partly due to the ‘black-box’ nature of most current methods for computing links (i.e., similarities) between documents (cf.  and ). The methods are mostly based on numeric computational models such as vector spaces or probabilistic classifiers. Such models may perform well according to standard IR evaluation methodologies, but can be sub-optimal in applications aimed at end users because the results and their provenance are difficult to interpret [3, 1]. Our Semantic Text Exploration prototype (abbreviated as SemanTex) aims at finding implicit links within a corpus of textual resources (such as articles or web pages) and exposing them to users in an intuitive front-end. We discover the links by: (1) finding concepts that are important in the corpus; (2) computing relationships between the concepts; (3) using the relationships to find links between the texts. The links are annotated with the concepts from which the particular connection was computed. Apart from presenting the links to human users for manual exploration in the SemanTex interfaces, we are working on representing the semantically annotated links between textual documents in RDF and exposing the resulting datasets for particular domains (such as PubMed or New York Times articles) as part of the Linked Open Data cloud.
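The three steps above can be illustrated with a toy sketch. It is not the SemanTex implementation: the abstract does not say how concepts or relationships are computed, so the sketch assumes document frequency as the importance measure and co-occurrence within a document as the relationship. All function names and the sample corpus are hypothetical.

```python
from collections import Counter
from itertools import combinations


def terms(text):
    """Lower-cased whitespace tokens of a text, as a set."""
    return set(text.lower().split())


def important_concepts(docs, min_docs=2):
    """Step 1 (assumed heuristic): terms occurring in at least min_docs documents."""
    doc_freq = Counter()
    for text in docs.values():
        doc_freq.update(terms(text))  # each term counted once per document
    return {t for t, n in doc_freq.items() if n >= min_docs}


def concept_relations(docs, concepts):
    """Step 2 (assumed heuristic): two concepts are related if they co-occur in a document."""
    related = {c: {c} for c in concepts}  # every concept is related to itself
    for text in docs.values():
        present = concepts & terms(text)
        for a, b in combinations(present, 2):
            related[a].add(b)
            related[b].add(a)
    return related


def document_links(docs, concepts, related):
    """Step 3: link two documents through related concepts and keep the
    concept pairs as the annotation explaining each link."""
    doc_concepts = {d: concepts & terms(t) for d, t in docs.items()}
    links = {}
    for d1, d2 in combinations(sorted(docs), 2):
        pairs = sorted(
            (c1, c2)
            for c1 in doc_concepts[d1]
            for c2 in doc_concepts[d2]
            if c2 in related[c1]
        )
        if pairs:
            links[(d1, d2)] = pairs
    return links


# Hypothetical three-document corpus.
docs = {
    "a": "aspirin reduces inflammation",
    "b": "inflammation drives arthritis",
    "c": "arthritis affects joints",
}
concepts = important_concepts(docs)
related = concept_relations(docs, concepts)
links = document_links(docs, concepts, related)

for pair, why in links.items():
    print(pair, why)
```

In this corpus, documents "a" and "c" share no term, yet they are linked through the pair (inflammation, arthritis), a relationship that is only established by document "b". The annotation on each link records which concepts produced it, which is the kind of provenance the abstract argues numeric similarity scores do not expose.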