ARAN - Access to Research at NUI Galway

Visual Exploration of Text Collections

dc.contributor.advisor Handschuh, Siegfried
dc.contributor.author Thai, VinhTuan
dc.date.accessioned 2012-11-05T15:57:15Z
dc.date.available 2012-11-05T15:57:15Z
dc.date.issued 2012-10-05
dc.identifier.uri http://hdl.handle.net/10379/3028
dc.description.abstract Despite many technological advances, the information overload problem still prevails in many application areas. Users who are inundated with data find it challenging to explore the different facets of a complex information space, extract individual facts, and assemble them into a big picture that reveals various aspects of the data. Nevertheless, the availability of data should be embraced, not considered a threat to individuals and businesses alike. Because a substantial amount of valuable information resides within unstructured text data, there is a need to support users in the visual exploration of text collections so that they can obtain useful understandings that can be turned into worthwhile results. In this dissertation, we present our contributions in this area. We propose an approach to support users in exploring collections of text documents based on their interests and knowledge, which are represented by entities within an ontology. This ontology is used to drive the exploration and can be enriched in the process with newly discovered entities matching users' interests. Coordinated multiple views are used to visualize various aspects of text collections in relation to the set of entities of interest to users. To support faceted filtering of a large number of documents, we show how a multi-dimensional visualization can be employed as an alternative to the traditional linear listing of focus items. In this visualization, visual abstraction based on a combination of a conceptual structure and the structural equivalence of documents can be used to deal with a large number of items. Furthermore, the approach enables visual ordering based on the importance of facet values to support prioritized, cross-facet comparisons of focus items. We also report on an approach, based on the classic TileBars paradigm, to support users' comprehension of the distribution of entities within a document.
Our approach employs a simplified version of a matrix reordering technique, which is based on the barycenter heuristic for bigraph edge crossing minimization, to reorder elements of TileBars-based Entities Distribution Views to tackle the visual complexity problem. The resulting reordered views enable users to quickly and easily identify which entities appear in the beginning, the end, or throughout a document. Lastly, our work is also concerned with visual concordance analysis, which supports users in understanding how terms are used within a document by investigating their usage contexts. To abstract away the textual details and yet retain the core facets of a term's contexts for visualization, we employ a statistical topic modeling method to group together words that are thematically related. These groups are used to visualize the gist of a term's usage contexts in a visualization called Context Stamp. en_US
dc.subject Human computer interaction en_US
dc.subject Data visualisation en_US
dc.title Visual Exploration of Text Collections en_US
dc.type Thesis en_US
dc.contributor.funder Science Foundation Ireland (SFI) en_US
dc.local.note This dissertation aims to help users explore large numbers of text documents through intelligent user interfaces. The work reported in this thesis combines text analysis methods with data visualization techniques to let users quickly filter for relevant documents, as well as explore and compare the distribution and usage contexts of terms within documents. en_US
dc.local.final Yes en_US
