dc.contributor.advisor: Handschuh, Siegfried
dc.contributor.author: Thai, VinhTuan
dc.date.accessioned: 2012-11-05T15:57:15Z
dc.date.available: 2012-11-05T15:57:15Z
dc.date.issued: 2012-10-05
dc.identifier.uri: http://hdl.handle.net/10379/3028
dc.description.abstract: Despite many technological advances, the information overload problem still prevails in many application areas. It is challenging for users who are inundated with data to explore different facets of a complex information space to extract and put several pieces of facts together into a big picture that allows them to see various aspects of the data. Nevertheless, the availability of data should be embraced, not considered a threat for individuals and businesses alike. As a substantial amount of invaluable information to be explored resides within unstructured text data, there is a need to support users in visual exploration of text collections to obtain useful understandings that can be turned into worthwhile results. In this dissertation, we present our contributions in this area. We propose an approach to support users in exploring collections of text documents based on their interests and knowledge, which are represented by entities within an ontology. This ontology is used to drive the exploration and can be enriched with newly discovered entities matching users' interests in the process. Coordinated multiple views are used to visualize various aspects of text collections in relation to the set of entities of interest to users. To support faceted filtering of a large number of documents, we show how a multi-dimensional visualization can be employed as an alternative to the traditional linear listing of focus items. In this visualization, visual abstraction based on a combination of a conceptual structure and the structural equivalence of documents can be simultaneously used to deal with a large number of items. Furthermore, the approach also enables visual ordering based on the importance of facet values to support prioritized, cross-facet comparisons of focus items. We also report on an approach to support users' comprehension of the distribution of entities within a document based on the classic TileBars paradigm. Our approach employs a simplified version of a matrix reordering technique, which is based on the barycenter heuristic for bigraph edge crossing minimization, to reorder elements of TileBars-based Entities Distribution Views to tackle the visual complexity problem. The resulting reordered views enable users to quickly and easily identify which entities appear in the beginning, the end, or throughout a document. Lastly, our work is also concerned with visual concordance analysis, which supports users in understanding how terms are used within a document by investigating their usage contexts. To abstract away the textual details and yet retain the core facets of a term's contexts for visualization, we employ a statistical topic modeling method to group together words that are thematically related. These groups are used to visualize the gist of a term's usage contexts in a visualization called Context Stamp. [en_US]
dc.rights: Attribution-NonCommercial-NoDerivs 3.0 Ireland
dc.rights.uri: https://creativecommons.org/licenses/by-nc-nd/3.0/ie/
dc.subject: Human computer interaction [en_US]
dc.subject: Data visualisation [en_US]
dc.title: Visual Exploration of Text Collections [en_US]
dc.type: Thesis [en_US]
dc.contributor.funder: Science Foundation Ireland (SFI) [en_US]
dc.local.note: This dissertation aims at helping users explore a large amount of text documents via intelligent user interfaces. The work reported in this thesis combines text analysis methods with data visualization techniques to let users quickly filter for relevant documents, as well as explore and compare the distribution and usage contexts of terms within documents. [en_US]
dc.local.final: Yes [en_US]
nui.item.downloads: 298


Except where otherwise noted, this item's license is described as Attribution-NonCommercial-NoDerivs 3.0 Ireland