Identifying, annotating, and filtering arguments and opinions in open collaboration systems
The World Wide Web enables large-scale collaboration, even between groups of individuals previously unknown to one another. These collaborations produce tangible outputs, such as encyclopedias (Wikipedia), electronic books (Distributed Proofreaders), maps (OpenStreetMap), and open source software packages (Firefox). In such open collaboration systems, decisions are made through open online discussions, based on the written arguments and opinions that individuals contribute, sometimes in large volumes. Sense-making and coordination are important components of collaboration, but they are particularly challenging when individuals disagree. When large volumes of opinions and arguments are expressed, coarse approaches such as sampling, sentiment analysis, or voting can help reveal the most popular or emotive choices. But these approaches do not identify the reasons for disagreement, which may be needed in order to reach decisions. For example, about 500 discussions each week in Wikipedia concern whether a particular topic should be covered in the encyclopedia. These discussions may involve comments from 2-200 people, and some topics are contentious.

This thesis addresses the problem of analyzing, integrating, and reconciling arguments and opinions in goal-oriented online discussions. The thesis addresses the following three research questions:

1. What are the opportunities and requirements for providing argumentation support?
2. Which arguments are used in open collaboration systems?
3. How can we structure and display opinions and arguments to support filtering?

In the thesis, we provide a novel procedure for supporting human reasoning over argumentative discussions. Our procedure has four phases: Selection & Requirements Analysis, Categorization, Structuring & Prototyping, and Evaluation.

1. Selection & Requirements Analysis consists of selecting a community of interest, characterizing its argumentation support needs, and choosing a sample corpus.
2. Categorization means iteratively categorizing the sample based on argumentation theories, validating the coding, and choosing the categorization scheme that best matches the requirements.
3. Structuring & Prototyping consists of devising an ontology (based on the requirements and categories from the previous two phases), structuring the data according to the ontology, and then deploying a new ontology-based interface for task-based support of human reasoning.
4. Evaluation demonstrates the utility of the prototype and generates ideas for improving it.

Our procedure combines ethnography, iterative annotation, ontology development, and user-based evaluation to develop and test a task-based argumentation support system. The novelty of our procedure is its combination of Semantic Web application development with human-centered interaction design methodologies.

We apply our procedure to information quality assurance discussions on Wikipedia, the world's sixth most popular website. Information quality assurance in Wikipedia is collective crowd work, undertaken by groups of self-nominated individuals: anyone can contribute arguments to the ongoing discussions that determine which content is deemed inappropriate and deleted from the collaboratively written encyclopedia. We show how generic features of open collaboration systems (e.g. policies and frequent newcomers) impact the content deletion process; thus our work has implications for understanding content management procedures and collective discussions in other open collaboration systems. We develop a community-validated description of the workflow of Wikipedia's content deletion discussions, which helps us characterize the argumentation support needs of our case study. By reading and interpreting documents and discussions, contributing to discussions as a participant-observer, and interviewing participants, we identify the three key argumentative tasks to be supported.
These argumentative tasks for making collective decisions--determining one's own opinion, commenting according to community standards, and finding the consensus of a discussion--are applicable in any open collaboration system.

For structuring arguments from the Social Web, we contribute a concrete use case of argumentation. We determine the most common arguments given in Wikipedia's information quality assurance discussions. Existing generic patterns used for the emerging World Wide Argument Web have shortcomings for task-based argumentation work. Consequently, we develop community-specific decision factors using grounded theory. The arguments are well represented by just four decision factors: Notability, Sources, Maintenance, and Bias. Together, these four factors completely describe 70% of discussions and over 90% of comments. We find that these decision factors are appropriate for two of our three argumentative tasks: determining one's own opinion and finding the consensus of a discussion. Our work contributes a novel corpus structured with Walton's argumentation schemes and may be the first application of the argumentation theory of factors outside the legal domain.

We also develop an ontology for informal argumentation in Wikipedia deletion discussions, and use it to create a task-based interface that supports consensus-finding for deletion discussions in the English-language Wikipedia. In a user-based evaluation, our interface provides statistically significant improvements over the native Wikipedia discussion interface in terms of perceived usefulness, perceived ease of use, and information completeness. In our pilot study, 16 of 19 participants (84%) preferred our argumentation support interface over the native Wikipedia discussion interface.