Developing a Dataset for Technology Structure Mining
Date
2010
Author
QasemiZadeh, Behrang
Buitelaar, Paul
Monaghan, Fergal
Recommended Citation
Qasemizadeh, Behrang; Buitelaar, Paul; Monaghan, Fergal (2010) Developing a Dataset for Technology Structure Mining. Conference Paper
Abstract
This paper describes steps that have been taken to construct a
development dataset for the task of Technology Structure Mining. We have
defined the proposed task as the process of mapping a scientific corpus
into a labeled digraph named a Technology Structure Graph as described
in the paper. The generated graph expresses the domain semantics in
terms of interdependencies between pairs of technologies that are named
(introduced) in the target scientific corpus. The dataset comprises a
set of sentences extracted from the ACL Anthology Corpus. Each sentence
is annotated with at least two technologies in the domain of Human
Language Technology and the interdependence between them. The
annotations - technology mark-up and their interdependencies - are
expressed at two layers: lexical and termino-conceptual. Lexical
representation of technologies comprises varying lexicalizations of a
technology. However, at the termino-conceptual layer all these lexical
variations refer to the same concept. We have adopted the same approach
for representing semantic relations: at the lexical layer a semantic
relation is a predicate, i.e. it is defined based on the sentence surface
structure, whereas at the termino-conceptual layer semantic relations
are classified into conceptual relations, either taxonomic or
non-taxonomic. Moreover, the contexts from which interdependencies are
extracted are classified into five groups based on the linguistic
criteria and syntactic structure identified by the human
annotators. The dataset initially comprises 482 sentences. We hope
this effort results in a benchmark that can be used for the technology
structure mining task as defined in the paper.
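
The two annotation layers and the resulting labeled digraph can be illustrated with a minimal Python sketch. It is not the authors' implementation or data format: the class `Annotation`, the lookup tables `CONCEPT_OF` and `RELATION_OF`, the function `build_graph`, and the three example annotations are all hypothetical and are not taken from the dataset. The sketch only shows how varying lexicalizations and predicates might collapse into concepts and conceptual relation types when the graph is built.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    """One annotated sentence at the lexical layer (hypothetical structure)."""
    source_mention: str  # surface form of the first technology
    predicate: str       # relation as worded in the sentence
    target_mention: str  # surface form of the second technology


# Assumed lookup: lexical variants -> termino-conceptual identifier.
CONCEPT_OF = {
    "statistical machine translation": "StatisticalMachineTranslation",
    "SMT": "StatisticalMachineTranslation",
    "machine translation": "MachineTranslation",
    "MT": "MachineTranslation",
    "word alignment": "WordAlignment",
}

# Assumed lookup: lexical predicate -> conceptual relation class.
RELATION_OF = {
    "is a kind of": "taxonomic",
    "is a type of": "taxonomic",
    "relies on": "non-taxonomic",
}


def build_graph(annotations):
    """Map lexical annotations to a labeled digraph.

    Keys are directed (source concept, target concept) edges;
    values are the sets of conceptual relation labels on that edge.
    """
    graph = defaultdict(set)
    for a in annotations:
        src = CONCEPT_OF[a.source_mention]
        tgt = CONCEPT_OF[a.target_mention]
        graph[(src, tgt)].add(RELATION_OF[a.predicate])
    return dict(graph)


# Invented examples, for illustration only.
annotations = [
    Annotation("SMT", "is a kind of", "machine translation"),
    Annotation("statistical machine translation", "is a type of", "MT"),
    Annotation("SMT", "relies on", "word alignment"),
]
graph = build_graph(annotations)
```

In this sketch the first two annotations differ at the lexical layer but produce a single taxonomic edge at the termino-conceptual layer, which is the kind of normalization the abstract describes.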
Description
Conference paper
URI
http://www.deri.ie/sites/default/files/publications/bq_ieee_icsc_bare_conf.pdf
http://hdl.handle.net/10379/4514