Evaluation of Technology Term Recognition with Random Indexing
Date
2014
Author
QasemiZadeh, Behrang
Handschuh, Siegfried
Recommended Citation
Behrang QasemiZadeh and Siegfried Handschuh (2014). Evaluation of Technology Term Recognition with Random Indexing. Proceedings of the Ninth International Conference on Language Resources and Evaluation.
Abstract
In this paper, we propose a method that combines the principles of
automatic term recognition and the distributional hypothesis to identify
technology terms from a corpus of scientific publications. We employ
the random indexing technique to model the words surrounding a term,
which we call the context window, in a vector space of reduced
dimension. The constructed vector space and a set of reference vectors,
which represent manually annotated technology terms, are used for term
classification in a k-nearest-neighbour voting scheme. In this
paper, we examine a number of parameters that influence the obtained
results. First, we inspect several context configurations, i.e. the
effect of the context window size, the direction in which co-occurrence
counts are collected, and information about the order of words within
the context windows. Second, in the k-nearest-neighbour voting scheme,
we study the role that neighbourhood size selection plays, i.e. the
value of k. The obtained results are similar to those reported for word
space models. The performed experiments suggest that the best-performing
contexts are small (i.e. not wider than 3 words), extend in both
directions and encode word order information. Moreover, the experiments
suggest that the obtained results are, to a great extent, independent of
the value of k.
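
The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The dimensionality, the number of non-zero entries, the use of a rotation (np.roll) to encode word order, the cosine similarity measure, and the labelled reference set are all assumptions chosen for illustration; the function names are hypothetical.

```python
import zlib
import numpy as np

DIM = 512      # reduced dimensionality (assumed value)
NONZERO = 8    # number of +1/-1 entries per index vector (assumed value)

def index_vector(word, dim=DIM, nonzero=NONZERO):
    """Sparse ternary random index vector, deterministic for a given word."""
    rng = np.random.default_rng(zlib.crc32(word.encode("utf-8")))
    vec = np.zeros(dim)
    positions = rng.choice(dim, size=nonzero, replace=False)
    vec[positions] = rng.choice([-1.0, 1.0], size=nonzero)
    return vec

def context_vector(tokens, start, end, window=3, ordered=True):
    """Sum the index vectors of words up to `window` positions on both
    sides of the candidate term tokens[start:end]."""
    vec = np.zeros(DIM)
    for offset in range(1, window + 1):
        left, right = start - offset, end - 1 + offset
        if left >= 0:
            iv = index_vector(tokens[left])
            # Rotating by the relative position encodes word order (assumption).
            vec += np.roll(iv, -offset) if ordered else iv
        if right < len(tokens):
            iv = index_vector(tokens[right])
            vec += np.roll(iv, offset) if ordered else iv
    return vec

def cosine(a, b):
    """Cosine similarity, defined as 0.0 when either vector is all zeros."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom else 0.0

def knn_vote(candidate, references, k=3):
    """references: list of (vector, is_technology_term) pairs.
    Returns True if most of the k nearest references are positive."""
    ranked = sorted(references, key=lambda r: cosine(candidate, r[0]), reverse=True)
    top = ranked[:k]
    votes = sum(1 for _, label in top if label)
    return votes > len(top) / 2
```

Setting window to a value no greater than 3, collecting context on both sides, and keeping ordered=True corresponds to the configuration that the abstract reports as performing best.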
URI
http://www.lrec-conf.org/proceedings/lrec2014/pdf/920_Paper.pdf
http://hdl.handle.net/10379/4381