Investigating Context Parameters in Technology Term Recognition
QasemiZadeh, Behrang; Handschuh, Siegfried (2014) Investigating Context Parameters in Technology Term Recognition. In: Adam Meyers, Yifan He and Ralph Grishman, eds. COLING Workshop on Synchronic and Diachronic Approaches to Analyzing Technical Language, Dublin, Ireland, 2014-08-24.
We propose and evaluate the task of technology term recognition: a method to extract technology terms at a synchronic level from a corpus of scientific publications. The proposed method is built on the principles of terminology extraction and distributional semantics. It is realized as a regression task in a vector space model. In this method, candidate terms are first extracted from text. Subsequently, using the random indexing technique, the extracted candidate terms are represented as vectors in a Euclidean vector space of reduced dimensionality. These vectors are derived from the frequency of co-occurrences of candidate terms and words in windows of text surrounding candidate terms in the input corpus (context windows). The constructed vector space and a set of manually tagged technology terms (reference vectors) are then used in a k-nearest neighbours regression framework to identify terms that signify technology concepts. We examine a number of factors that play a role in the performance of the proposed method, namely the configuration of context windows, the selection of neighbourhood size (k), and the size of the reference vector set.
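
The pipeline described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes single-token candidate terms, ternary random index vectors seeded by the word itself, a symmetric context window, and an unweighted mean over the k nearest reference vectors. The names index_vector, context_vectors and knn_score, and the values of DIM, NONZERO, window and k, are hypothetical choices made for illustration only.

```python
import math
import random
from collections import defaultdict

DIM = 300      # reduced dimensionality (assumed value, not from the paper)
NONZERO = 8    # non-zero entries per index vector (assumed value)


def index_vector(word, dim=DIM, nonzero=NONZERO):
    """Sparse random index vector with a few +1/-1 entries, fixed per word."""
    rng = random.Random(word)  # seeding on the word makes the vector reproducible
    vec = [0.0] * dim
    for pos in rng.sample(range(dim), nonzero):
        vec[pos] = rng.choice((-1.0, 1.0))
    return vec


def context_vectors(sentences, candidates, window=2):
    """For each candidate, sum the index vectors of words inside its context window."""
    vectors = defaultdict(lambda: [0.0] * DIM)
    for tokens in sentences:
        for i, tok in enumerate(tokens):
            if tok not in candidates:
                continue
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j == i:
                    continue  # skip the candidate term itself
                vec = vectors[tok]
                for d, value in enumerate(index_vector(tokens[j])):
                    vec[d] += value
    return dict(vectors)


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_score(vec, references, k=3):
    """k-NN regression: mean label of the k reference vectors closest to vec."""
    nearest = sorted(references, key=lambda ref: euclidean(vec, ref[0]))[:k]
    return sum(label for _, label in nearest) / len(nearest)


if __name__ == "__main__":
    # Toy corpus and labels, invented purely for illustration.
    corpus = [
        "we use statistical parsing for text analysis".split(),
        "we use machine translation for text analysis".split(),
        "the workshop took place in dublin".split(),
    ]
    vecs = context_vectors(corpus, {"parsing", "translation", "workshop"})
    # Reference vectors: manually tagged terms (1.0 = technology, 0.0 = not).
    refs = [(vecs["parsing"], 1.0), (vecs["workshop"], 0.0)]
    print(knn_score(vecs["translation"], refs, k=1))
```

In this sketch the three factors examined in the paper map onto the window argument (context window configuration), the k argument (neighbourhood size) and the length of the references list (reference vector size).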