Evaluation of Technology Term Recognition with Random Indexing

QasemiZadeh, Behrang; Handschuh, Siegfried

Date

2014

Recommended Citation

Behrang QasemiZadeh and Siegfried Handschuh (2014). Evaluation of Technology Term Recognition with Random Indexing. In Proceedings of the Ninth International Conference on Language Resources and Evaluation.

Abstract

In this paper, we propose a method that combines the principles of automatic term recognition and the distributional hypothesis to identify technology terms in a corpus of scientific publications. We employ the random indexing technique to model the words surrounding each term, which we call the context window, in a vector space of reduced dimension. The constructed vector space is then used for term classification, together with a set of reference vectors that represent manually annotated technology terms, in a k-nearest-neighbour voting scheme. We examine a number of parameters that influence the obtained results. First, we inspect several context configurations, i.e. the size of the context window, the direction in which co-occurrence counts are collected, and whether information about the order of words within the context window is encoded. Second, we study the role that neighbourhood size selection, i.e. the value of k, plays in the k-nearest-neighbour voting scheme. The obtained results are similar to those reported for word space models. The experiments suggest that the best-performing contexts are small (i.e. not wider than 3 words), extend in both directions, and encode word order information. Moreover, the experiments suggest that the obtained results are, to a great extent, independent of the value of k.
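
The pipeline described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the dimension of 300, the 8 non-zero elements per index vector, the rotation used to encode word order, the restriction to single-token terms, and the "technology"/"other" labels are all assumptions made for illustration. It builds context vectors by summing sparse random index vectors of the words around each term occurrence, then labels a candidate term by a vote among its k most similar reference vectors.

```python
import math
import random
from collections import Counter

def index_vector(word, dim=300, nonzero=8):
    """Sparse ternary index vector; seeding by the word makes it reproducible."""
    rng = random.Random(word)
    vec = [0.0] * dim
    for pos in rng.sample(range(dim), nonzero):
        vec[pos] = rng.choice((-1.0, 1.0))
    return vec

def rotate(vec, shift):
    """Cyclically shift a vector; used here (as an assumption) to encode word position."""
    shift %= len(vec)
    return vec[-shift:] + vec[:-shift]

def context_vector(sentences, term, window=2, dim=300, ordered=True):
    """Sum the index vectors of words within `window` tokens on both sides of `term`."""
    total = [0.0] * dim
    for tokens in sentences:
        for i, tok in enumerate(tokens):
            if tok != term:
                continue
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j == i:
                    continue
                iv = index_vector(tokens[j], dim)
                if ordered:
                    # A different rotation per relative offset keeps word order distinguishable.
                    iv = rotate(iv, j - i)
                total = [a + b for a, b in zip(total, iv)]
    return total

def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def knn_vote(candidate, references, k=3):
    """Return the majority label among the k reference vectors most similar to `candidate`.

    `references` is a list of (vector, label) pairs.
    """
    ranked = sorted(references, key=lambda r: cosine(candidate, r[0]), reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]
```

In use, one would compute `context_vector` for each annotated reference term and each candidate term over the same tokenised corpus, then call `knn_vote(candidate_vector, references, k)`. Setting `ordered=False` drops the word order information, and changing `window` or restricting the range to one side of the term corresponds to the other context configurations the abstract mentions.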

URI

http://www.lrec-conf.org/proceedings/lrec2014/pdf/920_Paper.pdf
http://hdl.handle.net/10379/4381

Collections

Data Science Institute (Scholarly Articles)

Except where otherwise noted, this item's license is described as Attribution-NonCommercial-NoDerivs 3.0 Ireland