
dc.contributor.author: QasemiZadeh, Behrang
dc.contributor.author: Handschuh, Siegfried
dc.contributor.editor: Patrick Drouin and Natalia Grabar and Thierry Hamon and Kyo Kageura
dc.date.accessioned: 2014-08-13T09:14:20Z
dc.date.available: 2014-08-13T09:14:20Z
dc.date.issued: 2014
dc.identifier.citation: QasemiZadeh, Behrang; Handschuh, Siegfried (2014) The ACL RD-TEC: A Dataset for Benchmarking Terminology Extraction and Classification in Computational Linguistics. In: Patrick Drouin and Natalia Grabar and Thierry Hamon and Kyo Kageura eds. COLING 2014: 4th International Workshop on Computational Terminology, Dublin, Ireland, 2014-08-23 - 2014-08-23 [en_US]
dc.identifier.uri: http://www.aclweb.org/anthology/W14-4807
dc.identifier.uri: http://hdl.handle.net/10379/4489
dc.description: Conference paper [en_US]
dc.description.abstract: This paper introduces ACL RD-TEC: a dataset for evaluating the extraction and classification of terms from literature in the domain of computational linguistics. The dataset is derived from the Association for Computational Linguistics anthology reference corpus (ACL ARC). In its first release, the ACL RD-TEC consists of automatically segmented, part-of-speech-tagged ACL ARC documents, three lists of candidate terms, and more than 82,000 manually annotated terms. The annotated terms are marked as either valid or invalid, and valid terms are further classified as technology and non-technology terms. Technology terms signify methods, algorithms, and solutions in computational linguistics. The paper describes the dataset and reports the relevant statistics. We hope the step described in this paper encourages a collaborative effort towards building a full-fledged annotated corpus from the computational linguistics literature. [en_US]
dc.format: application/pdf [en_US]
dc.language.iso: en [en_US]
dc.relation.ispartof: COLING 2014: 4th International Workshop on Computational Terminology [en]
dc.subject: terminology evaluation [en_US]
dc.subject: term recognition [en_US]
dc.subject: term classification [en_US]
dc.subject: terminology evaluation dataset [en_US]
dc.subject: automatic term recognition [en_US]
dc.subject: natural language processing [en_US]
dc.subject: computational linguistics [en_US]
dc.title: The ACL RD-TEC: A Dataset for Benchmarking Terminology Extraction and Classification in Computational Linguistics [en_US]
dc.type: Conference Paper [en_US]
dc.date.updated: 2014-08-11T12:38:40Z
dc.local.publishedsource: http://www.aclweb.org/anthology/W14-4807 [en_US]
dc.description.peer-reviewed: peer-reviewed
dc.contributor.funder: SFI
dc.internal.rssid: 6873488
dc.local.contact: Behrang Qasemizadeh, DERI, IDA Business Park, Lower Dangan, NUI Galway. Email: behrang.qasemizadeh@deri.org
dc.local.copyrightchecked: Yes
dc.local.version: PUBLISHED
nui.item.downloads: 537



Attribution-NonCommercial-NoDerivs 3.0 Ireland
This item is available under the Attribution-NonCommercial-NoDerivs 3.0 Ireland license. No item may be reproduced for commercial purposes. Please refer to the publisher's URL where this is made available, or to notes contained in the item itself. Other terms may apply.
