dc.contributor.author: Muñoz, Emir
dc.contributor.author: Hogan, Aidan
dc.contributor.author: Mileo, Alessandra
dc.date.accessioned: 2016-09-14T13:12:03Z
dc.date.available: 2016-09-14T13:12:03Z
dc.date.issued: 2014
dc.identifier.citation: Emir Muñoz, Aidan Hogan, and Alessandra Mileo. 2014. Using linked data to mine RDF from wikipedia's tables. In Proceedings of the 7th ACM international conference on Web search and data mining (WSDM '14). ACM, New York, NY, USA, 533-542. DOI=http://dx.doi.org/10.1145/2556195.2556266 [en_IE]
dc.identifier.uri: http://hdl.handle.net/10379/6016
dc.description.abstract: The tables embedded in Wikipedia articles contain rich, semi-structured encyclopaedic content. However, the cumulative content of these tables cannot be queried against. We thus propose methods to recover the semantics of Wikipedia tables and, in particular, to extract facts from them in the form of RDF triples. Our core method uses an existing Linked Data knowledge-base to find pre-existing relations between entities in Wikipedia tables, suggesting the same relations as holding for other entities in analogous columns on different rows. We find that such an approach extracts RDF triples from Wikipedia's tables at a raw precision of 40%. To improve the raw precision, we define a set of features for extracted triples that are tracked during the extraction phase. Using a manually labelled gold standard, we then test a variety of machine learning methods for classifying correct/incorrect triples. One such method extracts 7.9 million unique and novel RDF triples from over one million Wikipedia tables at an estimated precision of 81.5%. [en_IE]
dc.description.sponsorship: This work was supported in part by Fujitsu (Ireland) Ltd., by the Millennium Nucleus Center for Semantic Web Research under Grant NC120004, and by Science Foundation Ireland (SFI) under Grant Number SFI/12/RC/2289 [en_IE]
dc.format: application/pdf [en_IE]
dc.language.iso: en [en_IE]
dc.publisher: ACM [en_IE]
dc.relation.ispartof: WSDM [en]
dc.rights: Attribution-NonCommercial-NoDerivs 3.0 Ireland
dc.rights.uri: https://creativecommons.org/licenses/by-nc-nd/3.0/ie/
dc.subject: Linked data [en_IE]
dc.subject: Web tables [en_IE]
dc.subject: Wikipedia [en_IE]
dc.subject: Data mining [en_IE]
dc.subject: Data analytics
dc.title: Using linked data to mine RDF from wikipedia's tables [en_IE]
dc.date.updated: 2016-09-13T13:13:50Z
dc.identifier.doi: 10.1145/2556195.2556266
dc.local.publishedsource: http://dx.doi.org/10.1145/2556195.2556266 [en_IE]
dc.description.peer-reviewed: peer-reviewed
dc.contributor.funder: |~|1267880|~|1267883|~|
dc.internal.rssid: 11398938
dc.local.contact: Emir Munoz, Deri, Ida Business Park, Lower Dangan, Nui Galway. - Email: e.munoz1@nuigalway.ie
dc.local.copyrightchecked: Yes
dc.local.version: ACCEPTED
nui.item.downloads: 521
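
The core idea summarised in the abstract (relations already known for some rows are suggested for the same column pair on other rows) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function name `suggest_triples`, the `ex:` identifiers, and the simple row-fraction `support` score are all assumptions made for illustration, and the feature tracking and machine-learning classification mentioned in the abstract are not shown.

```python
from collections import Counter
from itertools import permutations


def suggest_triples(rows, kb):
    """Suggest candidate (subject, predicate, object) triples for a table.

    rows: list of tuples of entity identifiers, one identifier per cell.
    kb:   iterable of known (subject, predicate, object) triples.
    Both structures are hypothetical simplifications for this sketch.
    """
    kb = set(kb)
    # Index the knowledge base by (subject, object) for fast lookup.
    predicates_by_pair = {}
    for s, p, o in kb:
        predicates_by_pair.setdefault((s, o), set()).add(p)

    candidates = []
    if not rows:
        return candidates

    n_cols = len(rows[0])
    # Consider every ordered pair of columns as (subject, object).
    for i, j in permutations(range(n_cols), 2):
        # Count how many rows already have each predicate in the KB.
        seen = Counter()
        for row in rows:
            for p in predicates_by_pair.get((row[i], row[j]), ()):
                seen[p] += 1
        # Suggest the same predicate for rows where the KB lacks it.
        for p, count in seen.items():
            support = count / len(rows)  # assumed score: fraction of rows
            for row in rows:
                triple = (row[i], p, row[j])
                if triple not in kb:
                    candidates.append((triple, support))
    return candidates


# Hypothetical table: column 0 holds players, column 1 holds clubs.
rows = [
    ("ex:PlayerA", "ex:Club1"),
    ("ex:PlayerB", "ex:Club1"),
    ("ex:PlayerC", "ex:Club2"),
]
kb = {
    ("ex:PlayerA", "ex:team", "ex:Club1"),
    ("ex:PlayerB", "ex:team", "ex:Club1"),
}
candidates = suggest_triples(rows, kb)
```

Here the only candidate is the triple linking `ex:PlayerC` to `ex:Club2` via `ex:team`, because that relation already holds for two of the three rows. Unfiltered candidates of this kind correspond to the raw extraction stage described in the abstract, before any classification step.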


Except where otherwise noted, this item's license is described as Attribution-NonCommercial-NoDerivs 3.0 Ireland