Triplifying Wikipedia's tables

Date
2013
Author
Muñoz, Emir
Hogan, Aidan
Mileo, Alessandra
Recommended Citation
MUÑOZ, Emir, HOGAN, Aidan and MILEO, Alessandra, 2013, Triplifying Wikipedia's Tables. In: Proceedings of the Linked Data for Information Extraction Workshop (LD4IE 2013) at ISWC 2013. Sydney, Australia: CEUR-WS.org. 2013. CEUR Workshop Proceedings.
Abstract
We are currently investigating methods to triplify the content of Wikipedia's tables. We propose that existing knowledge bases can be leveraged to semi-automatically extract high-quality facts (in the form of RDF triples) from tables embedded in Wikipedia articles (henceforth called "Wikitables"). We present a survey of Wikitables and their content in a recent dump of Wikipedia. We then discuss ongoing work on using DBpedia to mine novel RDF triples from these tables: we present methods that automatically extract 24.4 million raw triples from the Wikitables at an estimated precision of 52.2%. We believe this precision can be greatly improved through machine learning methods, and we sketch ideas for features that should help classify triples as correct or incorrect.
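
The abstract does not spell out the extraction procedure, so the following Python sketch is only one plausible illustration of how an existing knowledge base might be used to propose new triples from a table. It assumes that table cells are already linked to entities, that the knowledge base is a plain set of (subject, predicate, object) tuples, and that a predicate observed between two columns in several rows may hold in the remaining rows too. The names `KB`, `TABLE`, `candidate_triples`, and `min_support`, along with all `ex:` identifiers, are hypothetical and are not taken from the paper or from DBpedia.

```python
from collections import Counter
from itertools import permutations

# Toy stand-in for a knowledge base such as DBpedia (hypothetical data).
KB = {
    ("ex:PlayerA", "ex:playsFor", "ex:ClubX"),
    ("ex:PlayerB", "ex:playsFor", "ex:ClubX"),
}

# Toy table whose cells are assumed to be already linked to entities.
TABLE = [
    ["ex:PlayerA", "ex:ClubX"],
    ["ex:PlayerB", "ex:ClubX"],
    ["ex:PlayerC", "ex:ClubY"],
]


def candidate_triples(table, kb, min_support=2):
    """Propose triples that are not yet in kb for a table of linked entities.

    For every ordered pair of columns, count the predicates that the
    knowledge base already asserts between the two cells of a row.
    Predicates seen in at least min_support rows are then suggested
    for every row in which the triple is still missing.
    """
    # Index the knowledge base by (subject, object) for quick lookup.
    predicates_by_pair = {}
    for s, p, o in kb:
        predicates_by_pair.setdefault((s, o), set()).add(p)

    candidates = set()
    n_cols = len(table[0])
    for i, j in permutations(range(n_cols), 2):
        # Count how many rows support each predicate for this column pair.
        support = Counter()
        for row in table:
            for p in predicates_by_pair.get((row[i], row[j]), ()):
                support[p] += 1
        # Suggest well-supported predicates for rows that lack them.
        for p, count in support.items():
            if count < min_support:
                continue
            for row in table:
                triple = (row[i], p, row[j])
                if triple not in kb:
                    candidates.add(triple)
    return candidates


if __name__ == "__main__":
    for triple in sorted(candidate_triples(TABLE, KB)):
        print(triple)
```

Candidates produced this way are raw suggestions that can be wrong, which is consistent with the abstract's point that a later classification step is needed to separate correct triples from incorrect ones.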