Mining cardinalities from knowledge bases
Date
2017-08-01
Author
Muñoz, Emir
Nickles, Matthias
Recommended Citation
Muñoz, Emir, & Nickles, Matthias. (2017). Mining Cardinalities from Knowledge Bases. In Djamal Benslimane, Ernesto Damiani, William I. Grosky, Abdelkader Hameurlain, Amit Sheth & Roland R. Wagner (Eds.), Database and Expert Systems Applications: 28th International Conference, DEXA 2017, Lyon, France, August 28-31, 2017, Proceedings, Part I (pp. 447-462). Cham: Springer International Publishing.
Abstract
Cardinality is an important structural aspect of data that has not received enough attention in the context of RDF knowledge bases (KBs). Information about cardinalities can be useful for data users and knowledge engineers when writing queries and when reusing or engineering KBs. Such cardinalities can be declared using OWL and RDF constraint languages as constraints on the usage of properties over instance data. However, their declaration is optional, and consistency with the instance data is not ensured. In this paper, we address the problem of mining cardinality bounds for properties to discover structural characteristics of KBs, and we use these bounds to assess completeness. Because KBs are incomplete and error-prone, we apply statistical methods for filtering property usage and for finding accurate and robust patterns. Accuracy of the cardinality patterns is ensured by properly handling equality axioms (owl:sameAs), and robustness by filtering outliers. We report an implementation of our algorithm with two variants, using SPARQL 1.1 and Apache Spark, and their evaluation on real-world and synthetic data.
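To make the idea in the abstract concrete, the following is a minimal sketch in plain Python. It is not the authors' SPARQL 1.1 or Apache Spark implementation. It assumes triples are given as (subject, predicate, object) tuples of strings, merges entities linked by owl:sameAs with a union-find structure, counts distinct values per subject for one property, and then drops outlying counts before reading off lower and upper bounds. The outlier rule used here (Tukey's interquartile-range fences with k = 1.5) is an assumption chosen for illustration, since the abstract does not name the statistical filter. The function names and the sample data are hypothetical.

```python
import statistics
from collections import defaultdict

SAME_AS = "owl:sameAs"


def canonicalize(triples):
    """Build a union-find over owl:sameAs links and return its find function."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for s, p, o in triples:
        if p == SAME_AS:
            rs, ro = find(s), find(o)
            if rs != ro:
                # Use the lexicographically smaller name as representative.
                parent[max(rs, ro)] = min(rs, ro)
    return find


def cardinality_counts(triples, prop):
    """Count distinct values of `prop` per subject after merging sameAs aliases."""
    find = canonicalize(triples)
    values = defaultdict(set)
    for s, p, o in triples:
        if p == prop:
            values[find(s)].add(find(o))
    return {s: len(v) for s, v in values.items()}


def robust_bounds(counts, k=1.5):
    """Return (min, max) of the counts after removing IQR outliers (assumed rule)."""
    data = sorted(counts)
    if not data:
        return None
    if len(data) < 2:
        return data[0], data[0]
    q1, _, q3 = statistics.quantiles(data, n=4, method="inclusive")
    lo, hi = q1 - k * (q3 - q1), q3 + k * (q3 - q1)
    kept = [c for c in data if lo <= c <= hi] or data
    return min(kept), max(kept)


# Hypothetical sample data: p1b is an alias of p1 and repeats one of its authors.
triples = [
    ("p1", "author", "alice"),
    ("p1", "author", "bob"),
    ("p1b", SAME_AS, "p1"),
    ("p1b", "author", "alice"),
    ("p2", "author", "carol"),
    ("p3", "author", "dave"),
    ("p3", "author", "erin"),
]

counts = cardinality_counts(triples, "author")
bounds = robust_bounds(counts.values())  # (1, 2) for the sample data
```

Without the sameAs merge, p1b would be counted as a separate subject with a single author, and p1's usage would be split across two identifiers. Note also that this sketch only sees subjects that use the property at least once, so a lower bound of zero can never be observed; handling that would require knowing which subjects belong to the class in question.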