
dc.contributor.advisor: Nickles, Matthias
dc.contributor.author: Muñoz, Emir
dc.date.accessioned: 2020-07-24T08:45:52Z
dc.date.available: 2020-07-24T08:45:52Z
dc.date.issued: 2020-07-21
dc.identifier.uri: http://hdl.handle.net/10379/16098
dc.description.abstract: Knowledge graphs are graph-structured knowledge bases that have proven to be of great value in many Artificial Intelligence applications in academia and industry alike. They are typically generated automatically from unstructured or semi-structured data sources. The growing popularity of knowledge graphs has been limited by multiple challenges arising from the size and quality of the information they contain. This thesis explores the relationship between the quality of knowledge graphs and the machine learning technologies used to discover and extract knowledge from them. We focus on quality in terms of completeness and consistency. Knowledge graphs provide the flexibility required for representing knowledge at different scales in open environments such as the Web. However, this versatility gives them an ever-changing schema, which makes their content hard to summarise and understand. Moreover, they are typically never complete, even in very specific domains, and their consistency with respect to a given schema or ontology cannot be guaranteed without the corresponding validation. This lack of an accurate schema has proven problematic in use cases where applications need to rely on the data satisfying a set of constraints. The contribution of this thesis is twofold. Firstly, we propose a scalable data-driven method to exhibit the actual (latent) shape of graph data. We introduce an algorithm for mining relation cardinality bounds and building so-called shapes that exhibit important aspects of the structure (or topological information) of entities and relations in a knowledge graph. Latent shapes also allow us to formalise an approximate algorithm for validating the structure of knowledge graphs. Secondly, we exploit the latent shapes of entities and relations to enhance the performance of machine learning models that aim to predict missing links and complete knowledge graphs.
We use local pattern information and graph-based feature models in the Bioinformatics domain to improve the prediction of adverse drug reactions, achieving new state-of-the-art results. Finally, we extend latent feature models by encoding the cardinality of relations as a regularisation term used to learn semantic embeddings that improve the precision of downstream prediction tasks on benchmark datasets. [en_IE]
dc.publisher: NUI Galway
dc.rights: Attribution-NonCommercial-NoDerivs 3.0 Ireland
dc.rights.uri: https://creativecommons.org/licenses/by-nc-nd/3.0/ie/
dc.subject: knowledge graphs [en_IE]
dc.subject: semantic web [en_IE]
dc.subject: constraints [en_IE]
dc.subject: machine learning [en_IE]
dc.subject: representation learning [en_IE]
dc.subject: Engineering and Informatics [en_IE]
dc.title: Knowledge graph mining with latent shape graphs [en_IE]
dc.type: Thesis [en]
dc.contributor.funder: Data Science Institute, NUI Galway [en_IE]
dc.contributor.funder: Fujitsu Laboratories [en_IE]
dc.contributor.funder: Data Science Institute, National University of Ireland Galway [en_IE]
dc.local.note: First, this thesis studies the discovery of cardinality constraints from knowledge graphs. These constraints are then used to train different machine learning models to replace hand-crafted feature engineering in different knowledge discovery tasks. We also look into how to naturally embed cardinality when learning representations of knowledge graphs. [en_IE]
dc.local.final: Yes [en_IE]
nui.item.downloads: 839



Except where otherwise noted, this item's license is described as Attribution-NonCommercial-NoDerivs 3.0 Ireland