Linked data with access control
The explosion of digital content and the heterogeneity of enterprise content sources have pushed existing data integration solutions to their boundaries. An alternative solution is to use the Resource Description Framework (RDF) together with the existing web infrastructure, commonly known as the Linked Data Web (LDW), as a means to integrate both public and private data. With the advent of SPARQL 1.1, it is possible not only to execute queries over the LDW, but also to use the SPARQL query language to maintain distributed graph data. However, such a decentralised architecture brings with it a number of additional challenges with respect to both data security and integrity. In this thesis, we focus on the problem of access control for the LDW. We are particularly interested in lifting both the data and the access control policies from existing line-of-business applications, and in enforcing and maintaining access control over linked data, irrespective of how it is published. We start by proposing a lifting strategy, which can be used to extract both data and access control policies from relational databases. The access permissions are represented using an extension of the RDF model known as annotated RDF (aRDF), which allows contextual information to be associated with RDF data. By using aRDF domain operations it is possible to combine different annotations for the same triple and to infer new annotations based on RDF Schema (RDFS) inference rules. We demonstrate how the proposed modelling, together with a set of inference rules, can be used to provide support for commonly used access control models, such as Role-Based Access Control, Attribute-Based Access Control and View-Based Access Control. With regard to the enforcement and administration of access control over RDF, we focus specifically on Discretionary Access Control (DAC).
Given that DAC allows users to delegate their permissions to others, it is particularly suitable for managing access control over distributed data. Although a number of authors demonstrate how Semantic Web technology can be used to represent DAC policies, they do not examine DAC from a data model perspective. To fill this gap, we provide a summary of access control requirements for the RDF data model, based on the characteristics that distinguish it from the relational and tree data models. In order to support access control policy specification at the triple, resource, graph and schema level, authorisations are specified using graph patterns. In addition, we demonstrate how our proposed flexible graph-based authorisation framework, which we call GFAF, can be used to cater for the specification, administration and enforcement of DAC policies over linked data. Given that the proposed access control framework enforces access control at the query layer, when access to the requested RDF data is partially restricted, it is necessary to rewrite the query so that it behaves in the same manner as a query executed over a filtered dataset. Therefore, we propose a query rewriting strategy for both the SPARQL 1.1 query and update languages that can be used to partially restrict access to unauthorised data. In addition, we demonstrate how a set of criteria, which were originally used to verify relational access control policies, can be adapted to ensure the correctness of access control over RDF via query rewriting.
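The requirement that a rewritten query behave like the original query executed over a filtered dataset can be illustrated with a minimal sketch. This is not GFAF or the thesis's SPARQL rewriting algorithm: it assumes an in-memory list of triples, authorisations expressed as single triple patterns with `None` as a wildcard, and hypothetical helper names (`matches`, `filter_dataset`, `query`, `rewritten_query`) and example data introduced purely for illustration.

```python
from typing import List, Optional, Tuple

# Assumption: triples are plain string tuples; None in a pattern is a wildcard.
Triple = Tuple[str, str, str]
Pattern = Tuple[Optional[str], Optional[str], Optional[str]]


def matches(pattern: Pattern, triple: Triple) -> bool:
    """True if every non-wildcard position of the pattern equals the triple's term."""
    return all(p is None or p == t for p, t in zip(pattern, triple))


def filter_dataset(triples: List[Triple], grants: List[Pattern]) -> List[Triple]:
    """Keep only the triples covered by at least one granted pattern."""
    return [t for t in triples if any(matches(g, t) for g in grants)]


def query(triples: List[Triple], pattern: Pattern) -> List[Triple]:
    """Evaluate a single triple pattern with no access control."""
    return [t for t in triples if matches(pattern, t)]


def rewritten_query(triples: List[Triple], pattern: Pattern,
                    grants: List[Pattern]) -> List[Triple]:
    """Evaluate the pattern over the full data, adding the authorisation check
    to the match condition instead of filtering the dataset first."""
    return [t for t in triples
            if matches(pattern, t) and any(matches(g, t) for g in grants)]


# Hypothetical example data: the user may read names but not salaries.
DATA: List[Triple] = [
    ("ex:alice", "ex:salary", "50000"),
    ("ex:alice", "ex:name", "Alice"),
    ("ex:bob", "ex:name", "Bob"),
]
GRANTS: List[Pattern] = [(None, "ex:name", None)]

everything: Pattern = (None, None, None)
result = rewritten_query(DATA, everything, GRANTS)
```

In this toy setting, correctness amounts to checking that `result` equals `query(filter_dataset(DATA, GRANTS), everything)`, so the salary triple never appears in the answer. A real SPARQL 1.1 rewriting would have to preserve the same property for full graph patterns and for update operations.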