Indiana University Bloomington

School of Informatics and Computing

Technical Report TR683:
XML-Based RDF Data Management for Efficient Query Processing

Mo Zhou, Yuqing Wu
(Apr 2010), 6 Pages
[It is expected to be published in WebDB 2010]
The Semantic Web, which represents a web of knowledge, offers us new opportunities to search for knowledge and information. Harnessing this search power requires robust and scalable data repositories that can store RDF data and support efficient evaluation of SPARQL queries. Most existing RDF storage techniques rely on the relational model and relational database technologies for these tasks. They either keep the RDF data as triples or decompose it into multiple relations. The mismatch between the graph model of RDF data and the rigid two-dimensional tables of the relational model jeopardizes the scalability of such repositories and frequently renders a repository inefficient for some types of data and queries. We propose to decompose an RDF graph into a forest of XML trees, store the trees in an XML repository, and rewrite SPARQL queries into XPath/XQuery queries to be evaluated in that repository. In this paper, we discuss the basic idea of RDF-to-XML decomposition and the criteria for such a decomposition in terms of correctness, redundancy, and query efficiency. Then, based on these criteria, we propose two algorithms for decomposing RDF data into XML trees. Our experimental results illustrate that our approach is capable of improving both storage efficiency and query processing efficiency compared to existing RDF techniques.
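
To make the general idea concrete, the following is a minimal sketch of storing triples as a forest of XML trees and answering a simple graph pattern with path expressions. It is not either of the two decomposition algorithms proposed in the report, which are not described in this abstract. It assumes the simplest possible decomposition (one tree per subject), and the toy triples, the element names `forest`, `resource`, and `edge`, and the helper functions are all hypothetical. It uses the limited XPath subset of Python's standard `xml.etree.ElementTree` rather than a real XML repository.

```python
import xml.etree.ElementTree as ET

# Hypothetical toy data: (subject, predicate, object) triples.
TRIPLES = [
    ("alice", "name", "Alice"),
    ("alice", "knows", "bob"),
    ("bob", "name", "Bob"),
    ("bob", "knows", "carol"),
    ("carol", "name", "Carol"),
]


def decompose_by_subject(triples):
    """Build one XML tree per subject under a common <forest> root.

    Assumed, naive decomposition: every triple is stored exactly once,
    as an <edge> child of the tree rooted at its subject.
    """
    forest = ET.Element("forest")
    trees = {}
    for subj, pred, obj in triples:
        if subj not in trees:
            trees[subj] = ET.SubElement(forest, "resource", {"id": subj})
        edge = ET.SubElement(trees[subj], "edge", {"pred": pred})
        edge.text = obj
    return forest


def names_of_known(forest, subject):
    """Path-based analogue of the SPARQL pattern
    { <subject> knows ?x . ?x name ?n }, returning the bindings of ?n.
    """
    known_path = f"./resource[@id='{subject}']/edge[@pred='knows']"
    known = [e.text for e in forest.findall(known_path)]
    names = []
    for x in known:
        # The join on ?x crosses tree boundaries, so it needs a second lookup.
        name_path = f"./resource[@id='{x}']/edge[@pred='name']"
        names.extend(e.text for e in forest.findall(name_path))
    return names


forest = decompose_by_subject(TRIPLES)
print(names_of_known(forest, "alice"))
```

In this sketch, a pattern that stays within one subject's tree is a single path expression, while a pattern that follows an edge to another resource needs a lookup per tree. How triples are grouped into trees therefore affects both redundancy and query cost, which is the kind of trade-off the decomposition criteria above are meant to capture.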

Available as: