Indiana University Bloomington

School of Informatics and Computing

Technical Report TR673:
Workload-aware Trie Indexes for XML

Sofia Brenes, Yuqing Melanie Wu, Hyungdae Yi
(Jan 2009), 13 pages
[Submitted to conference for review]
Abstract:
Well-designed indexes can dramatically improve query performance. In the context of XML, structural indexes have proven to be particularly effective in supporting efficient XPath queries - the core of all XML queries, by capturing the structural correlation between data components in an XML document. The duality of space and performance is an inevitable trade-off at the core of index design. It has been established that query workload can be leveraged to balance this trade-off and maximize the throughput of a group of queries. In this paper, we propose a family of novel workload-aware indexes by taking advantage of the recently proposed Trie indexes for XML. In particular, we propose the WP[k]-Trie, the AWP[k]-Trie, and the W[k]-Trie indexes, which use the P[k]-Trie framework to index frequent label-paths and a carefully selected complimentary set of label-paths. When a WP[k]-Trie index is available, all frequent path queries are guaranteed to be evaluated in one index lookup, and all core XPath queries are guaranteed to be evaluated with index-only plans. With further consideration of the representativeness of label-paths in the index and proper annotations, the AWP[k] and W[k]-Trie indexes are able to improve query evaluation performance by efficiently singling out queries with empty results and enabling more efficient query decompositions, with the W[k]-Trie minimizing the space requirements of the index.

Available as: