Indiana University Bloomington

School of Informatics and Computing

Technical Report TR708:
Advanced Keyword Search on Graph

Yifan Pan; Yuqing Wu
(Aug 2013), 9
[A shorter version of this report will be published in CIKM 2013]
Keyword search, due to its simplicity in expressing search demands, has been the major means of search queries for Internet search engines. More recently, the notion has been explored in structured and semi-structured data, where new search semantics were defined, and evaluation algorithms proposed. What is yet to be explored thoroughly in keyword search on those types of graph data is how optional and negative keywords can be expressed, what the results should be and how such search queries can be evaluated efficiently. In this paper, we propose and formally define a new type of keyword search query, ROU-query, which takes as input keywords in three categories: required, optional and unwanted, and returns as output sets of data entries (nodes in the data graph) whose neighborhood satisfy the keyword requirements. We define multiple semantics, including maximal coverage and minimal footprint, to ensure the meaningfulness of the results. We propose a new data structure, query induced partite graph (QuIP), that can capture the constraints related to neighborhood size and unwanted keywords, and propose a family of algorithms that take advantage of the information in QuIP for efficient evaluation of ROU-queries. We conducted extensive experimental evaluations to analyze the performance of our proposed algorithms and found that the size of QuIP is generally very small compared to the data graph, and our algorithms are able to effectively prune non-promising partial results and generate results for ROU-queries efficiently.

4. Please see attachment.

Available as: