Indiana University Bloomington

School of Informatics and Computing


Computer Science Program








Technical Report TR677:
Building a Concept Hierarchy Using Frequent Tag Sequences

Jon Klinginsmith (IUB), Malika Mahoui (IUPUI), Josette Jones (IUPUI), Melanie Wu (IUB)
Unknown Date, 7
[This paper has been submitted to the CIKM 2009 conference. We will not hear on acceptance until late July.]
Abstract:
Web sites that allow collaborative tagging of resources have become commonplace. As part of the second generation of applications available on the Web, these sites provide a tremendous amount of user-generated taxonomic information. However, information seekers are hindered by the lack of organization within these tags. To address this issue, several methods have been proposed for creating an organizational structure from the tags. Despite their benefits, the current methods do not directly represent an organization of concepts, because a concept is often composed of more than one tag. In this paper, we propose a new approach to generating a concept hierarchy from user-generated tags. Exploiting the fact that users often express a concept over a set of sequential tags, we proceed in two steps. We first discover concepts by finding tag sequences with sufficient support. Using these concepts, we then calculate conditional probabilities to discover the hierarchical relationships among them. The key benefit of the hierarchy produced through our approach is that it is topic-based, whereas existing related work produces only hierarchies of individual tags. Our findings are illustrated on a domain-specific dataset of tags supplied by a popular collaborative tagging Web site.
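
The two steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the exact definitions, so the sketch assumes that a tag sequence is a contiguous run of tags within one post, that support is the number of posts containing the run, and that a concept A is placed above a concept B when P(A | B) meets a threshold and A occurs in more posts than B. The function names, the `min_support`, `max_len`, and `threshold` parameters, and their default values are all hypothetical.

```python
from collections import Counter
from itertools import permutations


def frequent_sequences(posts, min_support=2, max_len=2):
    """Step 1 (assumed form): keep contiguous tag runs found in >= min_support posts."""
    support = Counter()
    for tags in posts:
        seen = set()
        for n in range(1, max_len + 1):
            for i in range(len(tags) - n + 1):
                seen.add(tuple(tags[i:i + n]))
        support.update(seen)  # each post counts at most once per sequence
    return {seq: count for seq, count in support.items() if count >= min_support}


def contains(tags, seq):
    """True if seq appears as a contiguous run inside the tag list."""
    n = len(seq)
    return any(tuple(tags[i:i + n]) == seq for i in range(len(tags) - n + 1))


def hierarchy(posts, concepts, threshold=0.8):
    """Step 2 (assumed rule): edge (parent, child) when P(parent | child) >= threshold
    and the parent occurs in strictly more posts than the child."""
    occurs = {
        c: {idx for idx, tags in enumerate(posts) if contains(tags, c)}
        for c in concepts
    }
    edges = []
    for parent, child in permutations(concepts, 2):
        both = len(occurs[parent] & occurs[child])
        p_parent_given_child = both / len(occurs[child])
        if p_parent_given_child >= threshold and len(occurs[parent]) > len(occurs[child]):
            edges.append((parent, child))
    return sorted(edges)


if __name__ == "__main__":
    # Made-up toy posts, each an ordered list of tags from one user.
    posts = [
        ["health", "breast", "cancer"],
        ["health", "breast", "cancer", "treatment"],
        ["health", "diabetes"],
        ["health", "diabetes", "diet"],
    ]
    concepts = frequent_sequences(posts)
    for parent, child in hierarchy(posts, concepts):
        print(" ".join(parent), "->", " ".join(child))
```

Under these assumptions a multi-tag concept such as `("breast", "cancer")` is treated as a single node of the hierarchy rather than as two unrelated tags, which is the distinction the abstract draws against tag-only hierarchies.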

Available as:
  • PDF (187 KBytes)
