The Paper :

Interactive Thesaurus Assessment for Automatic Document Annotation

  Authors : Kai Eckert, Heiner Stuckenschmidt, and Magnus Pfeffer
  Presenters : Jaliya Ekanayake and Sribabu Doddapaneni
  Discussants : Brent Castle and Xin Wei
 

<Slides>

 
 
  Abstract : In this paper, Eckert et al. present their research on an interactive thesaurus assessment framework. They argue that semantic annotation of documents with keywords from controlled vocabularies and thesauri is key to successful search, because it solves the problem of different terms being used for the same topic. The quality of the annotations depends significantly on the quality of the thesaurus, so evaluating and improving the thesaurus is an important consideration. Their framework applies statistics, mostly related to the information content of terms in the thesaurus, and provides a visualization of the results of the statistical analysis. The visualization helps the user identify and further investigate potential problems in a thesaurus.


Eckert et al. use their framework to annotate two document collections from dissimilar fields: first, 800 abstracts of medical publications, and second, 1000 abstracts from a document collection related to economics. They use two thesauri, namely MeSH and STW, to annotate these abstracts. The evaluation of the thesauri is based on a statistical measure of the distance in information content, which they derive from two commonly applied concepts, viz. Information Content (IC) and Intrinsic Information Content (IIC). After fine-tuning the thesauri, they measured precision by comparing the automatic annotations with the manual annotations.
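To make the two measures concrete, here is a minimal sketch of how IC, IIC, and their difference might be computed. It assumes the commonly used definitions (IC as the negative log of a concept's usage probability, IIC derived from the number of narrower concepts in the hierarchy); the paper's exact formulation and normalisation may differ. The toy thesaurus, the function names, and the normalisation of IC to the range [0, 1] are all illustrative assumptions, not taken from the paper.

```python
import math
from collections import Counter

# Hypothetical toy thesaurus (not from the paper): concept -> broader concept.
BROADER = {
    "disease": None,  # root
    "infection": "disease",
    "viral infection": "infection",
    "bacterial infection": "infection",
    "cancer": "disease",
}


def broader_chain(concept):
    """Return the concept followed by all of its broader concepts up to the root."""
    chain = []
    while concept is not None:
        chain.append(concept)
        concept = BROADER[concept]
    return chain


def information_content(annotations):
    """Normalised IC: -log p(c) / log(total).

    p(c) counts annotations with c or any narrower concept.
    Assumes at least two annotations, so that log(total) is non-zero.
    Concepts that are never used get no entry.
    """
    counts = Counter()
    for concept in annotations:
        counts.update(broader_chain(concept))
    total = len(annotations)
    return {c: -math.log(n / total) / math.log(total) for c, n in counts.items()}


def intrinsic_information_content():
    """IIC from structure alone: 1 - log(narrower(c) + 1) / log(number of concepts)."""
    narrower = Counter()
    for concept in BROADER:
        for ancestor in broader_chain(concept)[1:]:
            narrower[ancestor] += 1
    size = len(BROADER)
    return {c: 1 - math.log(narrower[c] + 1) / math.log(size) for c in BROADER}


def ic_difference(annotations):
    """Signed difference IC - IIC for every concept that occurs in the annotations."""
    ic = information_content(annotations)
    iic = intrinsic_information_content()
    return {c: ic[c] - iic[c] for c in ic}
```

Under these assumptions, a negative difference marks a concept that is used more often than its position in the hierarchy suggests, and a positive difference marks one that is used less often. Concepts with large differences in either direction are the ones a reviewer would inspect first.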


Although their results do not conclusively establish the need for thesaurus assessment, the problems they identified and corrected in the thesauri using their approach are significant.
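The precision figure behind these results can be sketched as follows. This is a per-document illustration that treats the manual annotation as the reference set; the function name and the keywords in the usage example are hypothetical, and the paper may aggregate precision over documents differently.

```python
def precision(automatic, manual):
    """Fraction of automatically assigned keywords that also appear in the manual annotation."""
    automatic, manual = set(automatic), set(manual)
    if not automatic:
        # No automatic keywords were assigned; report 0.0 rather than dividing by zero.
        return 0.0
    return len(automatic & manual) / len(automatic)
```

For example, `precision({"inflation", "trade", "tax", "labour"}, {"inflation", "trade", "growth"})` gives 0.5, because two of the four automatically assigned keywords also appear in the manual annotation.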

 