Alejandro Valerio - Home Page

Alejandro Valerio

Ph.D. candidate
Computer Science Department
Indiana University

Office: 1-812-855-8702 (Lindley Hall 406)


Research

Information Extraction for Concept Map Knowledge Modeling

Working with Dr. David Leake on adapting and designing information extraction procedures to generate concept maps from documents automatically. Concept mapping is a knowledge representation tool designed for humans to organize and share information about a topic; it is not intended for the formal description of concepts (as in an ontology). Our work studies how to adapt existing techniques to process documents under these requirements.

These procedures can be used primarily to migrate sets of documents into preliminary concept maps, which facilitates integrating the documents' information into knowledge models. They can also be used to suggest new content to users or to associate documents with knowledge models.

This work is in the context of the Integrated Intelligent Support for Knowledge Capture, Refinement and Sharing project. The group at IU works together with the Institute for Human and Machine Cognition (IHMC) and the IHMC's CmapTools Project. The project is led by David Leake and Alberto Cañas. At IU, Ana Maguitman and Thomas Reichherzer have made significant contributions to this work.

I'm part of the NaN group, where we discuss our work periodically.

Online demo of the system to come.


Current Projects

Information Extraction for Concept Map Knowledge Modeling

See the description under Research above.

Sensor Agent Execution Environment - SAG3E

Together with Dr. Rick McMullen at the Knowledge Acquisition and Projection Lab, we designed an environment to implement, execute, and monitor processing agents in sensor networks. A set of interconnected agents processes and responds to data streams produced by sensors distributed across wide area networks. Each agent performs specific operations on its input stream and produces output data that may be processed in turn by other agents. These data streams are formatted in XML, which simplifies the interface with other systems. Agents can be managed by different authors and groups, which permits sharing sensor information in a non-intrusive way.
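The chaining of agents over XML streams described above can be illustrated with a minimal Python sketch. This is not SAG3E code: the agent names (threshold_agent, alert_agent), the reading and alert message formats, and the threshold value are all hypothetical, chosen only to show how one agent's output stream can feed another.

```python
import xml.etree.ElementTree as ET


def threshold_agent(stream, limit):
    """Hypothetical agent: forward only readings whose value exceeds `limit`."""
    for message in stream:
        reading = ET.fromstring(message)
        if float(reading.get("value")) > limit:
            yield message


def alert_agent(stream):
    """Hypothetical agent: turn each incoming reading into an XML alert message."""
    for message in stream:
        reading = ET.fromstring(message)
        alert = ET.Element(
            "alert", sensor=reading.get("sensor"), value=reading.get("value")
        )
        yield ET.tostring(alert, encoding="unicode")


# Assumed message format; the real SAG3E schema is not given on this page.
readings = [
    '<reading sensor="t1" value="18.5"/>',
    '<reading sensor="t2" value="31.0"/>',
]

# The output stream of one agent is the input stream of the next.
alerts = list(alert_agent(threshold_agent(readings, limit=30.0)))
for alert in alerts:
    print(alert)
```

Because each agent only consumes and produces XML messages, agents written by different groups could in principle be combined without knowing each other's internals.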


Past Projects

Opinion extraction from blogs at TREC

Participated with the WIDIT lab group, led by Dr. Kiduk Yang at the School of Library and Information Science, in the Blog Track opinion extraction task at TREC. The solution uses fusion to combine several modules that identify opinionated blog messages. The "Adjective-Verb" module identifies opinions using the density of subjective adjectives and verbs in the messages. During training, the module collects these subjective terms by comparing an initial seed set with terms in preclassified data. The comparison is made using distributional similarity and considers the syntactic role of words in sentences.

The system achieved the best performance in the blog retrieval task in TREC 2006.
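The density idea behind the "Adjective-Verb" module can be sketched in a few lines of Python. This is a simplified illustration, not the WIDIT implementation: the lexicon below is a hypothetical hand-picked set (the actual module learned its terms from training data), the threshold is arbitrary, and the sketch ignores part-of-speech tags and syntactic roles.

```python
import re

# Hypothetical lexicon for illustration only; the real module collected
# subjective adjectives and verbs during training.
SUBJECTIVE_TERMS = {"love", "hate", "awful", "great", "terrible", "amazing", "boring"}


def subjective_density(text, lexicon=SUBJECTIVE_TERMS):
    """Return the fraction of tokens in `text` that appear in `lexicon`."""
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    hits = sum(1 for token in tokens if token in lexicon)
    return hits / len(tokens)


def is_opinionated(text, threshold=0.1):
    """Flag a message as opinionated when its density reaches `threshold` (assumed value)."""
    return subjective_density(text) >= threshold


print(subjective_density("I love this great movie"))
print(is_opinionated("The meeting is at noon"))
```

In a fusion setting, a score like this would be one signal among several rather than a final decision.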

Netflix Challenge

Participated with Matt Whitehead on the Netflix Prize. The goal of this challenge was to develop an automatic method to accurately predict the ratings that Netflix users will assign to the movies they watch, based on historical data of user-movie ratings.

Python re-implementation of AutoSlog-TS to process news feeds

Python re-implementation of AutoSlog-TS, which was used to extract relevant factoids from everyday news.

Lexical references for CmapTools

Integration of lexical references for assisting concept map construction. Institute for Human and Machine Cognition, IHMC's CmapTools Project.

CBR to support distributed collaboration

Conversational CBR System for Supporting Distributed Collaboration. Knowledge Acquisition and Projection Lab, Pervasive Labs at Indiana University.


Areas of interest

Information Extraction, Information Retrieval, Natural Language Processing, Machine Learning, Knowledge Representation, Web Mining.

Also: Software Engineering, Programming Languages, Programming Language Implementation, Case-Based Reasoning.


Papers

A. Valerio, D. Leake, and A. J. Cañas. Automatically Associating Documents with Concept Map Knowledge Models. In Proceedings of the 33rd Latin American Conference in Informatics (CLEI 07), 2007.

K. Yang, N. Yu, A. Valerio, H. Zhang, and W. Ke. Fusion Approach to Finding Opinions in Blogosphere. In Proceedings of the International Conference on Weblogs and Social Media (ICWSM 07), 2007.
(best paper nominee)

K. Yang, N. Yu, A. Valerio, and H. Zhang. WIDIT in TREC Blog Track Opinion Task. In Proceedings of the 15th Text Retrieval Conference (TREC 2006), 2006.

A. Valerio and D. Leake. Jump-Starting Concept Map Construction with Knowledge Extracted From Documents. In Proceedings of the Second International Conference on Concept Mapping (CMC 2006), volume 1, pages 296-303, 2006.

S. Brenes and A. Valerio. Case Based Concept Map Topology Counselor. In Proceedings of the Second International Conference on Concept Mapping (CMC 2006), volume 2, pages 54-57, 2006.

D. Leake, S. Bogaerts, M. Evans, R. McMullen, M. Oder, and A. Valerio. Using Cases to Support Divergent Roles in Distributed Collaboration. In 18th International Florida Artificial Intelligence Research Society Conference (FLAIRS 2005), 2005.

A. J. Cañas, A. Valerio, J. Lalinde-Pulido, M. Carvalho, and M. Arguedas. Using WordNet for Word Sense Disambiguation to Support Concept Map Construction. In Proceedings of the 10th International String Processing and Information Retrieval Symposium (SPIRE 2003), pages 350-359, 2003. (extended version)


Bio

Academic History

Professional Experience

Other


Personal

Personal pictures   Blog

Pictures from Costa Rica

(The page banner shows Lago Chirripó, a Guanacaste tree, a frog from a rain forest, and a satellite photo of Golfo de Nicoya)

Jazz and instrumental Costa Rican music:
Grupo Malpaís
Editus
Manuel Obregón
Trombones de Costa Rica

If you haven't yet, check out PhD Comics.


Last modification: 8-Oct-2007