B651: Spring 2012
Natural Language Processing
Indiana University Computer Science Program
This course will examine a selection of topics in the field of natural language processing (computational linguistics). We will deal with analysis, generation, and translation at the level of word, sentence, and discourse and will consider both statistical and knowledge-based models. There will also be a brief introduction to spoken language technology.
Students should normally have taken at least one course in artificial intelligence or computational linguistics.
We will make considerable use of the Natural Language Toolkit (NLTK), a set of open-source Python modules, linguistic data, and extensive documentation, both in lectures and in homework assignments. If you are new to Python, Chapters 1 and 4 of the Natural Language Toolkit Book provide a good introduction. Note that the NLTK modules are written in Python 2 and will not run in Python 3.
Course readings will come from the Natural Language Toolkit Book (Bird, Klein, and Loper, 2009), overview papers, and recent articles from journals and conference proceedings. All readings will be available online. For students with no linguistics background, there will be remedial lectures and lecture notes on linguistic topics, as well as on topics from computational linguistics.