Syllabus

CS 601R: Statistical Natural Language Processing

Winter Semester 2003

Irene Langkilde Geary



Office: 3334 TMCB, Email: irenelg@cs.byu.edu, Phone: 2-3020, Hours: ???

Objectives:

Prerequisites: some ability to program

Textbook: Foundations of Statistical Natural Language Processing, by Christopher D. Manning and Hinrich Schütze, MIT Press, 1999.

Other Sources:
Speech and Language Processing, by Daniel Jurafsky and James H. Martin, Prentice Hall, 2000.
http://nlp.stanford.edu/fsnlp
http://www.cs.colorado.edu/~martin/slp.html

Grading Policy:
25% Lectures
10% Quizzes
50% Homework
15% Final Project

Last Day of Class: Monday, April 14th
Final Exam Date: Monday, April 21st, 11am-2pm

Tentative Schedule:

Week 1: Intro, linguistic basics (Ch 1&3)
Week 2: Corpora, Perl (Ch 4)
Week 3: Math Foundations (Ch 2)
Week 4: Words (Ch 5&6)

Topics:

Applications:
 machine translation
 dialogue
 information retrieval
 information extraction
 text categorization
 question answering
 summarization
 speech recognition
 speech synthesis

Subtasks:
 tagging
 parsing
 generation
 word sense disambiguation
 alignment
 clustering

Linguistic theory basics:
 morphology
 phonology
 syntax
 semantics
 discourse

Grammar theories:
 categorial
 systemic-functional
 meaning-text
 tree-adjoining
 HPSG

Techniques/Models:
 n-grams and HMMs
 FSAs and FSTs
 dynamic programming
 unification

Tools:
 taggers/parsers
 machine learners
 CMU Statistical Language Modeling Toolkit
 Carmel Finite State Transducer

Languages:
 LISP
 Perl
 Prolog