Please check homeworks and projects for updates. Homeworks are not finalized until the class before they are due. Projects are not final until the date on which they are marked "out."
- August 20: Welcome: History of NLP and modern applications. Slides
- Reading: Chapter 1 of Linguistic Fundamentals for NLP. You should be able to access this PDF for free from a Georgia Tech computer.
- August 22: Supervised learning: bag-of-words models and Naive Bayes (illustrative code sketch below). Notes
- Homework 1 due
- Project 1 out
- Reading: Chapters 0-0.3, 1-1.2 of LXMLS lab guide
- Optional reading: Survey on word sense disambiguation
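- Illustrative sketch (not part of the assigned materials): a minimal bag-of-words Naive Bayes classifier with add-alpha (Laplace) smoothing, roughly the model named in this lecture. The function names and toy documents are invented for illustration.

```python
from collections import Counter, defaultdict
import math

def train_nb(docs, labels, alpha=1.0):
    """Estimate log-priors and add-alpha smoothed log-likelihoods from tokenized docs."""
    vocab = {w for doc in docs for w in doc}
    label_counts = Counter(labels)
    word_counts = defaultdict(Counter)   # word_counts[y][w] = count of word w in class y
    for doc, y in zip(docs, labels):
        word_counts[y].update(doc)
    log_prior = {y: math.log(c / len(docs)) for y, c in label_counts.items()}
    log_lik = {}
    for y in label_counts:
        total = sum(word_counts[y].values()) + alpha * len(vocab)
        log_lik[y] = {w: math.log((word_counts[y][w] + alpha) / total) for w in vocab}
    return log_prior, log_lik, vocab

def predict_nb(doc, log_prior, log_lik, vocab):
    """Return argmax_y of log P(y) + sum_w log P(w | y); out-of-vocabulary words are skipped."""
    scores = {y: lp + sum(log_lik[y][w] for w in doc if w in vocab)
              for y, lp in log_prior.items()}
    return max(scores, key=scores.get)

# Toy example (invented data):
docs = [["good", "movie"], ["bad", "plot"], ["good", "plot"]]
labels = ["pos", "neg", "pos"]
model = train_nb(docs, labels)
print(predict_nb(["good"], *model))   # -> "pos"
```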
- August 27: Discriminative classifiers: perceptron and MIRA; word-sense disambiguation (illustrative perceptron sketch below). Notes on perceptron; Slides on WSD.
- Reading: Chapters 1.3-1.4 of LXMLS guide
- Reading: Parts 4-7 of log-linear models
- Optional reading: Passive-aggressive learning
- Optional reading: Exponentiated gradient training
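- Illustrative sketch (not part of the assigned materials): a plain multiclass perceptron over sparse feature dictionaries, in the spirit of the perceptron notes; averaging and MIRA's margin-based updates are not shown, and the toy word-sense features below are made up.

```python
from collections import defaultdict

def perceptron_train(data, classes, epochs=5):
    """Multiclass perceptron: on each mistake, add the example's features to the
    true class's weight vector and subtract them from the predicted class's."""
    weights = {y: defaultdict(float) for y in classes}
    for _ in range(epochs):
        for feats, y_true in data:            # feats: dict of feature -> value
            y_hat = perceptron_predict(weights, feats)
            if y_hat != y_true:
                for f, v in feats.items():
                    weights[y_true][f] += v
                    weights[y_hat][f] -= v
    return weights

def perceptron_predict(weights, feats):
    """Return the class whose weight vector scores the feature dictionary highest."""
    return max(weights, key=lambda y: sum(weights[y][f] * v for f, v in feats.items()))

# Toy word-sense disambiguation example (invented features):
data = [({"bank": 1, "river": 1}, "GEO"), ({"bank": 1, "loan": 1}, "FIN")]
w = perceptron_train(data, ["GEO", "FIN"])
print(perceptron_predict(w, {"river": 1}))   # -> "GEO"
```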
- August 29: Logistic regression and unsupervised learning; word sense clustering. Notes on logistic regression and EM.
- Homework 2 due
- Reading: Nigam et al.
- Optional reading: Expectation maximization
- Optional readings: Tutorial on EM, Word sense clustering
- September 3: More about EM; semi-supervised EM; language models: n-grams, smoothing, speech recognition (illustrative language-model sketch below). Slides on semi-supervised learning.
- Project 1 due
- Reading: Language modeling chapter by Michael Collins
- Optional reading: An empirical study of smoothing techniques for language models, especially sections 2.7 and 3 on Kneser-Ney smoothing.
- Optional reading: A hierarchical Bayesian language model based on Pitman-Yor processes. Requires some machine learning background.
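- Illustrative sketch (not part of the assigned materials): a bigram language model with add-alpha smoothing, the simplest smoothed n-gram baseline. The Kneser-Ney and Pitman-Yor methods in the optional readings are more sophisticated; the toy corpus below is invented.

```python
from collections import Counter
import math

def train_bigram_lm(sentences):
    """Collect unigram and bigram counts over sentences padded with <s> and </s>."""
    unigrams, bigrams = Counter(), Counter()
    for sent in sentences:
        toks = ["<s>"] + sent + ["</s>"]
        unigrams.update(toks)
        bigrams.update(zip(toks, toks[1:]))
    return unigrams, bigrams

def bigram_logprob(sent, unigrams, bigrams, alpha=1.0):
    """Add-alpha smoothed log P(sentence) = sum_i log P(w_i | w_{i-1})."""
    V = len(unigrams)                       # vocabulary size, including <s> and </s>
    toks = ["<s>"] + sent + ["</s>"]
    return sum(math.log((bigrams[(prev, cur)] + alpha) / (unigrams[prev] + alpha * V))
               for prev, cur in zip(toks, toks[1:]))

# Toy corpus (invented):
corpus = [["the", "cat", "sat"], ["the", "dog", "sat"]]
uni, bi = train_bigram_lm(corpus)
print(bigram_logprob(["the", "cat", "sat"], uni, bi))
```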
- September 5: Finite state automata, morphology, semirings
- Reading: Knight and May
- September 10: Finite state transduction, edit distance, finite state composition
- Reading: Chapter 2 of Linguistic Fundamentals for NLP.
- Reading: OpenFST slides
- Optional reading: Mohri and Pereira
- Homework 3 due
- September 12: Sequence labeling 1: part-of-speech tags, hidden Markov models, Viterbi, BIO encoding (illustrative Viterbi sketch below)
- Project 2 out
- Reading: Chapter 3 of LXMLS
- Optional reading: Tagging problems and hidden Markov models
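- Illustrative sketch (not part of the assigned materials): Viterbi decoding for a toy HMM tagger. The probability tables below are invented, and a real tagger would smooth the emission model for unseen words.

```python
import math

def viterbi(obs, states, log_start, log_trans, log_emit):
    """Return the most probable state sequence for obs under an HMM.
    log_start[s], log_trans[s][s'], and log_emit[s][o] are log-probabilities."""
    V = [{s: log_start[s] + log_emit[s].get(obs[0], float("-inf")) for s in states}]
    back = []
    for o in obs[1:]:
        col, ptr = {}, {}
        for s in states:
            best_prev = max(states, key=lambda p: V[-1][p] + log_trans[p][s])
            col[s] = V[-1][best_prev] + log_trans[best_prev][s] + log_emit[s].get(o, float("-inf"))
            ptr[s] = best_prev
        V.append(col)
        back.append(ptr)
    last = max(states, key=lambda s: V[-1][s])
    path = [last]
    for ptr in reversed(back):          # follow backpointers from the last column
        path.append(ptr[path[-1]])
    return list(reversed(path))

# Tiny POS-style example with made-up probabilities:
lg = math.log
states = ["D", "N"]
log_start = {"D": lg(0.8), "N": lg(0.2)}
log_trans = {"D": {"D": lg(0.1), "N": lg(0.9)}, "N": {"D": lg(0.4), "N": lg(0.6)}}
log_emit = {"D": {"the": lg(0.9), "dog": lg(0.1)}, "N": {"the": lg(0.1), "dog": lg(0.9)}}
print(viterbi(["the", "dog"], states, log_start, log_trans, log_emit))   # -> ['D', 'N']
```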
- September 17: Sequence labeling 2: discriminative structure prediction, conditional random fields
- Homework 4 due
- Reading: Conditional random fields
- Optional reading: CRF tutorial
- Optional reading: Discriminative training of HMMs
- September 19: Sequence labeling 3: the forward-backward algorithm and unsupervised POS induction
- Reading: Forward-backward
- Optional reading: Two decades of unsupervised POS tagging: how far have we come?
- September 24: Syntax and CFG parsing (illustrative CKY sketch below)
- Project 2 due
- Reading: Probabilistic context-free grammars
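- Illustrative sketch (not part of the assigned materials): probabilistic CKY for a PCFG in Chomsky normal form, filling a chart of best log-probabilities per span and nonterminal. The toy grammar is invented, and backpointers for recovering the tree are omitted.

```python
import math
from collections import defaultdict

def cky(words, lexicon, binary_rules):
    """Probabilistic CKY for a CNF PCFG.
    lexicon: {(A, word): prob}; binary_rules: {(A, B, C): prob} for rules A -> B C.
    Returns a chart mapping (i, j) to {A: best log-prob of A over words[i:j]}."""
    n = len(words)
    chart = defaultdict(dict)
    for i, w in enumerate(words):                         # width-1 spans from the lexicon
        for (A, word), p in lexicon.items():
            if word == w:
                chart[(i, i + 1)][A] = math.log(p)
    for length in range(2, n + 1):                        # wider spans from binary rules
        for i in range(n - length + 1):
            j = i + length
            for k in range(i + 1, j):
                for (A, B, C), p in binary_rules.items():
                    if B in chart[(i, k)] and C in chart[(k, j)]:
                        score = math.log(p) + chart[(i, k)][B] + chart[(k, j)][C]
                        if score > chart[(i, j)].get(A, float("-inf")):
                            chart[(i, j)][A] = score
    return chart

# Toy CNF grammar (invented probabilities):
lexicon = {("D", "the"): 1.0, ("N", "dog"): 0.5, ("N", "cat"): 0.5, ("V", "saw"): 1.0}
rules = {("NP", "D", "N"): 1.0, ("VP", "V", "NP"): 1.0, ("S", "NP", "VP"): 1.0}
chart = cky(["the", "dog", "saw", "the", "cat"], lexicon, rules)
print(chart[(0, 5)])   # best log-probs for the full-sentence span, e.g. {'S': log(0.25)}
```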
- September 26: Lexicalized parsing
- Homework 5 due
- Reading: Lexicalized PCFGs
- Optional reading: Accurate unlexicalized parsing
- October 1: Dependency parsing
- Reading: Characterizing the errors of data-driven dependency parsing models
- Optional reading: Eisner algorithm worksheet
- Optional reading: Short textbook on dependency parsing; the PDF should be free from a Georgia Tech computer.
- October 3: Grammar induction and alternative syntactic formalisms
- Homework 6 due
- Reading: The inside-outside algorithm
- Reading: Intro to CCG
- Optional reading: Corpus-based induction of linguistic structure
- Optional reading: Much more about CCG
- Optional reading: Joshi on LTAG
- Optional reading: Probabilistic disambiguation models for wide-coverage HPSG
- October 8: Midterm
- Project 3 out
- October 10: Midterm recap. Semi-supervised learning and domain adaptation.
- Reading: Jerry Zhu's survey
- Optional reading: Way more about semi-supervised learning
- October 11: Drop deadline
- October 15: Fall recess, no class
- October 17: Compositional semantics
- Project 3 due
- Reading: Manning: Intro to Formal Computational Semantics
- Optional reading: Learning to map sentences to logical form
- October 22: Shallow semantics
- Video: Pereira: Low-pass semantics
- October 24: Distributional semantics
- Homework 7 due
- Project 4 out
- Reading: Vector-space models, sections 1, 2, 4-4.4, 6
- Optional reading: Semantic compositionality through recursive matrix-vector spaces
- Optional reading: Vector-based models of semantic composition
- October 29: Anaphora resolution
- October 31: Coreference resolution
- Homework 8 due
- Reading: Multi-pass sieve
- Optional reading: Large-scale multi-document coreference
- November 5: Discourse structure
- Project 4 due
- Reading: Discourse structure and language technology
- Optional reading: Modeling local coherence; Sentence-level discourse parsing
- November 7: Dialogue structure
- Homework 9 due
- Reading: TBA
- November 12: Project proposal presentations
- November 14: Information extraction
- Homework 10 due
- Reading: Grishman, sections 1 and 4-6
- November 19: Phrase-based machine translation (illustrative alignment-model sketch below)
- Homework 11 due
- Reading: IBM models 1 and 2
- Optional reading: Statistical machine translation
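- Illustrative sketch (not part of the assigned materials): EM for IBM Model 1 word-translation probabilities t(f | e), the alignment model covered in the IBM models reading; the toy parallel sentences are invented.

```python
from collections import defaultdict

def ibm_model1(bitext, iterations=10):
    """EM for IBM Model 1 translation probabilities t(f | e).
    bitext: list of (foreign_sentence, english_sentence) token-list pairs;
    a NULL token is added to the English side so foreign words can align to nothing."""
    foreign_vocab = {f for fs, _ in bitext for f in fs}
    t = defaultdict(lambda: 1.0 / len(foreign_vocab))    # uniform initialization
    for _ in range(iterations):
        count = defaultdict(float)    # expected counts c(f, e)
        total = defaultdict(float)    # expected counts c(e)
        for fs, es in bitext:
            es = ["NULL"] + es
            for f in fs:
                z = sum(t[(f, e)] for e in es)            # normalizer over alignments of f
                for e in es:
                    p = t[(f, e)] / z                     # posterior that f aligns to e
                    count[(f, e)] += p
                    total[e] += p
        t = defaultdict(float, {(f, e): c / total[e] for (f, e), c in count.items()})
    return t

# Toy parallel data (invented):
bitext = [(["la", "maison"], ["the", "house"]),
          (["la", "fleur"], ["the", "flower"])]
t = ibm_model1(bitext)
print(t[("la", "the")])   # grows toward 1.0 as EM iterates
```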
- November 21: Syntactic machine translation
- Reading: Intro to Synchronous Grammars
- November 26: Multilingual learning
- Reading: Multisource transfer of delexicalized dependency parsers
- Optional reading: Cross-lingual word clusters; Climbing the tower of Babel
- November 28: Thanksgiving, no class
- December 3: Project presentations
- December 5: Project presentations
- Homework 12 due at the end of class