WorkStudy Spring 2006
Student: Arif Emre Caglar
Blog: Word Similarity
Schedule:
Week 1: (March 10)
Lexical semantics – Regina Barzilay, Lec 7
Learning similarity from corpora – Regina Barzilay, Lec 8
Reports: Lexical semantics, answers to questions
Week 2: (March 17)
WordNet
Probability Refresh Notes - http://ais.ku.edu.tr/course/8566/prob17.pdf
Statistical Learning Methods: http://aima.cs.berkeley.edu/newchap20.pdf - EM Algorithm
N-gram notes – Speech and Language Processing – Chapter 4 – N-grams
Reports: N-grams
Week 3: (March 24)
Chapters 2, 3, 9 - Statistical Language Learning – Eugene Charniak
Word Vectors and Search Engines - Infomap
Automatic Retrieval and Clustering of Similar Words – Dekang Lin, ACL 98
Class-Based n-gram Models of Natural Language – Brown et al.
Infomap and SRILM
Project Ideas due
Report: Vectors, Dekang Lin summary, Charniak
Week 4:
Probabilistic Language Modeling – Regina Barzilay, Lec03.pdf
Infomap practice
TREC Conference Proceedings: check QA, terabyte, blog track publications.
Lexical Attraction Models of Language – Deniz Yuret, clpaper.ps
A Model of Lexical Attraction and Repulsion – Beeferman et al.
Corpus linguistics – Wikipedia
Hidden Markov model – Wikipedia
Report:
Week 5:
Chapter 2 - Manning and Schütze
A Variable-Length Category-Based N-Gram Language Model – Niesler and Woodland
Report: