WorkStudy Spring 2006

Student: Arif Emre Caglar

Blog: Word Similarity

Schedule:

Week 1: (March 10)

  1. Lexical semantics – Regina Barzilay, Lec 7

  2. Learning similarity from corpora – Regina Barzilay, Lec 8


Reports: Lexical semantics, answers to questions


Week 2: (March 17)

  1. WordNet

  2. Probability Refresh Notes - http://ais.ku.edu.tr/course/8566/prob17.pdf

  3. Statistical Learning Methods: http://aima.cs.berkeley.edu/newchap20.pdf - EM Algorithm

  4. N-gram notes – Speech and Language Processing – Chapter 4 – N-grams

  5. Introduction to Computational Linguistics – wk05


Reports: N-grams

Week 3: (March 24)

  1. Chapters 2, 3, 9 - Statistical Language Learning – Eugene Charniak

  2. Word Vectors and Search Engines - Infomap

  3. Automatic Retrieval and Clustering of Similar Words – Dekang Lin, acl98

  4. DomainTextStructure.pdf

  5. Class-Based n-gram Models of Natural Language – Brown et al.

  6. Infomap and SRILM

  7. Project Ideas due

Report: Vectors, Dekang Lin summary, Charniak

Week 4:

  1. Probabilistic Language Modeling – Regina Barzilay, Lec03.pdf

  2. Infomap practice

  3. TREC Conference Proceedings: check QA, terabyte, blog track publications.

  4. Lexical Attraction Models of Language – Deniz Yuret, clpaper.ps

  5. A Model of Lexical Attraction and Repulsion – Beeferman et al.

  6. Corpus linguistics – Wikipedia

  7. Hidden Markov model – Wikipedia

Report:

Week 5:

  1. Chapter 2 - Manning and Schütze

  2. A Variable-Length Category-Based N-Gram Language Model – Niesler and Woodland

  3. Chapter 2 - Manning and

Report: