Phrase Attachment
Structural Ambiguity in NP / VP / PP
Topics 1 – 2
01
Topic
Theory + Code
Phrase Attachment Ambiguity Analysis
- NP / VP / PP attachment ambiguity in fragment sentences
- PP attachment problem: "I saw the man with the telescope"
- Coordinating multiple attachments: noun phrase fragments
- Allen (1994) Chapter 4 — attachment & semantic interpretation
- Hindle & Rooth (1993) lexical association scores for PP attachment
Libraries & Approach
nltk · spacy · numpy
Build a corpus-based PP attachment classifier using lexical association (Hindle & Rooth method). Count verb-PP vs noun-PP attachments in corpus. Visualise attachment preferences and decision boundaries.
⬛⬛⬜⬜ Intermediate
References + Interview
Classic NLP benchmark — PP attachment was a major research problem 1987–2000. Still relevant for low-resource languages without neural parsers.
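The lexical association idea above can be sketched in a few lines. The counts below are invented toy numbers, not real corpus statistics, and `lexical_association` is a hypothetical helper illustrating the Hindle & Rooth score: the log ratio of P(prep | verb) to P(prep | noun), with add-alpha smoothing.

```python
from math import log2

# Invented toy counts (not real corpus statistics): how often each
# preposition attaches to the verb vs. the candidate noun.
verb_prep = {("saw", "with"): 20, ("saw", "in"): 8}
noun_prep = {("man", "with"): 5, ("man", "in"): 12}
verb_total = {"saw": 100}
noun_total = {"man": 60}

def lexical_association(verb, noun, prep, alpha=0.5):
    """Hindle & Rooth style score: log2 P(prep|verb) / P(prep|noun),
    with add-alpha smoothing. Positive favours verb attachment."""
    p_v = (verb_prep.get((verb, prep), 0) + alpha) / (verb_total[verb] + alpha)
    p_n = (noun_prep.get((noun, prep), 0) + alpha) / (noun_total[noun] + alpha)
    return log2(p_v / p_n)

score = lexical_association("saw", "man", "with")
attachment = "verb" if score > 0 else "noun"
```

With these toy counts, "with" associates more strongly with "saw" than with "man", so the classifier prefers verb attachment (the telescope is the instrument of seeing).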
02
Topic
Theory + Code
Fragment Parsing: NP, VP & PP Chunks
- Sentence fragments and partial parses (Allen 1994 §4.2)
- NLTK RegexpParser for NP/VP/PP chunking
- IOB labelling: Inside-Outside-Begin scheme
- spaCy noun_chunks and verb phrase extraction
- Comparing chunking vs full parsing accuracy
Libraries & Approach
nltk · spacy · sklearn
Write RegexpParser grammars for NP, VP, PP chunks. Apply to news corpus. Compare IOB accuracy against CoNLL-2000 baseline. Show how fragment parsing feeds downstream semantic analysis.
⬛⬛⬜⬜ Intermediate
References + Interview
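A minimal cascaded chunking sketch with `nltk.RegexpParser`. The grammar below is illustrative, not the CoNLL-2000 reference grammar, and the sentence is pre-tagged so the example runs without downloading a tagger model.

```python
import nltk

# Illustrative chunk grammar for NP, PP and VP fragments. Rules apply in
# sequence, so PP and VP can reference the NP chunks built earlier.
grammar = r"""
  NP: {<DT|PRP\$>?<JJ>*<NN.*>+}
  PP: {<IN><NP>}
  VP: {<VB.*><NP|PP>*}
"""
parser = nltk.RegexpParser(grammar)

# Pre-tagged input so the example runs without a tagger model download.
tagged = [("I", "PRP"), ("saw", "VBD"), ("the", "DT"), ("man", "NN"),
          ("with", "IN"), ("the", "DT"), ("telescope", "NN")]
tree = parser.parse(tagged)
labels = [st.label() for st in tree.subtrees() if st.label() != "S"]
```

The partial parse recovers two NPs, a PP, and a VP without committing to a full sentence analysis, which is exactly the fragment-parsing idea from Allen §4.2.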
Lexical Relations
Relations Among Lexemes & Their Senses
Topics 3 – 4
03
Topic
Theory + Code
WordNet: Synsets, Hypernyms & Lexical Relations
- WordNet (Miller 1995) — synonym sets, lexical relations
- Synsets: synonymy, hypernymy, hyponymy, meronymy, holonymy
- Antonymy and entailment in verbs
- Traversing the WordNet hierarchy: lowest common hypernym
- Semantic similarity: path similarity, Wu-Palmer, Leacock-Chodorow
Libraries & Approach
nltk.corpus.wordnet · matplotlib
Explore WordNet synsets for polysemous words. Traverse hypernym chains. Compute all similarity measures. Visualise the noun hierarchy as a tree. Build a semantic similarity scorer for word pairs.
⬛⬜⬜⬜ Beginner
References + Interview
WordNet underpins NLTK's semantic toolkit. Understanding synsets is prerequisite for WSD, semantic similarity, and ontology-based NLP.
04
Topic
Theory + Code
Distributional Semantics & Word Vectors
- Distributional hypothesis: "you shall know a word by the company it keeps" (Firth 1957)
- Co-occurrence matrix construction from corpus
- PMI (Pointwise Mutual Information) — Cover & Thomas §2.3
- TF-IDF as a distributional similarity baseline
- Cosine similarity for word vector comparison
Libraries & Approach
numpy · sklearn · scipy
Build a co-occurrence matrix from Brown Corpus. Compute PMI-weighted vectors. Find nearest neighbours by cosine similarity. Contrast distributional similarity vs WordNet path similarity for same word pairs.
⬛⬛⬜⬜ Intermediate
References + Interview
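The pipeline above (co-occurrence counts, PPMI weighting, cosine neighbours) can be sketched end to end on a toy corpus standing in for the Brown Corpus:

```python
import numpy as np

# Toy corpus in place of the Brown Corpus, to keep the sketch self-contained.
corpus = "the cat sat on the mat the dog sat on the rug".split()
vocab = sorted(set(corpus))
idx = {w: i for i, w in enumerate(vocab)}

# Symmetric co-occurrence counts in a +/- 2 word window.
window = 2
C = np.zeros((len(vocab), len(vocab)))
for i, w in enumerate(corpus):
    lo, hi = max(0, i - window), min(len(corpus), i + window + 1)
    for j in range(lo, hi):
        if j != i:
            C[idx[w], idx[corpus[j]]] += 1

# Positive PMI weighting: max(0, log2 p(w,c) / (p(w) p(c))).
total = C.sum()
pw = C.sum(axis=1, keepdims=True) / total
pc = C.sum(axis=0, keepdims=True) / total
with np.errstate(divide="ignore"):          # log2(0) = -inf, clipped below
    ppmi = np.maximum(np.log2((C / total) / (pw * pc)), 0.0)

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

# "cat" and "dog" share contexts (sat, on), so their vectors are similar.
sim = cosine(ppmi[idx["cat"]], ppmi[idx["dog"]])
```

Clipping PMI at zero (positive PMI) is the usual fix for the negative-infinity values that unseen pairs would otherwise produce.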
Homonymy & Polysemy
Disambiguation, Limitations & Robust WSD
Topics 5 – 6
05
Topic
Theory + Code
Homonymy vs Polysemy: Detection & Analysis
- Homonymy: bank (river) vs bank (financial) — unrelated meanings
- Polysemy: window (glass pane) vs window (GUI widget) — related meanings
- Monosemy: words with a single sense
- WordNet synset count as polysemy proxy
- Semantic relatedness test: cross-sense similarity
Libraries & Approach
nltk.corpus.wordnet · matplotlib · pandas
Analyse synset count distributions in WordNet. Classify words as homonymous vs polysemous using inter-synset similarity threshold. Plot polysemy vs word frequency. Show limitations of naive synset counting.
⬛⬜⬜⬜ Beginner
References + Interview
06
Topic
Theory + Code
Limitations of WSD & Sense Granularity
- Sense enumeration problem: where do senses stop?
- Inter-annotator agreement (IAA) on WSD tasks
- SemEval WSD tasks — benchmark history
- Coarse vs fine-grained sense inventories
- The "most frequent sense" (MFS) baseline — surprisingly strong
Libraries & Approach
nltk · pandas · seaborn
Compute MFS baseline on a word sample. Measure how often MFS is correct. Simulate IAA disagreement. Show that WSD accuracy depends heavily on sense granularity — coarser = easier. Discuss SemEval evaluation methodology.
⬛⬛⬜⬜ Intermediate
References + Interview
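The MFS baseline itself is two lines of counting; the labels below are simulated, standing in for SemCor/SemEval gold annotations of one target word:

```python
from collections import Counter

# Simulated gold sense labels for one target word (invented data,
# standing in for SemCor/SemEval annotations).
gold = ["finance", "finance", "river", "finance", "river",
        "finance", "finance", "river", "finance", "finance"]

# Most-frequent-sense baseline: always predict the majority sense.
mfs, mfs_count = Counter(gold).most_common(1)[0]
accuracy = mfs_count / len(gold)
```

Because sense distributions are heavily skewed, this trivial baseline is hard to beat, which is exactly why any WSD system must be reported against it.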
WSD Algorithms
Dictionary-Based & Corpus-Based Robust WSD
Topics 7 – 8
07
Topic
Theory + Code
Lesk Algorithm & Dictionary-Based WSD
- Lesk (1986): sense = definition with most overlap with context
- Simplified Lesk vs Extended Lesk (Banerjee & Pedersen 2002)
- WordNet gloss overlap counting
- Context window size effect on accuracy
- Evaluation against SemEval gold annotations
Libraries & Approach
nltk.corpus.wordnet · nltk · numpy
Implement Simplified Lesk from scratch. Implement Extended Lesk (using gloss + examples + hypernym glosses). Evaluate on a set of target words with known senses. Compare window sizes. Visualise overlap scores per sense.
⬛⬛⬜⬜ Intermediate
References + Interview
08
Topic
Theory + Code
Yarowsky's Bootstrapping & One-Sense-Per-Discourse
- Yarowsky (1995): unsupervised WSD with seed words
- "One sense per collocation" and "one sense per discourse" principles
- Decision list learning from collocations
- Log-likelihood ratio for feature selection (Cover & Thomas §2)
- Information gain — Mitchell (1997) Ch. 3
Libraries & Approach
nltk · numpy · collections
Implement simplified Yarowsky-style decision list for "bank" disambiguation. Use log-likelihood ratio for feature ranking. Apply one-sense-per-discourse rule as post-processing. Measure accuracy against labelled examples.
⬛⬛⬛⬜ Advanced
References + Interview
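The decision-list core can be sketched as below. The labelled contexts are invented, standing in for the seed-bootstrapped data of the full Yarowsky procedure, and the smoothing constant is an illustrative choice:

```python
from collections import defaultdict
from math import log

# Invented labelled contexts for "bank" (standing in for data grown from
# seed collocations in the full bootstrapping loop).
data = [
    ("river", ["water", "shore", "fishing"]),
    ("river", ["water", "boat"]),
    ("money", ["loan", "deposit", "cash"]),
    ("money", ["loan", "interest"]),
]

counts = defaultdict(lambda: {"river": 0, "money": 0})
for sense, feats in data:
    for f in feats:
        counts[f][sense] += 1

def llr(f, alpha=0.1):
    """Smoothed log-likelihood ratio of the two senses given feature f."""
    c = counts[f]
    return log((c["river"] + alpha) / (c["money"] + alpha))

# Decision list: rules ordered by the magnitude of their log-likelihood ratio.
rules = sorted(counts, key=lambda f: abs(llr(f)), reverse=True)

def classify(features):
    for f in rules:               # first (strongest) matching rule decides
        if f in features:
            return "river" if llr(f) > 0 else "money"
    return "money"                # back off to the majority sense
```

One-sense-per-discourse then acts as post-processing: after classifying every occurrence in a document, relabel minority predictions to the document's majority sense.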
ML & Info Theory
Machine Learning Approach to WSD
Topic 9
09
Topic
Theory + Code
ML-Based WSD: Naive Bayes, Decision Trees & SVM
- WSD as supervised classification (Mitchell 1997 Ch. 6)
- Feature engineering: surrounding words, POS tags, collocations
- Naive Bayes for WSD (Gale, Church & Yarowsky 1992)
- Decision tree features (Mitchell 1997 Ch. 3) — information gain
- SVM + TF-IDF features for context-window classification
- Information theoretic measures: entropy, mutual information (Cover & Thomas)
Libraries & Approach
sklearn · nltk · numpy · pandas · matplotlib
Supervised WSD on SemCor corpus samples for "bank" and "plant". Feature extraction pipeline: context window words, POS tags, local collocations. Compare NB vs Decision Tree vs SVM. Feature importance analysis. Cross-validation evaluation.
⬛⬛⬛⬜ Advanced
References + Interview
Connects Module 3 ML (SVM, DT) to semantic NLP. Mitchell Ch. 3 decision trees + Ch. 6 Naive Bayes directly apply here.
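The classifier comparison can be sketched with a standard sklearn pipeline. The six training contexts are invented, standing in for SemCor instances of "bank"; a real run would use the corpus samples and the richer POS/collocation features listed above:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC
from sklearn.tree import DecisionTreeClassifier

# Tiny hand-labelled contexts for "bank" (invented, standing in for SemCor).
X_train = [
    "deposit money in the bank account",
    "the bank raised interest rates on loans",
    "cash a cheque at the bank branch",
    "we walked along the river bank",
    "fishing from the muddy bank of the stream",
    "the bank of the river flooded",
]
y_train = ["finance", "finance", "finance", "river", "river", "river"]

test_context = ["open a savings account at the bank"]
preds = {}
for clf in (MultinomialNB(), DecisionTreeClassifier(random_state=0), LinearSVC()):
    model = make_pipeline(TfidfVectorizer(), clf)   # TF-IDF context features
    model.fit(X_train, y_train)
    preds[type(clf).__name__] = model.predict(test_context)[0]
```

Wrapping the vectoriser and classifier in one pipeline keeps feature extraction inside cross-validation, which matters once the evaluation moves to real SemCor folds.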