MTech NLP · Module 4 · Coding Topics Map

Semantics, Lexical Relations
& Word Sense Disambiguation

Module 4 coding outline: from phrase attachment analysis and WordNet-based lexical relations to homonymy/polysemy, robust WSD systems, and machine-learning approaches — grounded in Allen (1994), Mitchell (1997) and Cover & Thomas information theory.

9 Coding Topics · 5 Categories · 2 Colab Notebooks
Allen 1994 — NLU · Mitchell 1997 — ML · Cover & Thomas — Info Theory · WordNet (Miller 1995) · Lesk 1986 — WSD · Yarowsky 1995 — WSD
Phrase Attachment — Structural Ambiguity in NP / VP / PP (Topics 1–2)
01 — Phrase Attachment Ambiguity Analysis (Theory + Code)
  • NP / VP / PP attachment ambiguity in fragment sentences
  • PP attachment problem: "I saw the man with the telescope"
  • Coordinating multiple attachments: noun phrase fragments
  • Allen (1994) Chapter 4 — attachment & semantic interpretation
  • Hindle & Rooth (1993) lexical association scores for PP attachment
Libraries & Approach
nltk · spacy · numpy
Build a corpus-based PP attachment classifier using lexical association (Hindle & Rooth method). Count verb-PP vs noun-PP attachments in corpus. Visualise attachment preferences and decision boundaries.
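The Hindle & Rooth score can be sketched in a few lines; the attachment counts below are invented stand-ins for real corpus statistics:

```python
import math

# Toy attachment counts (hypothetical corpus statistics, not real data):
# how often each preposition attaches to a given verb vs. a given noun.
verb_prep_counts = {("saw", "with"): 8, ("saw", "in"): 3}
noun_prep_counts = {("man", "with"): 2, ("man", "in"): 5}
verb_totals = {"saw": 20}
noun_totals = {"man": 15}

def lexical_association(verb, noun, prep):
    """Hindle & Rooth-style LA score: log2 P(prep|verb) / P(prep|noun).
    Positive => prefer verb attachment; negative => noun attachment.
    Add-one smoothing avoids zero probabilities."""
    p_v = (verb_prep_counts.get((verb, prep), 0) + 1) / (verb_totals[verb] + 1)
    p_n = (noun_prep_counts.get((noun, prep), 0) + 1) / (noun_totals[noun] + 1)
    return math.log2(p_v / p_n)

score = lexical_association("saw", "man", "with")
attachment = "verb (VP)" if score > 0 else "noun (NP)"
print(f"LA score = {score:.2f} -> attach PP to {attachment}")
```

With these toy counts, "with" associates more strongly with *saw* than with *man*, so the telescope PP attaches to the verb — the instrumental reading.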
⬛⬛⬜⬜ Intermediate
References + Interview
Allen Ch.4 · PP attachment · Hindle & Rooth 1993 · Structural ambiguity
Classic NLP benchmark — PP attachment was a major research problem 1987–2000. Still relevant for low-resource languages without neural parsers.
02 — Fragment Parsing: NP, VP & PP Chunks (Theory + Code)
  • Sentence fragments and partial parses (Allen 1994 §4.2)
  • NLTK RegexpParser for NP/VP/PP chunking
  • IOB labelling: Inside-Outside-Begin scheme
  • spaCy noun_chunks and verb phrase extraction
  • Comparing chunking vs full parsing accuracy
Libraries & Approach
nltk · spacy · sklearn
Write RegexpParser grammars for NP, VP, PP chunks. Apply to news corpus. Compare IOB accuracy against CoNLL-2000 baseline. Show how fragment parsing feeds downstream semantic analysis.
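A starting-point RegexpParser grammar, applied to a hand-tagged sentence so no tagger model needs downloading (the grammar is illustrative, not the full CoNLL-2000 chunk definition):

```python
import nltk

# A minimal cascaded NP/PP/VP chunk grammar
grammar = r"""
  NP: {<DT>?<JJ>*<NN.*>+}     # determiner + adjectives + nouns
  PP: {<IN><NP>}              # preposition + NP
  VP: {<VB.*><NP|PP>*}        # verb + optional NP/PP complements
"""
parser = nltk.RegexpParser(grammar)

# Pre-tagged tokens, so the example runs without any corpus data
tagged = [("I", "PRP"), ("saw", "VBD"), ("the", "DT"), ("man", "NN"),
          ("with", "IN"), ("the", "DT"), ("telescope", "NN")]
tree = parser.parse(tagged)
print(tree)
```

Because the stages run in order, the PP and VP rules can reference the NP chunks built by the first stage — the same cascading idea used for fragment parsing.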
⬛⬛⬜⬜ Intermediate
References + Interview
IOB scheme · Chunking vs parsing · Fragment semantics · CoNLL-2000
Lexical Relations — Relations Among Lexemes & Their Senses (Topics 3–4)
03 — WordNet: Synsets, Hypernyms & Lexical Relations (Theory + Code)
  • WordNet (Miller 1995) — synonym sets, lexical relations
  • Synsets: synonymy, hypernymy, hyponymy, meronymy, holonymy
  • Antonymy and entailment in verbs
  • Traversing the WordNet hierarchy: lowest common hypernym
  • Semantic similarity: path similarity, Wu-Palmer, Leacock-Chodorow
Libraries & Approach
nltk.corpus.wordnet · matplotlib
Explore WordNet synsets for polysemous words. Traverse hypernym chains. Compute all similarity measures. Visualise the noun hierarchy as a tree. Build a semantic similarity scorer for word pairs.
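The similarity measures can be prototyped on a tiny hand-built hypernym tree before moving to the real WordNet graph (the words and IS-A links below are assumptions for illustration):

```python
# Tiny hand-built hypernym hierarchy (a stand-in for WordNet's noun tree)
hypernym = {
    "dog": "canine", "canine": "carnivore", "carnivore": "mammal",
    "cat": "feline", "feline": "carnivore",
    "mammal": "animal", "animal": "entity",
}

def hypernym_path(word):
    """Chain from a word up to the root, like synset.hypernym_paths()."""
    path = [word]
    while path[-1] in hypernym:
        path.append(hypernym[path[-1]])
    return path

def depth(word):
    return len(hypernym_path(word))  # root has depth 1

def lcs(w1, w2):
    """Lowest common subsumer: first shared node walking up from w1."""
    p2 = hypernym_path(w2)
    for node in hypernym_path(w1):
        if node in p2:
            return node

def path_similarity(w1, w2):
    """1 / (shortest path length between the words + 1)."""
    common = lcs(w1, w2)
    d = hypernym_path(w1).index(common) + hypernym_path(w2).index(common)
    return 1 / (d + 1)

def wup_similarity(w1, w2):
    """Wu-Palmer: 2 * depth(LCS) / (depth(w1) + depth(w2))."""
    return 2 * depth(lcs(w1, w2)) / (depth(w1) + depth(w2))

print(f"LCS(dog, cat)   = {lcs('dog', 'cat')}")
print(f"path similarity = {path_similarity('dog', 'cat'):.2f}")
print(f"Wu-Palmer       = {wup_similarity('dog', 'cat'):.2f}")
```

In the notebook, `wn.synsets('dog')[0].path_similarity(...)` and `.wup_similarity(...)` replace these toy functions; the formulas are the same.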
⬛⬜⬜⬜ Beginner
References + Interview
Miller 1995 WordNet · Synset · Wu-Palmer sim · Hypernymy · Allen Ch.7
WordNet underpins NLTK's semantic toolkit. Understanding synsets is prerequisite for WSD, semantic similarity, and ontology-based NLP.
04 — Distributional Semantics & Word Vectors (Theory + Code)
  • Distributional hypothesis: "a word is known by the company it keeps" (Firth 1957)
  • Co-occurrence matrix construction from corpus
  • PMI (Pointwise Mutual Information) — Cover & Thomas §2.3
  • TF-IDF as a distributional similarity baseline
  • Cosine similarity for word vector comparison
Libraries & Approach
numpy · sklearn · scipy
Build a co-occurrence matrix from Brown Corpus. Compute PMI-weighted vectors. Find nearest neighbours by cosine similarity. Contrast distributional similarity vs WordNet path similarity for same word pairs.
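A pure-Python sketch of sentence-level co-occurrence counting and PMI, on a four-sentence toy corpus standing in for the Brown Corpus:

```python
import math
from collections import Counter
from itertools import combinations

# Toy corpus (assumption: the notebook would use the Brown Corpus instead)
sentences = [
    "the river bank flooded".split(),
    "the bank loan was approved".split(),
    "the river flooded the town".split(),
    "the loan from the bank".split(),
]

word_counts = Counter()
pair_counts = Counter()
for sent in sentences:
    words = set(sent)
    word_counts.update(words)           # sentence-level presence counts
    for w1, w2 in combinations(sorted(words), 2):
        pair_counts[(w1, w2)] += 1      # co-occurrence within a sentence

n = len(sentences)

def pmi(w1, w2):
    """Pointwise mutual information: log2 P(w1,w2) / (P(w1) * P(w2))."""
    p_joint = pair_counts[tuple(sorted((w1, w2)))] / n
    if p_joint == 0:
        return float("-inf")
    return math.log2(p_joint / ((word_counts[w1] / n) * (word_counts[w2] / n)))

print(f"PMI(river, flooded) = {pmi('river', 'flooded'):.2f}")
print(f"PMI(bank, loan)     = {pmi('bank', 'loan'):.2f}")
print(f"PMI(river, loan)    = {pmi('river', 'loan'):.2f}")
```

High PMI flags pairs that co-occur more than chance predicts; the unobserved pair gets negative infinity, which is why PPMI (clipping at zero) is the usual weighting in practice.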
⬛⬛⬜⬜ Intermediate
References + Interview
PMI (Cover & Thomas) · Distributional hyp. · Co-occurrence matrix · Cosine similarity
Homonymy & Polysemy — Disambiguation, Limitations & Robust WSD (Topics 5–6)
05 — Homonymy vs Polysemy: Detection & Analysis (Theory + Code)
  • Homonymy: bank (river) vs bank (financial) — unrelated meanings
  • Polysemy: window (glass pane) vs window (GUI widget) — related meanings
  • Monosemy: words with a single sense
  • WordNet synset count as polysemy proxy
  • Semantic relatedness test: cross-sense similarity
Libraries & Approach
nltk.corpus.wordnet · matplotlib · pandas
Analyse synset count distributions in WordNet. Classify words as homonymous vs polysemous using inter-synset similarity threshold. Plot polysemy vs word frequency. Show limitations of naive synset counting.
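The threshold idea can be sketched with hand-picked scores standing in for real inter-synset Wu-Palmer similarities:

```python
# Toy inter-sense similarity scores (illustrative numbers, standing in for
# Wu-Palmer similarities computed between every pair of a word's synsets)
sense_similarities = {
    "bank":   [0.12, 0.10, 0.15],  # low cross-sense similarity -> homonymy
    "window": [0.58, 0.62],        # related senses -> polysemy
    "oxygen": [],                  # single sense -> monosemy
}

def classify(word, threshold=0.3):
    sims = sense_similarities[word]
    if not sims:
        return "monosemous"
    # If even the *most similar* sense pair falls below the threshold,
    # treat the senses as unrelated (homonymy)
    return "polysemous" if max(sims) >= threshold else "homonymous"

for w in sense_similarities:
    print(w, "->", classify(w))
```

The threshold value is exactly the weak point the topic asks students to probe: shifting it reclassifies borderline words, which motivates the granularity discussion in Topic 06.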
⬛⬜⬜⬜ Beginner
References + Interview
Homonymy vs polysemy · Sense granularity · Allen Ch.7 §7.2 · Fine vs coarse WSD
06 — Limitations of WSD & Sense Granularity (Theory + Code)
  • Sense enumeration problem: where do senses stop?
  • Inter-annotator agreement (IAA) on WSD tasks
  • SemEval WSD tasks — benchmark history
  • Coarse vs fine-grained sense inventories
  • The "most frequent sense" (MFS) baseline — surprisingly strong
Libraries & Approach
nltk · pandas · seaborn
Compute MFS baseline on a word sample. Measure how often MFS is correct. Simulate IAA disagreement. Show that WSD accuracy depends heavily on sense granularity — coarser = easier. Discuss SemEval evaluation methodology.
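A minimal MFS baseline on toy sense-annotated instances (the labels below are invented stand-ins for SemCor annotations):

```python
from collections import Counter

# Toy sense-annotated instances (hypothetical, standing in for SemCor labels)
labeled = {
    "bank":  ["finance", "finance", "finance", "river", "finance", "river"],
    "plant": ["factory", "flora", "flora", "flora", "factory"],
}

def mfs_accuracy(labels):
    """Accuracy of always predicting the most frequent sense."""
    mfs, count = Counter(labels).most_common(1)[0]
    return mfs, count / len(labels)

for word, labels in labeled.items():
    sense, acc = mfs_accuracy(labels)
    print(f"{word}: MFS = {sense!r}, baseline accuracy = {acc:.0%}")
```

Any proposed WSD system has to beat this number per word; skewed sense distributions are why the baseline is so strong.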
⬛⬛⬜⬜ Intermediate
References + Interview
MFS baseline · SemEval · IAA in WSD · Sense inventory
WSD Algorithms — Dictionary-Based & Corpus-Based Robust WSD (Topics 7–8)
07 — Lesk Algorithm & Dictionary-Based WSD (Theory + Code)
  • Lesk (1986): sense = definition with most overlap with context
  • Simplified Lesk vs Extended Lesk (Banerjee & Pedersen 2002)
  • WordNet gloss overlap counting
  • Context window size effect on accuracy
  • Evaluation against SemEval gold annotations
Libraries & Approach
nltk.corpus.wordnet · numpy
Implement Simplified Lesk from scratch. Implement Extended Lesk (using gloss + examples + hypernym glosses). Evaluate on a set of target words with known senses. Compare window sizes. Visualise overlap scores per sense.
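A from-scratch Simplified Lesk over a hand-written two-sense gloss inventory (the glosses and stopword list are illustrative; the notebook pulls real glosses from WordNet):

```python
# Toy gloss inventory (hand-written stand-ins for WordNet definitions)
glosses = {
    "bank_finance": "an institution that accepts deposits and lends money",
    "bank_river":   "sloping land beside a body of water such as a river",
}

STOPWORDS = {"a", "an", "the", "that", "and", "of", "as", "such", "beside"}

def simplified_lesk(context, glosses):
    """Pick the sense whose gloss shares the most content words
    with the context sentence (Lesk 1986, simplified form)."""
    ctx = {w.lower() for w in context.split()} - STOPWORDS
    best_sense, best_overlap = None, -1
    for sense, gloss in glosses.items():
        overlap = len(ctx & ({w for w in gloss.split()} - STOPWORDS))
        if overlap > best_overlap:
            best_sense, best_overlap = sense, overlap
    return best_sense

print(simplified_lesk("he sat on the bank of the river fishing", glosses))
print(simplified_lesk("she opened an account to deposit money at the bank", glosses))
```

Note how brittle raw overlap is: "deposit" in the context fails to match "deposits" in the gloss — lemmatisation and Extended Lesk's gloss expansion address exactly this.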
⬛⬛⬜⬜ Intermediate
References + Interview
Lesk 1986 · Gloss overlap · Dictionary-based · Extended Lesk · Allen Ch.7
08 — Yarowsky's Bootstrapping & One-Sense-Per-Discourse (Theory + Code)
  • Yarowsky (1995): unsupervised WSD with seed words
  • "One sense per collocation" and "one sense per discourse" principles
  • Decision list learning from collocations
  • Log-likelihood ratio for feature selection (Cover & Thomas §2)
  • Information gain — Mitchell (1997) Ch. 3
Libraries & Approach
nltk · numpy · collections
Implement simplified Yarowsky-style decision list for "bank" disambiguation. Use log-likelihood ratio for feature ranking. Apply one-sense-per-discourse rule as post-processing. Measure accuracy against labeled examples.
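A toy decision-list sketch in the spirit of Yarowsky (1995): collocation features ranked by smoothed log-likelihood ratio, trained on invented seed-labeled contexts for "bank":

```python
import math

# Toy sense-labeled contexts for "bank" (hypothetical seed-labeled data)
data = [
    ({"money", "deposit"}, "finance"),
    ({"money", "loan"}, "finance"),
    ({"loan", "interest"}, "finance"),
    ({"river", "water"}, "river"),
    ({"river", "fishing"}, "river"),
    ({"water", "muddy"}, "river"),
]

def build_decision_list(data, smoothing=0.1):
    """Rank collocation features by |log P(finance|f) / P(river|f)|
    (Yarowsky's log-likelihood criterion, add-alpha smoothed)."""
    feats = {f for ctx, _ in data for f in ctx}
    rules = []
    for f in feats:
        fin = sum(1 for ctx, s in data if f in ctx and s == "finance")
        riv = sum(1 for ctx, s in data if f in ctx and s == "river")
        llr = math.log((fin + smoothing) / (riv + smoothing))
        rules.append((abs(llr), f, "finance" if llr > 0 else "river"))
    return sorted(rules, reverse=True)

def classify(context, rules, default="finance"):
    # Fire the single highest-ranked rule whose feature is in the context
    for _, feat, sense in rules:
        if feat in context:
            return sense
    return default  # fall back to the most frequent sense

rules = build_decision_list(data)
print(classify({"loan", "approved"}, rules))
print(classify({"fishing", "rod"}, rules))
```

The "one rule fires" design is what makes decision lists interpretable: each prediction traces back to a single collocation, and the one-sense-per-discourse pass can then overrule low-confidence instances.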
⬛⬛⬛⬜ Advanced
References + Interview
Yarowsky 1995 · Decision list · One-sense-per-discourse · Log-likelihood · Bootstrapping
ML & Info Theory — Machine Learning Approach to WSD (Topic 9)
09 — ML-Based WSD: Naive Bayes, Decision Trees & SVM (Theory + Code)
  • WSD as supervised classification (Mitchell 1997 Ch. 6)
  • Feature engineering: surrounding words, POS tags, collocations
  • Naive Bayes for WSD (Gale, Church & Yarowsky 1992)
  • Decision tree features (Mitchell 1997 Ch. 3) — information gain
  • SVM + TF-IDF features for context-window classification
  • Information theoretic measures: entropy, mutual information (Cover & Thomas)
Libraries & Approach
sklearn · nltk · numpy · pandas · matplotlib
Supervised WSD on SemCor corpus samples for "bank" and "plant". Feature extraction pipeline: context window words, POS tags, local collocations. Compare NB vs Decision Tree vs SVM. Feature importance analysis. Cross-validation evaluation.
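A minimal supervised pipeline in the spirit of Gale, Church & Yarowsky (1992): bag-of-words context features plus Naive Bayes, on invented context windows (real SemCor samples would replace these):

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Toy context windows for "bank" (hypothetical stand-ins for SemCor samples)
contexts = [
    "deposit money savings account interest",
    "loan mortgage interest rate credit",
    "cash withdrawal teller branch account",
    "river water fishing muddy shore",
    "flood water river overflow shore",
    "grassy river slope water edge",
]
senses = ["finance", "finance", "finance", "river", "river", "river"]

# Bag-of-words features from the context window + Naive Bayes classifier
model = make_pipeline(CountVectorizer(), MultinomialNB())
model.fit(contexts, senses)

print(model.predict(["interest on the mortgage loan"]))
print(model.predict(["fishing by the muddy shore"]))
```

Swapping `MultinomialNB()` for `DecisionTreeClassifier()` or `LinearSVC()` in the same pipeline gives the model comparison the topic calls for; `TfidfVectorizer` substitutes directly for `CountVectorizer` in the SVM variant.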
⬛⬛⬛⬜ Advanced
References + Interview
Mitchell Ch.3,6 · Cover & Thomas · Gale et al. 1992 · Information gain · Supervised WSD · SemCor
Connects Module 3 ML (SVM, DT) to semantic NLP. Mitchell Ch. 3 decision trees + Ch. 6 Naive Bayes directly apply here.

// Semantics & WSD — Historical Timeline

1957
Firth — Distributional Hypothesis
"You shall know a word by the company it keeps." Foundation of all distributional semantics, PMI, word2vec.
1969
Quillian — Semantic Networks
First computational lexical knowledge base. Concepts connected by IS-A, PART-OF relations. Direct ancestor of WordNet.
1986
Lesk Algorithm — Dictionary-Based WSD
Simple but foundational: overlap between dictionary glosses determines word sense. Still a competitive baseline in 2024.
1991
Hindle & Rooth — PP Attachment from Corpus
Statistical approach to prepositional phrase attachment using lexical association (ACL 1991; journal version 1993). A pioneering corpus-driven approach to structural ambiguity.
1991
Miller et al. — WordNet 1.0
Princeton English lexical database. Synonyms grouped into synsets connected by semantic relations. Most-used NLP lexical resource, still actively maintained.
1992
Gale, Church & Yarowsky — Supervised WSD
Naive Bayes for WSD using surrounding words as features. Established supervised WSD as a classification task. Mitchell Ch. 6 Bayes directly applies.
1994
Allen — "Natural Language Understanding" textbook
Comprehensive treatment of attachment, lexical semantics, WSD in computational context. Core reference for this module.
1995
Yarowsky — One-Sense-Per-Discourse Unsupervised WSD
Bootstrapping from seed collocations. Decision list with log-likelihood. 96% accuracy rivalling supervised systems. Information theory core.
1997
Mitchell — Machine Learning textbook
Decision trees (Ch. 3) and Naive Bayes (Ch. 6) directly applicable to WSD feature classification. Information gain as splitting criterion.
1998
Senseval / SemEval Competitions Begin
Shared tasks for WSD evaluation (Senseval-1 in 1998, renamed SemEval in 2007). Established standard benchmarks. MFS baseline (~60%) proved hard to beat with classical methods.
2019
BERT Contextual Embeddings — WSD solved?
Hadiwinoto & Ng (2019): BERT representations give ~80% F1 on all-words WSD. But classical methods still matter for low-resource and interpretable systems.

// Teaching Sequence

Topic | Category | Difficulty | Key Library | Builds On | Reference
01 — PP Attachment Analysis | Phrase Attachment | ⬛⬛⬜⬜ | nltk, spacy | Module 3 CFG | Allen Ch.4, Hindle & Rooth
02 — NP/VP/PP Chunking (IOB) | Phrase Attachment | ⬛⬛⬜⬜ | nltk, sklearn | Topic 01 | Allen Ch.4, CoNLL-2000
03 — WordNet: Synsets & Similarity | Lexical Relations | ⬛⬜⬜⬜ | nltk.corpus.wordnet | Module 2 Morphology | Allen Ch.7, Miller 1995
04 — PMI & Distributional Semantics | Lexical Relations | ⬛⬛⬜⬜ | numpy, sklearn | Topic 03 | Cover & Thomas §2.3
05 — Homonymy vs Polysemy Analysis | Homo/Polysemy | ⬛⬜⬜⬜ | nltk.corpus.wordnet | Topic 03 | Allen Ch.7 §7.2
06 — WSD Limitations & MFS Baseline | Homo/Polysemy | ⬛⬛⬜⬜ | nltk, pandas | Topics 03–05 | Allen Ch.7, SemEval
07 — Lesk Algorithm (Dict-Based WSD) | WSD Algorithms | ⬛⬛⬜⬜ | nltk.corpus.wordnet | Topics 05–06 | Lesk 1986, Allen Ch.7
08 — Yarowsky Decision List WSD | WSD Algorithms | ⬛⬛⬛⬜ | nltk, numpy | Topics 06–07 | Yarowsky 1995, Cover & Thomas
09 — ML WSD: NB + DT + SVM | ML Approach | ⬛⬛⬛⬜ | sklearn, nltk | All above | Mitchell Ch.3,6; Gale et al. 1992