MTech NLP · Module 5 · Coding Topics Map
9 Topics · 5 Categories · 2 Notebooks

Discourse: Reference Resolution & Text Coherence

Module 5: coreference chains, binding constraints, Centering Theory, Hobbs' algorithm, ML-based coreference, entity grid coherence, perplexity scoring, and RST & TextTiling — grounded in Allen (1994), Mitchell (1997), and Cover & Thomas.

Allen 1994 Ch. 11–12 · Mitchell 1997 · Cover & Thomas · Hobbs 1978 · Grosz & Sidner 1986 · Mann & Thompson 1988
Coreference · Reference Phenomena & Syntactic/Semantic Constraints · Topics 1–2
01
Topic
Theory + Code
Reference Phenomena & Coreference Chain Detection
  • Referring expressions: definite NPs, pronouns, names (Allen Ch.11)
  • Anaphora, cataphora, exophora, bridging references
  • Building coreference chains: entity clusters across sentences
  • spaCy neural coreference + rule-based fallback
  • MUC, B³, CEAF evaluation metrics
Libraries
spacy · nltk · numpy · networkx
Detect and visualise coreference chains in news text. Build entity mention graph. Evaluate against gold annotations. Compare rule-based vs neural approaches.
⬛⬛⬜⬜ Intermediate
References
Allen Ch. 11 · Anaphora types · MUC metric · Entity cluster
Coreference is a prerequisite for QA, summarisation, and IE. The OntoNotes benchmark is still the standard evaluation.
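For the evaluation step, the link-based MUC metric can be sketched in plain Python over toy clusters (the function name and the sets-of-mention-strings representation are my own choices; the notebook version would wrap spaCy mention spans):

```python
def muc_score(key, response):
    """Link-based MUC metric (Vilain et al. 1995) over mention clusters.

    key, response: lists of sets of mention identifiers.
    Returns (precision, recall, F1).
    """
    def partition_score(clusters, other):
        num = den = 0
        for cluster in clusters:
            parts = set()
            for mention in cluster:
                # which cluster of `other` contains this mention?
                hit = next((i for i, o in enumerate(other) if mention in o), None)
                # unmatched mentions each count as their own singleton part
                parts.add(hit if hit is not None else ("singleton", mention))
            num += len(cluster) - len(parts)   # links preserved
            den += len(cluster) - 1            # links in this cluster
        return num / den if den else 0.0

    recall = partition_score(key, response)
    precision = partition_score(response, key)
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1
```

For example, splitting the gold chain {John, he, him} into {John, he} and {him} keeps one of two links, giving recall 0.5 at precision 1.0 — MUC counts missing links, not missing entities.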
02
Topic
Theory + Code
Syntactic & Semantic Constraints on Coreference
  • Chomsky (1981) Binding Theory: Principles A, B, C
  • Principle A: reflexives locally bound
  • Principle B: pronouns free in local domain
  • Principle C: R-expressions free everywhere
  • Gender, number, animacy agreement as filters
Libraries
spacy · nltk · networkx
Implement Binding Theory constraint checker. Filter candidate antecedents using gender/number/animacy agreement. Visualise constraint violations. Evaluate precision of constraint filtering.
⬛⬛⬜⬜ Intermediate
References
Chomsky Binding · Allen §11.2 · Principles A, B, C · Agreement filters
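A minimal sketch of the constraint checker, using a clause index as a crude stand-in for the binding domain (real Principles A–C require c-command over a parse tree; the feature-dict field names are illustrative, not a fixed schema):

```python
def filter_antecedents(anaphor, candidates):
    """Filter candidate antecedents for a reflexive or pronoun.

    anaphor/candidates carry 'gender', 'number', 'clause'; the anaphor
    additionally has 'kind' in {'reflexive', 'pronoun'}.
    """
    survivors = []
    for cand in candidates:
        # agreement filters: gender and number must match
        if cand["gender"] != anaphor["gender"]:
            continue
        if cand["number"] != anaphor["number"]:
            continue
        same_clause = cand["clause"] == anaphor["clause"]
        if anaphor["kind"] == "reflexive" and not same_clause:
            continue  # Principle A: reflexives must be bound locally
        if anaphor["kind"] == "pronoun" and same_clause:
            continue  # Principle B: pronouns must be free locally
        survivors.append(cand["text"])
    return survivors
```

So in "John saw himself", the reflexive admits the clause-mate "John", while "John saw him" excludes it — the core A/B contrast the project visualises.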
Pronoun Interpretation · Preferences in Pronoun Interpretation · Topics 3–4
03
Topic
Theory + Code
Centering Theory & Salience Preferences
  • Recency preference: most recently mentioned antecedent
  • Grammatical role: subject > object > adjunct
  • Centering Theory (Grosz, Joshi & Weinstein 1995)
  • Forward-looking centers (Cf), backward-looking center (Cb)
  • Transition types: Continue, Retain, Shift, Rough-Shift
Libraries
spacy · nltk · numpy
Implement Centering: compute Cf lists, identify Cb, classify discourse transitions. Show pronoun interpretation preferences align with centering transitions. Visualise entity salience.
⬛⬛⬛⬜ Advanced
References
Grosz et al. 1995 · Allen §11.3 · Cf/Cb centers · Transition types
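The Cf/Cb machinery fits in a few functions. A library-free sketch, assuming utterances arrive as (entity, role) pairs with roles already extracted (in the notebook these would come from spaCy dependency labels):

```python
RANK = {"subj": 0, "obj": 1, "other": 2}  # grammatical-role salience

def cf(utterance):
    """Forward-looking centers: entities ranked subject > object > other."""
    return [e for e, role in sorted(utterance, key=lambda m: RANK[m[1]])]

def cb(prev_cf, utterance):
    """Backward-looking center: highest-ranked element of Cf(U_{n-1})
    realised in U_n, or None."""
    realised = {e for e, _ in utterance}
    return next((e for e in prev_cf if e in realised), None)

def transition(cb_prev, cb_cur, cp_cur):
    """Classify the discourse transition (Brennan-Friedman-Pollard style).
    cp_cur is the preferred center: the top of the current Cf list."""
    same_cb = cb_prev is None or cb_cur == cb_prev
    if cb_cur == cp_cur:
        return "Continue" if same_cb else "Smooth-Shift"
    return "Retain" if same_cb else "Rough-Shift"
```

With U1 = "John gave Mary a lift" and U2 = "John read a book", Cb(U2) = "John" equals Cp(U2), so the pair classifies as Continue — the transition pronouns prefer.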
04
Topic
Theory + Code
Gender, Animacy & Pleonastic "it" Detection
  • Gender agreement: he/she/it/they morphological matching
  • Animacy hierarchy: human > animate > inanimate
  • Pleonastic "it": "It is raining" — non-referential
  • Naïve Bayes classifier for gender/animacy (Mitchell Ch.6)
  • Semantic class lookup from WordNet
Libraries
spacy · sklearn · nltk
Build gender/animacy NP classifier. Detect pleonastic "it" with syntactic patterns. Apply Naïve Bayes (Mitchell Ch.6) for referent type classification. Evaluate on annotated examples.
⬛⬛⬜⬜ Intermediate
References
Mitchell Ch. 6 · Allen §11.3 · Animacy hierarchy · Pleonastic it
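The syntactic-pattern half of the project can be sketched with two regular expressions — one for weather verbs, one for extraposition. This is a deliberately tiny detector with nothing like full coverage; the word lists and pattern shapes are illustrative assumptions:

```python
import re

WEATHER = r"rain|snow|hail|drizzl|storm"   # toy weather-verb stems
PATTERNS = [
    # "It is raining", "It was snowing"
    rf"\bit\s+(?:is|was)\s+\w*(?:{WEATHER})\w*",
    # extraposition: "It seems that ...", "It was easy to ..."
    r"\bit\s+(?:is|was|seems|appears)\s+(?:\w+\s+)?(?:that|to)\b",
]

def is_pleonastic(sentence):
    """True if the sentence's 'it' matches a non-referential pattern."""
    s = sentence.lower()
    return any(re.search(p, s) for p in PATTERNS)
```

Sentences the patterns miss fall through to the Naïve Bayes referent-type classifier described above.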
Algorithms · Pronoun Resolution Algorithms · Topics 5–6
05
Topic
Theory + Code
Hobbs' Naive Algorithm for Pronoun Resolution
  • Hobbs (1978): left-to-right tree-traversal for pronoun binding
  • BFS over parse tree nodes, skip non-NP categories
  • Morphological filter: gender, number, person match
  • Application to NLTK constituency parse trees
  • Evaluation vs naive recency baseline
Libraries
nltk · spacy · numpy
Implement Hobbs' tree-traversal from scratch over NLTK parse trees. Apply morphological filters. Test on example passages. Compare accuracy vs recency heuristic. Visualise traversal path on tree.
⬛⬛⬛⬜ Advanced
References
Hobbs 1978 · Allen §11.4 · Tree traversal · Naive algorithm
Achieves roughly 80% accuracy on simple pronouns and remains a classic baseline in coreference system evaluations.
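The core of the search can be sketched without the full nine-step algorithm: a left-to-right breadth-first proposal order over NP nodes, most recent sentence first, gated by an agreement filter. Plain tuple trees stand in for nltk.Tree here so the sketch is self-contained; the traversal logic carries over unchanged:

```python
from collections import deque

def np_nodes(tree):
    """Breadth-first, left-to-right traversal yielding NP subtrees.
    Trees are plain tuples: (label, child, ...); leaves are strings."""
    queue = deque([tree])
    while queue:
        node = queue.popleft()
        if isinstance(node, str):
            continue
        label, *children = node
        if label == "NP":
            yield node
        queue.extend(children)

def head_word(np):
    """Rightmost leaf as a crude head approximation."""
    node = np
    while not isinstance(node, str):
        node = node[-1]
    return node

def resolve(pronoun_feats, sentence_trees, agree):
    """Propose the first NP whose head satisfies the agreement
    predicate, searching the most recent sentence first."""
    for tree in reversed(sentence_trees):
        for np in np_nodes(tree):
            head = head_word(np)
            if agree(pronoun_feats, head):
                return head
    return None
```

On "John saw Mary", resolving "she" skips the subject NP (gender filter) and lands on the object — the same filter-then-propose behaviour the full algorithm exhibits.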
06
Topic
Theory + Code
ML Coreference: Feature-Based Pairwise Classifier
  • Soon, Ng & Lim (2001): pairwise binary coreference classifier
  • Features: distance, string match, gender, syntactic role, animacy
  • Decision Tree on mention pairs (Mitchell 1997 Ch.3)
  • Information gain for feature selection (Cover & Thomas)
  • Evaluation: MUC, B³, CEAF metrics
Libraries
sklearn · spacy · numpy · pandas
Build mention-pair coreference classifier. Extract features: sentence distance, syntactic role, string similarity, agreement. Cross-validate Decision Tree and Logistic Regression. Feature importance analysis using information gain.
⬛⬛⬛⬜ Advanced
References
Soon et al. 2001 · Mitchell Ch. 3 · Cover & Thomas · Pairwise model
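Two pieces of the pipeline fit in a short sketch: a mention-pair feature extractor (field names are illustrative) and information gain, the entropy-based split criterion behind ID3-style decision trees — the Cover & Thomas connection made concrete:

```python
from collections import Counter
from math import log2

def pair_features(m1, m2):
    """Illustrative mention-pair features; dict keys are assumptions."""
    return {
        "str_match": m1["text"].lower() == m2["text"].lower(),
        "gender_agree": m1["gender"] == m2["gender"],
        "same_or_adjacent_sent": abs(m1["sent"] - m2["sent"]) <= 1,
    }

def entropy(labels):
    """Shannon entropy H(Y) in bits."""
    n = len(labels)
    return -sum(c / n * log2(c / n) for c in Counter(labels).values())

def info_gain(feature_values, labels):
    """IG(Y; X) = H(Y) - H(Y | X) for a discrete feature."""
    n = len(labels)
    remainder = 0.0
    for v in set(feature_values):
        subset = [y for x, y in zip(feature_values, labels) if x == v]
        remainder += len(subset) / n * entropy(subset)
    return entropy(labels) - remainder
```

A feature that perfectly separates coreferent from non-coreferent pairs has gain equal to H(Y); an uninformative one has gain 0 — which is exactly what the feature-importance analysis ranks.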
Text Coherence · Coherence Modelling · Topics 7–8
07
Topic
Theory + Code
Entity Grid Model of Text Coherence
  • Barzilay & Lapata (2008): entity grid coherence model
  • Grid: sentences × entities, roles S/O/X/—
  • Transition probabilities from role sequences
  • Cover & Thomas: entropy of transition distribution
  • Coherence score: discriminate coherent vs shuffled text
Libraries
spacy · numpy · pandas · seaborn
Build entity grid from multi-sentence text. Compute entity transition probabilities. Score coherence. Compare coherent vs randomly shuffled paragraphs. Visualise grid as heatmap.
⬛⬛⬜⬜ Intermediate
References
Barzilay 2008 · Allen §12.1 · Entity grid · Cover & Thomas
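A stdlib-only sketch of the grid and its transition statistics, assuming entity roles have already been extracted per sentence (in the notebook, from spaCy dependency labels) and considering only length-2 transitions:

```python
from collections import Counter
from math import log2

def entity_grid(sentences):
    """sentences: list of {entity: role} dicts, roles 'S'/'O'/'X'.
    Returns one role column per entity, '-' where the entity is absent."""
    entities = sorted({e for sent in sentences for e in sent})
    return {e: [sent.get(e, "-") for sent in sentences] for e in entities}

def transition_probs(grid):
    """Relative frequencies of length-2 role transitions over all columns."""
    counts = Counter()
    for column in grid.values():
        for pair in zip(column, column[1:]):
            counts[pair] += 1
    total = sum(counts.values())
    return {t: c / total for t, c in counts.items()}

def transition_entropy(probs):
    """Shannon entropy of the transition distribution (bits) -
    shuffled text tends to spread mass over more transition types."""
    return -sum(p * log2(p) for p in probs.values())
```

Coherent text concentrates probability on transitions like (S, S) and (S, O); shuffling flattens the distribution, which the entropy term picks up.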
08
Topic
Theory + Code
Perplexity & Lexical Coherence Scoring
  • Halliday & Hasan (1976): lexical cohesion devices
  • Sentence-to-sentence semantic similarity as coherence proxy
  • Perplexity as coherence measure (Cover & Thomas §2)
  • Conditional entropy H(Sₙ|Sₙ₋₁) between sentences
  • Sentence ordering: coherent vs shuffled discrimination
Libraries
nltk · sklearn · numpy · scipy
Compute lexical chain density. Measure sentence cosine similarity via TF-IDF. Compute n-gram perplexity (Cover & Thomas). Discriminate coherent from shuffled text. Correlate with human judgements.
⬛⬛⬜⬜ Intermediate
References
Cover & Thomas §2 · Allen §12.2 · Perplexity · Lexical cohesion
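The perplexity component reduces to a few lines. A minimal sketch with add-one smoothing — a simplification of what an nltk language model would provide, kept stdlib-only here:

```python
from collections import Counter
from math import log2

def bigram_perplexity(train_tokens, test_tokens):
    """Add-one-smoothed bigram perplexity of test_tokens under a model
    estimated from train_tokens; lower means more predictable text."""
    vocab_size = len(set(train_tokens) | set(test_tokens))
    unigrams = Counter(train_tokens)
    bigrams = Counter(zip(train_tokens, train_tokens[1:]))
    log_prob, n = 0.0, 0
    for a, b in zip(test_tokens, test_tokens[1:]):
        # P(b | a) with Laplace smoothing
        p = (bigrams[(a, b)] + 1) / (unigrams[a] + vocab_size)
        log_prob += log2(p)
        n += 1
    return 2 ** (-log_prob / n)   # 2^{H}, H = average negative log prob
```

Scrambling a passage breaks its bigrams, so the shuffled version scores a strictly higher perplexity — the discrimination signal the project relies on.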
Discourse Structure · RST, Discourse Relations & Topic Segmentation · Topic 9
09
Topic
Theory + Code
RST, Discourse Connectives & TextTiling Segmentation
  • RST (Mann & Thompson 1988): Elaboration, Contrast, Cause, Evidence
  • Discourse connective classification: causal, temporal, adversative
  • TextTiling (Hearst 1997): topic segmentation via lexical similarity
  • Grosz & Sidner (1986): intentional vs attentional structure
  • WindowDiff evaluation metric for segmentation
Libraries
nltk · sklearn · numpy · spacy
Classify discourse connectives into RST relation types. Implement TextTiling for topic segmentation. Build discourse relation graph. Evaluate segmentation with WindowDiff. Compare to NLTK TextTilingTokenizer.
⬛⬛⬛⬜ Advanced
References
Mann & Thompson 1988 · Allen §12.3 · Hearst 1997 · Grosz & Sidner · RST relations
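The heart of TextTiling — cosine similarity across each sentence gap, with boundaries at the valleys — can be sketched before comparing against NLTK's TextTilingTokenizer. This simplification uses a fixed threshold on raw gap scores rather than Hearst's smoothed depth scores; window size k and the threshold are illustrative:

```python
from collections import Counter
from math import sqrt

def cosine(c1, c2):
    """Cosine similarity between two token-count vectors."""
    num = sum(c1[w] * c2[w] for w in set(c1) & set(c2))
    den = sqrt(sum(v * v for v in c1.values())) * sqrt(sum(v * v for v in c2.values()))
    return num / den if den else 0.0

def gap_scores(sentences, k=2):
    """Similarity across each inter-sentence gap, comparing up to
    k tokenised sentences on either side."""
    scores = []
    for i in range(1, len(sentences)):
        left = Counter(w for s in sentences[max(0, i - k):i] for w in s)
        right = Counter(w for s in sentences[i:i + k] for w in s)
        scores.append(cosine(left, right))
    return scores

def boundaries(scores, threshold=0.1):
    """Sentence indices starting a new segment: gaps whose similarity
    dips below the threshold."""
    return [i + 1 for i, s in enumerate(scores) if s < threshold]
```

On a toy text whose vocabulary switches from cats to cars, the middle gap scores zero and is the only boundary returned.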

// Discourse & Coreference Historical Timeline

1960s
Chomsky — Generative Grammar & Binding Roots
Syntactic structures that constrain pronoun interpretation. Formal basis for binding theory developed 1960s–1981.
1976
Halliday & Hasan — "Cohesion in English"
First systematic analysis of cohesion: reference, substitution, ellipsis, conjunction, lexical cohesion. Foundation of all coherence work.
1978
Hobbs — "Resolving Pronoun References"
Classic tree-traversal algorithm. ~80% accuracy on simple pronouns. Still used as a baseline today.
1981
Chomsky — Binding Theory: Principles A, B, C
Formal syntactic constraints on coreference. Reflexives (A), pronouns (B), R-expressions (C). Core of Module 5 syntactic constraints.
1986
Grosz & Sidner — "Attention, Intentions & Structure of Discourse"
Computational discourse theory: linguistic, intentional, and attentional components. CL 12(3). Foundation of Centering Theory.
1988
Mann & Thompson — Rhetorical Structure Theory
RST: hierarchical trees of discourse relations. Foundational for discourse parsing and NLG.
1994
Allen — "Natural Language Understanding" 2nd ed.
Ch. 11 (Reference) and Ch. 12 (Discourse) — comprehensive computational treatment. Core textbook for this module.
1995
Grosz, Joshi & Weinstein — Centering Theory
CL 21(2): entity salience and pronoun preferences. Cb, Cf, transition types. Unifies recency + grammatical role preferences.
2001
Soon, Ng & Lim — ML Coreference
Pairwise binary classifier with Decision Tree. First widely successful ML coreference approach. Mitchell Ch.3 directly applicable.
2008
Barzilay & Lapata — Entity Grid Coherence
Entity grid with syntactic roles predicts coherence; transition probabilities serve as the score. CL 34(1), 2008.
2018+
Neural End-to-End Coreference (Lee et al.)
End-to-end neural span scoring, later combined with BERT encoders. ~73% F1 on OntoNotes. Classical methods remain important for interpretable and low-resource systems.

// Teaching Sequence

Topic | Category | Difficulty | Core Library | Builds On | Reference
01 — Reference Phenomena & Coref Chains | Coreference | ⬛⬛⬜⬜ | spacy | Module 3 parsing | Allen Ch. 11
02 — Binding Theory Constraints | Coreference | ⬛⬛⬜⬜ | spacy, networkx | Topic 01 | Allen §11.2, Chomsky 1981
03 — Centering Theory | Pronoun Interp. | ⬛⬛⬛⬜ | spacy, nltk | 01–02 | Grosz et al. 1995, Allen §11.3
04 — Gender, Animacy & Pleonastic It | Pronoun Interp. | ⬛⬛⬜⬜ | spacy, sklearn | 01–03 | Mitchell Ch. 6, Allen §11.3
05 — Hobbs' Algorithm | Resolution | ⬛⬛⬛⬜ | nltk, spacy | 01–04 | Hobbs 1978, Allen §11.4
06 — ML Coreference Classifier | Resolution | ⬛⬛⬛⬜ | sklearn, spacy | 01–05 | Mitchell Ch. 3, Soon et al. 2001
07 — Entity Grid Coherence | Text Coherence | ⬛⬛⬜⬜ | spacy, numpy | 01–06 | Allen §12.1, Barzilay 2008
08 — Perplexity & Lexical Coherence | Text Coherence | ⬛⬛⬜⬜ | nltk, sklearn | 07 | Cover & Thomas §2, Allen §12.2
09 — RST & TextTiling | Discourse Struct. | ⬛⬛⬛⬜ | nltk, spacy | 07–08 | Allen §12.3, Mann & Thompson 1988