Graph-based approaches to word sense induction

Hope, David Richard (2015) Graph-based approaches to word sense induction. Doctoral thesis (PhD), University of Sussex.

PDF - Published Version
Download (1MB) | Preview


This thesis is a study of Word Sense Induction (WSI), the Natural Language Processing (NLP) task of automatically discovering word meanings from text. WSI is an open problem in NLP whose solution would be of considerable benefit to many other NLP tasks. It has, however, has been studied by relatively few NLP researchers and often in set ways. Scope therefore exists to apply novel methods to the problem, methods that may improve
upon those previously applied. This thesis applies a graph-theoretic approach to WSI. In this approach, word senses are identifed by finding particular types of subgraphs in word co-occurrence graphs. A number of original methods for constructing, analysing, and partitioning graphs are introduced, with these methods then incorporated into graphbased WSI systems. These systems are then shown, in a variety of evaluation scenarios, to return results that are comparable to those of the current best performing WSI systems. The main contributions of the thesis are a novel parameter-free soft clustering algorithm that runs in time linear in the number of edges in the input graph, and novel generalisations of the clustering coeficient (a measure of vertex cohesion in graphs) to the weighted case. Further contributions of the thesis include: a review of graph-based WSI systems that have been proposed in the literature; analysis of the methodologies applied in these systems; analysis of the metrics used to evaluate WSI systems, and empirical evidence to verify the usefulness of each novel method introduced in the thesis for inducing word senses.

Item Type: Thesis (Doctoral)
Schools and Departments: School of Engineering and Informatics > Informatics
Subjects: P Language and Literature > P Philology. Linguistics > P0098 Computational linguistics. Natural language processing
Depositing User: Library Cataloguing
Date Deposited: 13 Mar 2015 12:59
Last Modified: 28 Sep 2015 13:33

View download statistics for this item

📧 Request an update