Improved learning for hidden Markov models using penalized training

Keller, Bill; Lutz, Rudi

File(s) not publicly available

presentation

Posted on 2023-06-07, 21:41. Authored by Bill Keller and Rudi Lutz.

In this paper we investigate the performance of penalized variants of the forwards-backwards algorithm for training Hidden Markov Models. Maximum likelihood estimation of model parameters can result in over-fitting and poor generalization ability. We discuss the use of priors to compute maximum a posteriori estimates and describe a number of experiments in which models are trained under different conditions. Our results show that MAP estimation can alleviate over-fitting and help learn better parameter estimates.

History

Publication status

Published

ISSN

0302-9743

Publisher

Springer-Verlag

Volume

2464

Pages

8

Presentation Type

paper

Event name

AICS 02: Proceedings of the 13th Irish International Conference on Artificial Intelligence and Cognitive Science

Event location

Limerick, Ireland

Event type

conference

ISBN

3540441840

Department affiliated with

Informatics Publications

Notes

Originality: This was the first application within NLP of penalised training of Hidden Markov Models using Dirichlet priors over the emission probabilities of the model.

Rigour: The paper derived the necessary EM update rule incorporating the Dirichlet prior, and described empirical results comparing learning under this prior against learning under several other priors recommended in the literature. The data consisted of the first 5000 POS-tagged sentences from the BNC corpus, split into training and test sets. All results were obtained using 10-fold cross-validation, and were shown to be statistically significant.

Significance: The paper showed that the use of Dirichlet priors (with the Dirichlet distribution parameters set proportional to the normalised frequencies of the observation symbols in the training data) consistently enabled the learning of better-performing models. This result was robust across model sizes and variations in initial conditions. Additionally, the results cast doubt on claims by Brand that minimum entropy priors gave good results, suggesting the need for further work in this area. Since this paper was written, the use of Dirichlet priors (and more recently Dirichlet Process priors) has become widespread.

Outlet: This was a fully refereed (3 referees) international conference.
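As an illustration of the kind of update the notes describe, the following is a minimal sketch of a MAP re-estimation step for HMM emission probabilities under a Dirichlet prior. It is not taken from the paper. The function names, the `weight` parameter, and the choice of Dirichlet parameters of the form 1 + weight × (normalised symbol frequency) are assumptions made for this example; the paper's exact parameterisation is not given in this record. The expected counts would normally come from the E-step of the forwards-backwards algorithm and are supplied here directly.

```python
import numpy as np


def symbol_frequencies(observations, n_symbols):
    """Normalised frequency of each observation symbol in the training data."""
    counts = np.bincount(observations, minlength=n_symbols).astype(float)
    return counts / counts.sum()


def map_emission_update(expected_counts, freqs, weight=1.0):
    """MAP M-step for emission probabilities (illustrative, hypothetical helper).

    expected_counts: array of shape (n_states, n_symbols) holding the expected
        number of times each state emits each symbol (from the E-step).
    freqs: normalised symbol frequencies, shape (n_symbols,).
    weight: assumed prior strength; weight=0 recovers the maximum likelihood
        update (provided every state has a non-zero expected count).

    Assumption: Dirichlet parameters alpha_k = 1 + weight * freqs[k], so the
    MAP estimate adds pseudo-counts alpha_k - 1 = weight * freqs[k] to the
    expected counts of every state before normalising.
    """
    pseudo_counts = weight * freqs
    numer = expected_counts + pseudo_counts  # broadcasts over states
    return numer / numer.sum(axis=1, keepdims=True)


# Small usage example with made-up numbers.
obs = np.array([0, 0, 1, 0])
freqs = symbol_frequencies(obs, n_symbols=2)
counts = np.array([[3.0, 0.0], [1.0, 1.0]])
emissions = map_emission_update(counts, freqs, weight=4.0)
```

With the prior in place, a symbol that a state never emitted in the expected counts still receives a non-zero probability, which is one way penalised training can reduce over-fitting to sparse data.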

Full text available

No

Peer reviewed?

Yes

Editors

RFE Sutcliffe, M O'Neill, M Eaton, C Ryan, NJL Griffith

Legacy Posted Date

2012-02-06

Keywords

Uncategorised value

Licence

Copyright not evaluated
