Martin Schafföner

Estimation of Nonparametric Probability Density Functions with Applications to Automatic Speech Recognition

German title: Schätzung von nichtparametrischen Wahrscheinlichkeitsdichtefunktionen, mit Anwendungen auf Automatische Spracherkennung

Thesis

File type: PDF (.pdf)
Size: 1516 KB

DNB subject group
28 Computer science, data processing


Doctoral dissertation accepted by: Otto-von-Guericke-Universität Magdeburg, Faculty of Electrical Engineering and Information Technology, 2007-11-28

Abstract

During the last decade, a new learning paradigm called Structural Risk Minimization (SRM), derived from Statistical Learning Theory, has become widely studied in machine learning. Machines implementing SRM, e.g., Support Vector Machines (SVMs) and Kernel Fisher Discriminants (KFDs), have been used very successfully for solving pattern recognition and function regression problems. SRM's ability to simultaneously minimize the risk of error on training data and the complexity of a learning machine results in better generalization capability than plain Empirical Risk Minimization (ERM), especially if the amount of training data is limited.

The present work is devoted to applying SRM to the problem of probability density function (PDF) estimation. When modeling sequences of continuous-valued events using Hidden Markov Models (HMMs), e.g., in automatic speech recognition (ASR), PDFs are used to model the emission probabilities of the HMMs' states. This thesis investigates and develops methods to efficiently train sparse kernel PDF models by regression of the empirical cumulative distribution function (ECDF). A new method for obtaining a sparse approximation of the orthogonal least-squares regression solution by forward selection of relevant samples is presented, in which a novel memory-efficient thin update of the orthogonal decomposition is used. This method is evaluated on standard benchmark problems of up to five dimensions, showing superior performance to traditional parametric Gaussian Mixture Models (GMMs) and similar performance to the theoretically optimal, non-sparse Parzen windows PDF models.

However, it is found that this new method cannot be applied to the problem of estimating PDFs for ASR due to the complexity of the ECDF in high dimensions. Instead, posterior class probabilities calibrated from the outputs of binary discriminants such as SVMs or KFDs are turned into class-conditional PDFs using Bayes' rule. This approach is tested within a monophone HMM ASR system on the Resource Management task, where it significantly outperforms traditional HMM-GMM systems, especially on limited random samples, which demonstrates the new models' improved generalization ability on small-sample problems.

In order to realize these large-scale experiments, a novel machine learning software library is presented. Primary focus is put on fast computations, simplicity both in terms of expressing algorithms and extending functionality, and flexibility in order to properly appreciate algorithms' properties and advantages. The software library follows an object-oriented design and has been implemented in C++. For productivity, the library is equipped with fine-grained tracing, an object-oriented persistence model, transparent error handling, and parallelization on distributed-memory computer clusters.
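
The step from calibrated posteriors to class-conditional densities mentioned in the abstract can be illustrated with a minimal Python sketch. It is not taken from the thesis or its C++ library. The function names, the sigmoid parameters `a` and `b`, the normalization across classes, and the example scores and priors are all assumptions made for illustration. By Bayes' rule, p(x|c) = P(c|x) p(x) / P(c); since p(x) is the same for every class, the ratio P(c|x) / P(c) is proportional to the class-conditional density and can stand in for an HMM emission score.

```python
import math


def sigmoid_posterior(score, a=-1.0, b=0.0):
    """Map a raw discriminant output to a probability with a sigmoid.

    a and b are hypothetical calibration parameters; in practice they
    would be fitted on held-out data.
    """
    return 1.0 / (1.0 + math.exp(a * score + b))


def scaled_likelihoods(posteriors, priors):
    """Return P(c|x) / P(c) for every class c.

    By Bayes' rule this equals p(x|c) / p(x), i.e. the class-conditional
    density up to a factor that is shared by all classes.
    """
    if len(posteriors) != len(priors):
        raise ValueError("posteriors and priors must have the same length")
    total = sum(posteriors)
    if total <= 0.0:
        raise ValueError("posteriors must have a positive sum")
    # Assumption: one-vs-rest outputs need not sum to one, so renormalize.
    normalized = [p / total for p in posteriors]
    return [p / prior for p, prior in zip(normalized, priors)]


# Hypothetical discriminant outputs for three classes and their priors.
scores = [2.0, -1.0, -3.0]
priors = [0.5, 0.3, 0.2]
posteriors = [sigmoid_posterior(s) for s in scores]
print(scaled_likelihoods(posteriors, priors))
```

Dividing by the prior removes the class-frequency bias that a discriminatively trained posterior carries, which is why the result can be used where a generative emission density is expected.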

Supervisor: Prof. Dr. rer. nat. Andreas Wendemuth
Reviewer: Prof. Dr.-Ing. habil. Rüdiger Hoffmann

Upload: 2008-02-01
URL of thesis: http://diglib.uni-magdeburg.de/Dissertationen/2007/marschaffoener.pdf

Otto-von-Guericke-Universität Magdeburg, Universitätsbibliothek
Universitätsplatz 2, D-39106 Magdeburg