PhD Thesis
The official, final version of my PhD dissertation is available online. Its title is "Inference in Switching Linear Dynamical Systems Applied to Noise Robust Speech Recognition of Isolated Digits". The dissertation mainly addresses how to perform approximate inference in Switching and Bayesian Switching Linear Dynamical Systems. The resulting algorithms are applied to the automatic recognition of isolated spoken digits in noisy environments.
Inference in Switching Linear Dynamical Systems
Exact inference in Switching Linear Dynamical Systems (SLDSs) is known to be intractable. Various approximation algorithms are available, but all have drawbacks: some are numerically unstable, while others do not give satisfactory approximations or are too limited. For that reason we developed the Expectation Correction (EC) algorithm. The algorithm and its performance compared to other methods are described in a research report.
See the page related to the Expectation Correction algorithm for an optimised Matlab C Mex implementation of the EC algorithm as well as demonstration code.
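The intractability mentioned above can be made concrete with a small sketch. The Python code below is an illustration only and is not the thesis implementation: it assumes a scalar SLDS with hypothetical parameter names (`trans`, `a`, `q`, `c`, `r`), samples a sequence from it, and counts the Gaussian components that the exact filtered posterior would contain after T steps (one per possible switch path, so S^T for S switch states).

```python
import random


def sample_slds(T, trans, a, q, c, r, seed=0):
    """Sample T steps from a scalar SLDS (illustrative parameterisation).

    s_t ~ Categorical(trans[s_{t-1}])            discrete switch state
    h_t = a[s_t] * h_{t-1} + N(0, q[s_t])        hidden continuous state
    v_t = c[s_t] * h_t     + N(0, r[s_t])        observation
    """
    rng = random.Random(seed)
    num_states = len(trans)
    s = rng.randrange(num_states)  # assumption: uniform initial switch state
    h = 0.0                        # assumption: hidden state starts at zero
    switches, observations = [], []
    for _ in range(T):
        s = rng.choices(range(num_states), weights=trans[s])[0]
        h = a[s] * h + rng.gauss(0.0, q[s] ** 0.5)
        v = c[s] * h + rng.gauss(0.0, r[s] ** 0.5)
        switches.append(s)
        observations.append(v)
    return switches, observations


def exact_filter_components(num_states, T):
    """Number of Gaussians in the exact filtered posterior p(h_T | v_1..v_T).

    Each distinct switch path s_1..s_T contributes one Gaussian component.
    """
    return num_states ** T


# Two switch states with different dynamics and noise levels (made-up values).
trans = [[0.95, 0.05], [0.10, 0.90]]
switches, observations = sample_slds(
    T=50, trans=trans, a=[0.9, 0.5], q=[0.1, 1.0], c=[1.0, 1.0], r=[0.05, 0.05]
)
for T in (1, 10, 50):
    print(T, exact_filter_components(2, T))
```

Even with only two switch states, the component count doubles at every time step, which is why approximations that keep a bounded number of components per time step are needed.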
Noise Robust Speech Recognition
Real-world applications such as hands-free dialling in cars may have to deal with potentially very noisy environments. Existing state-of-the-art solutions to this problem use feature-based HMMs, with a preprocessing stage to clean the noisy signal. However, the effect that raw signal noise has on the induced HMM features is poorly understood, which limits the performance of the HMM system. An alternative to feature-based HMMs is to model the raw signal, which has the potential advantage that including an explicit noise model is straightforward. The dynamics of both the raw speech signal and the noise can be jointly modelled by using a Switching Linear Dynamical System (SLDS). Experiments have shown that, under noisy conditions, the SLDS significantly outperforms a state-of-the-art feature-based HMM on an isolated digit recognition task. The model and the experiments that were carried out are described in a research report.
See the pages related to the SAR-HMM and AR-SLDS for the source code used for the experiments with these models. The Perl script which generates the required files from the TIDIGITS database can be executed as follows:
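As a rough illustration of modelling the raw signal with an explicit noise term, the sketch below generates a clean signal from a switching autoregressive (AR) process and then adds white Gaussian noise to obtain the observed signal. It is a toy example under stated assumptions, not the model or code from the thesis: the function name `noisy_switching_ar`, the AR coefficients, the transition matrix and the noise levels are all hypothetical.

```python
import random


def noisy_switching_ar(T, coeffs, trans, innov_std, noise_std, seed=0):
    """Generate a clean switching-AR signal and a noisy observation of it.

    coeffs[s] holds the AR coefficients used while the switch state is s
    (most recent sample first); trans[s] is the transition distribution.
    """
    rng = random.Random(seed)
    order = len(coeffs[0])
    history = [0.0] * order  # assumption: signal starts from silence
    s = 0                    # assumption: start in switch state 0
    clean, noisy = [], []
    for _ in range(T):
        s = rng.choices(range(len(coeffs)), weights=trans[s])[0]
        # AR prediction from past clean samples plus an innovation term.
        x = sum(c * h for c, h in zip(coeffs[s], history))
        x += rng.gauss(0.0, innov_std)
        history = [x] + history[:-1]
        clean.append(x)
        # Explicit noise model: additive white Gaussian noise on the raw signal.
        noisy.append(x + rng.gauss(0.0, noise_std))
    return clean, noisy


# Two stable AR(2) regimes (made-up coefficients).
coeffs = [[1.2, -0.5], [0.3, 0.2]]
trans = [[0.98, 0.02], [0.02, 0.98]]
clean, noisy = noisy_switching_ar(
    T=200, coeffs=coeffs, trans=trans, innov_std=0.1, noise_std=0.5
)
print(len(clean), len(noisy))
```

Because the noise enters as a separate additive term, changing the noise level only changes `noise_std` and leaves the signal dynamics untouched, which is the sense in which an explicit noise model is straightforward to include.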
./setup <root directory of the TIDIGITS database> <to directory>
All the files required to train and test the SAR-HMM and the AR-SLDS under clean conditions will be stored under <to directory>.