Switching Autoregressive HMM

Source Code

The code is written in Objective Caml. The linear algebra operations are done with the GNU Scientific Library, which is called from OCaml via the ocamlgsl wrapper. It can be compiled by issuing the following commands in a Unix shell:

tar -xf arhmm-1.2.tar.bz2
cd arhmm-1.2
make

Training

For example, to train a 10-state, 10th-order SAR-HMM, we first need to create and initialise a model:

./arhmm_init -s 10 -r 10 train.lst arhmm-initialised.dat

Here, -s 10 and -r 10 specify 10 states and a 10th-order autoregressive process. The file train.lst lists the files to use as training data, one file name per line. Each training file must contain a single column of numbers (i.e., one number per line) representing the sequence of signal samples. The last argument is the file where the initialised model will be stored.
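As an illustration of the input layout described above, here is a minimal Python sketch that writes a sequence of samples as a single column and builds a list file. The helper names write_signal and write_file_list are hypothetical, and the number format accepted by arhmm_init is not specified here, so the use of Python's default float formatting is an assumption.

```python
from pathlib import Path


def write_signal(path, samples):
    """Write one sample per line (single-column layout).

    Assumption: the tool accepts Python's default float formatting.
    """
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("w") as f:
        for x in samples:
            f.write(f"{float(x)!r}\n")
    return path


def write_file_list(list_path, data_paths):
    """Write a list file such as train.lst: one file name per line."""
    with open(list_path, "w") as f:
        for p in data_paths:
            f.write(f"{p}\n")
    return list_path
```

For instance, calling write_signal("train/1a.dat", samples) and then write_file_list("train.lst", ["train/1a.dat"]) would produce a list file of the kind passed to arhmm_init; the file name train/1a.dat is only an example.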

The model can then be trained.

./arhmm_train arhmm-initialised.dat train.lst arhmm-trained.dat

The first argument is the file containing the initialised model, the second is the list of training files and the third is the file where the trained model will be stored.
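When several models have to be trained, the two steps above can be scripted. The following Python sketch only assembles the argument lists shown in this section (the -s and -r options and the positional file arguments); the function names and the dry_run switch are hypothetical conveniences, not part of the package.

```python
import subprocess


def init_command(states, order, train_list, model_out):
    """Argument list for arhmm_init, mirroring the invocation shown above."""
    return ["./arhmm_init", "-s", str(states), "-r", str(order),
            str(train_list), str(model_out)]


def train_command(model_in, train_list, model_out):
    """Argument list for arhmm_train: initial model, training list, output."""
    return ["./arhmm_train", str(model_in), str(train_list), str(model_out)]


def train_model(states, order, train_list, init_file, trained_file,
                dry_run=True):
    """Initialise then train a model.

    With dry_run=True (the default) nothing is executed and the two
    commands are simply returned for inspection.
    """
    commands = [
        init_command(states, order, train_list, init_file),
        train_command(init_file, train_list, trained_file),
    ]
    if not dry_run:
        for cmd in commands:
            # check=True stops at the first step that exits with an error.
            subprocess.run(cmd, check=True)
    return commands
```

Calling train_model(10, 10, "train.lst", "arhmm-initialised.dat", "arhmm-trained.dat") returns the same two commands as above; passing dry_run=False runs them from the directory containing the compiled binaries.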

Testing

The accuracy of the trained models can be evaluated on a test dataset.

./arhmm_eval models.lst test.lst

Unlike the other commands, arhmm_eval takes a list of models as input rather than a single model file, because the goal of the evaluation is to compare the performance of a model against that of other models. The file models.lst contains one model file name per line.

Here is an example of the output produced by arhmm_eval.

test/1a.dat:
        3.404827e+00
        3.315523e+00

        Best is trained/arhmm-1.dat

test/1b.dat:
        3.651752e+00
        3.588489e+00

        Best is trained/arhmm-1.dat

For each utterance, the first line indicates the file considered, in this case a "one". The next two lines give the log-likelihood of the utterance under each model, in the same order as in models.lst. The last line indicates, in human-readable form, the most likely model. In this example both utterances have been correctly identified as a "one".
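To summarise results over many utterances, the output can be post-processed. The Python sketch below parses text laid out like the example above and counts how often each model is reported as best. The layout is inferred from that example alone, and the function names parse_eval_output and count_best are hypothetical.

```python
from collections import Counter


def parse_eval_output(text):
    """Parse arhmm_eval-style output into a list of per-utterance dicts.

    Assumed layout (taken from the example): a line ending in ':' names
    the utterance, numeric lines are log-likelihoods, and a line starting
    with 'Best is ' names the most likely model.
    """
    results = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("Best is "):
            if current is not None:
                current["best"] = line[len("Best is "):]
        elif line.endswith(":"):
            current = {"utterance": line[:-1], "scores": [], "best": None}
            results.append(current)
        elif current is not None:
            current["scores"].append(float(line))
    return results


def count_best(results):
    """Count how many utterances each model won."""
    return Counter(r["best"] for r in results if r["best"] is not None)
```

The text to parse can be obtained by redirecting the output of arhmm_eval to a file and reading it back.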

Trained TI-DIGITS Models

The models obtained after training on the single digit utterances of the TI-DIGITS database are available. There is one model for each of the eleven digits (0-9 and "oh").