After the first phase of training with the small data set, the
segmentations produced by the algorithm appear essentially random, as
Figure 7.5 shows. The model has not even learnt to separate the
speech signal from the silence at the beginning and end of the data
segments.
Figure 7.5:
An example of the segmentation given by the algorithm after the
first phase of learning. The states `' and `' correspond
to silence at the beginning and the end of the utterance. The
first subfigure shows the marginal probabilities of the HMM
states for each sample. The second subfigure shows the data,
the third shows the continuous hidden states and the last shows
the innovation processes. The HMM does its segmentation solely
based on the values of the innovation process, i.e. the last
subfigure. The word in the figure is ``VASEN'', meaning ``left''.
After the full training the segmentations appear rather good, as
Figures 7.6 and 7.7 show. This is a
very encouraging result, considering that the segmentations are
performed using only the innovation process (the last subfigure) of
the NSSM, which consists mostly of what the other parts of the model
leave unexplained. The results would presumably be significantly
better with a model that gives the HMM a larger part in predicting
the data.
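The per-sample marginal state probabilities plotted in the first subfigure of each figure are the kind of quantity a forward-backward pass produces. The following is a minimal sketch of that computation, not the implementation used in this work. It assumes a three-state left-to-right HMM (silence, speech, silence) whose states emit zero-mean Gaussian innovations with state-specific standard deviations. The function name `state_marginals`, the transition matrix, the standard deviations and the synthetic innovation sequence are all illustrative assumptions.

```python
import numpy as np

def state_marginals(log_lik, trans, init):
    """Posterior marginals p(state_t | all samples) via scaled forward-backward.

    log_lik: (T, K) log-likelihood of each innovation sample under each state
    trans:   (K, K) transition matrix, trans[i, j] = p(next = j | current = i)
    init:    (K,)   initial state distribution
    """
    T, K = log_lik.shape
    # Subtract the per-sample maximum before exponentiating; the constant
    # cancels because every step below is renormalised.
    lik = np.exp(log_lik - log_lik.max(axis=1, keepdims=True))

    # Forward pass (filtered state probabilities).
    alpha = np.zeros((T, K))
    alpha[0] = init * lik[0]
    alpha[0] /= alpha[0].sum()
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ trans) * lik[t]
        alpha[t] /= alpha[t].sum()

    # Backward pass, rescaled at each step for numerical stability.
    beta = np.ones((T, K))
    for t in range(T - 2, -1, -1):
        beta[t] = trans @ (lik[t + 1] * beta[t + 1])
        beta[t] /= beta[t].sum()

    gamma = alpha * beta
    return gamma / gamma.sum(axis=1, keepdims=True)

# Hypothetical innovation sequence: quiet, loud, quiet (alternating sign).
amp = np.concatenate([np.full(20, 0.1), np.full(30, 2.0), np.full(20, 0.1)])
innov = amp * np.where(np.arange(amp.size) % 2 == 0, 1.0, -1.0)

# Assumed emission model: zero-mean Gaussians with these standard deviations.
std = np.array([0.2, 2.0, 0.2])
log_lik = -0.5 * np.log(2 * np.pi * std**2) - innov[:, None]**2 / (2 * std**2)

# Assumed left-to-right structure: silence -> speech -> silence.
trans = np.array([[0.9, 0.1, 0.0],
                  [0.0, 0.9, 0.1],
                  [0.0, 0.0, 1.0]])
init = np.array([1.0, 0.0, 0.0])

gamma = state_marginals(log_lik, trans, init)  # one row of marginals per sample
labels = gamma.argmax(axis=1)                  # most probable state per sample
```

Plotting the rows of `gamma` against the sample index gives a picture of the same kind as the first subfigure. Taking the per-sample argmax is only one way to read off a segmentation, and it hides the alternative paths that the marginals can reveal.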
Figure 7.6:
An example of the segmentation given by the algorithm after
complete learning. The data and the meanings of the different
parts of the figure are the same as in
Figure 7.5. The results are significantly better,
though not yet perfect.
Figure 7.7:
Another example of segmentation given by the algorithm after
complete learning. The meanings of the different parts of the
figure are the same as in Figures 7.5 and
7.6. The figure illustrates the segmentation of
a longer word. The result shows several relatively probable
paths, not just one as in the previous figures. The word in the
figure is ``POHJANMAALLA''. Double phonemes are treated as a single
phoneme in the segmentation.
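As a small illustration of treating double phonemes as one, the target label sequence for a word can be obtained by merging runs of identical adjacent letters. This sketch assumes that each letter of the written word stands for one phoneme and that "treated as one" means exactly this merging. The helper name `collapse_doubles` is hypothetical.

```python
from itertools import groupby

def collapse_doubles(word):
    """Merge each run of identical adjacent letters into a single label."""
    return [letter for letter, _ in groupby(word)]

# The word from Figure 7.7: the double "AA" and "LL" each become one label.
phonemes = collapse_doubles("POHJANMAALLA")
```

Under these assumptions the twelve-letter word yields ten segment labels.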
Antti Honkela
2001-05-30