An HDP-HMM for Systems with State Persistence (2008)

Authors

Abstract

The hierarchical Dirichlet process hidden Markov model (HDP-HMM) is a flexible, nonparametric model which allows state spaces of unknown size to be learned from data. We demonstrate some limitations of the original HDP-HMM formulation, and propose a sticky extension which allows more robust learning of smoothly varying dynamics. Using DP mixtures, this formulation also allows learning of more complex, multimodal emission distributions. We further develop a sampling algorithm that employs a truncated approximation of the DP to jointly resample the full state sequence, greatly improving mixing rates. Via extensive experiments with synthetic data and the NIST speaker diarization database, we demonstrate the advantages of our sticky extension, and the utility of the HDP-HMM in real-world applications.
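The sticky extension and the truncated approximation mentioned in the abstract can be illustrated with a small sketch. It assumes a finite truncation to L states, a global weight vector beta drawn from a symmetric Dirichlet, and per-state transition rows drawn from a Dirichlet whose concentration gets an extra mass kappa on the self-transition entry. The symbol names (gamma, alpha, kappa, L), the numeric values, and the helper functions are illustrative assumptions and do not come from this page; this is not the authors' sampler, only a draw from a prior of that general shape.

```python
import random


def sample_dirichlet(concentrations, rng):
    """Draw one Dirichlet sample by normalising independent Gamma draws."""
    # The tiny floor guards against a zero concentration, which gammavariate rejects.
    draws = [rng.gammavariate(max(c, 1e-12), 1.0) for c in concentrations]
    total = sum(draws)
    return [d / total for d in draws]


def sample_sticky_transitions(L=10, gamma=2.0, alpha=4.0, kappa=20.0, seed=0):
    """Sample global weights and a transition matrix under an L-state truncation.

    All parameter names and defaults are assumptions made for illustration.
    """
    rng = random.Random(seed)
    # Global state weights shared by every row of the transition matrix.
    beta = sample_dirichlet([gamma / L] * L, rng)
    rows = []
    for j in range(L):
        # Row j is centred on beta, with extra mass kappa on staying in state j.
        conc = [alpha * beta[k] + (kappa if k == j else 0.0) for k in range(L)]
        rows.append(sample_dirichlet(conc, rng))
    return beta, rows


def mean_self_transition(rows):
    """Average of the diagonal entries of the transition matrix."""
    return sum(rows[j][j] for j in range(len(rows))) / len(rows)


_, sticky = sample_sticky_transitions(kappa=20.0)
_, plain = sample_sticky_transitions(kappa=0.0)
print("mean self-transition, kappa=20:", round(mean_self_transition(sticky), 3))
print("mean self-transition, kappa=0: ", round(mean_self_transition(plain), 3))
```

Setting kappa to zero removes the self-transition bias, so comparing the two printed averages shows how strongly the extra mass favours state persistence under these assumed settings.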

Discussion

Yee Whye Teh, 2008/07/06 04:12

An interesting paper and good results.

Somebody asked about using supervisory data to train the speaker diarization system. I wonder if we can use the idea of state-splitting (Petrov et al., COLING-ACL 2006) to do this, where the supervisory labels are treated as a state sequence? Each state would then be split further into multiple states and modelled with an HDP-HMM.

William Cohen, 2008/07/06 09:23

I liked this paper also. I didn't ask, but an alternative approach would be to look at semi-Markov processes - where there is an explicit distribution over the time you stay in each state. That would be a little more powerful - since time-in-a-state doesn't have to be exponentially distributed - but it's not immediately obvious how one would implement an HDP-SMM.

Yee Whye Teh, 2008/07/06 09:43

A semi-Markov extension is interesting. It makes sense, as the current sticky HDP-HMM model assumes a geometric distribution over the length of time the model stays in each state, and this is not necessarily what we believe about our data. From an inference algorithm or model-building perspective I don't see any problems.
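To make the geometric dwell-time point concrete, here is a minimal sketch. It assumes a single state with a fixed self-transition probability p_stay; the value 0.9 and the function names are illustrative assumptions. Under that assumption the number of consecutive steps spent in the state follows a geometric distribution with mean 1 / (1 - p_stay), and the single most likely duration is always one step, whatever p_stay is.

```python
import random


def simulate_dwell_times(p_stay, n_runs, seed=0):
    """Simulate how many consecutive steps a chain stays in one state."""
    rng = random.Random(seed)
    dwell_times = []
    for _ in range(n_runs):
        steps = 1  # the chain has just entered the state
        while rng.random() < p_stay:  # stay with probability p_stay
            steps += 1
        dwell_times.append(steps)
    return dwell_times


def geometric_pmf(d, p_stay):
    """Probability of staying exactly d steps (d >= 1) before leaving."""
    return (p_stay ** (d - 1)) * (1.0 - p_stay)


p_stay = 0.9  # illustrative value, not taken from the paper
dwell = simulate_dwell_times(p_stay, n_runs=20000)
empirical_mean = sum(dwell) / len(dwell)
expected_mean = 1.0 / (1.0 - p_stay)
print("empirical mean dwell time:", round(empirical_mean, 2))
print("geometric mean dwell time:", round(expected_mean, 2))
```

A semi-Markov model would replace the implicit geometric_pmf with an explicit duration distribution per state, which is what allows durations that peak away from one step.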

Mike, 2010/07/28 09:31

It is great that it was extended to include the self-transition parameter. This makes a very big difference: combined with some other neat ideas, they were able to present some impressive performance on speaker diarization. I'm quite delighted with it.

paper/2008/305.txt · Last modified: 2009/05/24 18:48 (external edit)
 