Modeling Biological Sequences Using HTK

William Noble Grundy

Technical report prepared for
Entropic Research Laboratory, Inc.
March 1997


Entropic Research Laboratory's Hidden Markov Model Toolkit (HTK) is a software toolkit for designing and implementing state-of-the-art speech recognition systems. Recently, computational biologists have applied hidden Markov models (HMMs) to the task of characterizing families of related protein or DNA sequences. This report describes how HTK can be applied to this new problem domain. Using HTK, computational biologists can bring to bear many of the HMM techniques and algorithms developed by speech recognition researchers over the past two decades.