HMMSTR: a hidden Markov model for local sequence-structure correlations in proteins

Publication Type: Journal Article
Year of Publication: 2000
Authors: Bystroff, C., Thorsson, V., & Baker, D.
Journal: Journal of Molecular Biology
Date Published: 2000 Aug 4
Keywords: Amino Acid Motifs; Computational Biology; Computer Simulation; Databases, Factual; Markov Chains; Models, Molecular; Primary Publication; Protein Structure, Secondary; Proteins; Reproducibility of Results; Sequence Alignment

We describe a hidden Markov model, HMMSTR, for general protein sequence based on the I-sites library of sequence-structure motifs. Unlike the linear hidden Markov models used to model individual protein families, HMMSTR has a highly branched topology and captures recurrent local features of protein sequences and structures that transcend protein family boundaries. The model extends the I-sites library by describing the adjacencies of different sequence-structure motifs as observed in the protein database and, by representing overlapping motifs in a much more compact form, achieves a great reduction in parameters. The HMM attributes a considerably higher probability to coding sequence than does an equivalent dipeptide model, predicts secondary structure with an accuracy of 74.3 %, backbone torsion angles better than any previously reported method and the structural context of beta strands and turns with an accuracy that should be useful for tertiary structure prediction.
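
The comparison between an HMM and a dipeptide model can be illustrated with a minimal sketch. It is not the HMMSTR model itself: the three states, the two-letter alphabet (H for hydrophobic, P for polar), and every probability below are hypothetical toy values chosen only so the code runs. The sketch shows how the forward algorithm sums over all state paths of a branched HMM to score a sequence, and how a dipeptide (first-order Markov chain) model scores the same sequence.

```python
import math

NEG_INF = float("-inf")

def safe_log(p):
    """log(p), with log(0) mapped to -inf instead of raising."""
    return math.log(p) if p > 0 else NEG_INF

def log_sum_exp(values):
    """Numerically stable log(sum(exp(v))) over a list of log values."""
    m = max(values)
    if m == NEG_INF:
        return NEG_INF
    return m + math.log(sum(math.exp(v - m) for v in values))

def forward_log_prob(seq, start, trans, emit):
    """Log-probability of seq under an HMM, summed over all state paths.

    start[s] = P(first state is s)
    trans[s][t] = P(next state is t | current state is s); missing keys mean 0
    emit[s][x] = P(symbol x | state s)
    """
    states = list(start)
    alpha = {s: safe_log(start[s]) + safe_log(emit[s][seq[0]]) for s in states}
    for x in seq[1:]:
        alpha = {
            t: safe_log(emit[t][x])
            + log_sum_exp([alpha[s] + safe_log(trans[s].get(t, 0.0)) for s in states])
            for t in states
        }
    return log_sum_exp(list(alpha.values()))

def dipeptide_log_prob(seq, first, pair):
    """Log-probability of seq under a first-order Markov chain on residues."""
    lp = safe_log(first[seq[0]])
    for a, b in zip(seq, seq[1:]):
        lp += safe_log(pair[a][b])
    return lp

# Hypothetical toy parameters (not taken from the paper).
start = {"helix": 0.4, "strand": 0.3, "loop": 0.3}
trans = {
    "helix": {"helix": 0.8, "loop": 0.2},           # helix cannot jump to strand
    "strand": {"strand": 0.7, "loop": 0.3},         # strand cannot jump to helix
    "loop": {"helix": 0.3, "strand": 0.3, "loop": 0.4},  # loop branches to both
}
emit = {
    "helix": {"H": 0.5, "P": 0.5},
    "strand": {"H": 0.8, "P": 0.2},
    "loop": {"H": 0.2, "P": 0.8},
}
first = {"H": 0.5, "P": 0.5}
pair = {"H": {"H": 0.6, "P": 0.4}, "P": {"H": 0.4, "P": 0.6}}

if __name__ == "__main__":
    seq = "HHHHPPHPHP"
    print("HMM log-probability:      ", forward_log_prob(seq, start, trans, emit))
    print("dipeptide log-probability:", dipeptide_log_prob(seq, first, pair))
```

With toy parameters the two scores carry no biological meaning. A comparison of the kind the abstract reports would require both models to be trained on the same data and evaluated on held-out sequences.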

Alternate Journal: J. Mol. Biol.