Merja Oja, Jaakko Peltonen, Jonas Blomberg, and Samuel Kaski. Estimating human endogenous retrovirus activities in various tissues with a hidden Markov mixture model. Poster in Intelligent Systems for Molecular Biology & European Conference on Computational Biology 2007 (ISMB/ECCB). Vienna, Austria, July 21-25, 2007.

Human endogenous retroviruses (HERVs) are remnants of ancient retrovirus infections that now reside in human DNA. HERVs are interesting for two reasons: they can express viral genes in human tissues, and their presence in the genome may affect the function of nearby human genes.

In this work we analyze the expression of individual HERVs in several human tissues. Almost all previous studies on HERV expression report activities only for groups of HERVs. In our earlier work [1], we studied the overall activity of individual HERVs.

To find evidence of HERV expression, we use expressed sequence tags (ESTs). The number of ESTs available from a particular HERV gives evidence of its activity. However, the noise level in ESTs is larger than the sequence differences within a HERV group, so it may be hard to determine exactly which HERV an EST stems from. We use a hidden Markov mixture model [1] to handle the uncertainty in the EST-to-HERV matching. The model learns the relative activities of the HERVs from EST sequence data.
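
As an illustration of how relative activities can be learned when the origin of an EST is uncertain, the following is a minimal sketch and not the model of [1] itself. It assumes that the log-likelihood of each EST under each HERV's sub-model has already been computed (here a hypothetical matrix log_lik) and re-estimates only the mixture weights with expectation-maximization. The function name, parameters, and toy numbers are illustrative assumptions.

```python
import math

def estimate_activities(log_lik, n_iter=200, tol=1e-9):
    """Estimate mixture weights (relative activities) with EM.

    log_lik[i][k] is the log-likelihood of EST i under the sub-model of
    HERV k. These values are assumed to be given (hypothetical input);
    computing them with a hidden Markov model is outside this sketch.
    """
    n_ests = len(log_lik)
    n_hervs = len(log_lik[0])
    weights = [1.0 / n_hervs] * n_hervs  # start from uniform activities
    for _ in range(n_iter):
        counts = [0.0] * n_hervs
        for row in log_lik:
            # E-step: posterior probability that each HERV produced this EST
            scores = [math.log(w) + ll if w > 0.0 else -math.inf
                      for w, ll in zip(weights, row)]
            top = max(scores)
            exps = [math.exp(s - top) for s in scores]  # stable normalization
            total = sum(exps)
            for k in range(n_hervs):
                counts[k] += exps[k] / total
        # M-step: new weights are the average responsibilities
        new_weights = [c / n_ests for c in counts]
        change = max(abs(a - b) for a, b in zip(new_weights, weights))
        weights = new_weights
        if change < tol:
            break
    return weights

# Toy data (made up): three ESTs clearly match HERV 0, one clearly matches
# HERV 1, and one matches both equally well.
toy_log_lik = [
    [-1.0, -50.0],
    [-1.0, -50.0],
    [-1.0, -50.0],
    [-50.0, -1.0],
    [-2.0, -2.0],
]
activities = estimate_activities(toy_log_lik)
```

In this toy example the ambiguous EST is not assigned to a single HERV. Its contribution is split in proportion to the current activity estimates, so the weights settle near 0.75 and 0.25.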

We study the expression of HML2 HERVs in various tissues. The HERVs were automatically detected from the human genome, and ESTs matching the HERVs were retrieved from dbEST with BLAST. We used eVoc ontologies to divide the ESTs into tissue-specific sets. The hidden Markov mixture model was learned separately for each tissue.
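
The division of matched ESTs into tissue-specific sets could be sketched as follows. This is only an illustration under several assumptions: the BLAST hits are in the standard 12-column tabular format with the EST as the query and the HERV as the subject, the eVoc tissue annotation is abstracted into a plain dictionary from EST identifier to tissue name, and the E-value threshold, identifiers, and function name are hypothetical.

```python
from collections import defaultdict

def group_hits_by_tissue(blast_lines, est_tissue, max_evalue=1e-10):
    """Return {tissue: {est_id: [herv_id, ...]}} from BLAST tabular hits.

    blast_lines: iterable of tab-separated lines in the 12-column tabular
        format (query id, subject id, ..., e-value in column 11).
    est_tissue: hypothetical mapping from EST id to a tissue label,
        standing in for an ontology-based annotation.
    max_evalue: assumed cutoff; hits with a larger e-value are dropped.
    """
    groups = defaultdict(lambda: defaultdict(list))
    for line in blast_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        est_id, herv_id = fields[0], fields[1]
        if float(fields[10]) > max_evalue:
            continue  # weak match
        tissue = est_tissue.get(est_id)
        if tissue is None:
            continue  # ESTs without a tissue annotation are skipped
        if herv_id not in groups[tissue][est_id]:
            groups[tissue][est_id].append(herv_id)
    return {tissue: dict(ests) for tissue, ests in groups.items()}

# Made-up example hits and annotations.
example_hits = [
    "EST1\tHERV_A\t98.5\t400\t6\t0\t1\t400\t100\t499\t1e-150\t700",
    "EST1\tHERV_B\t97.0\t400\t12\t0\t1\t400\t200\t599\t1e-140\t650",
    "EST2\tHERV_B\t99.0\t300\t3\t0\t1\t300\t50\t349\t1e-120\t550",
    "EST3\tHERV_A\t80.0\t50\t10\t0\t1\t50\t1\t50\t0.5\t30",
]
example_tissues = {"EST1": "brain", "EST2": "testis", "EST3": "brain"}
tissue_sets = group_hits_by_tissue(example_hits, example_tissues)
```

Each tissue-specific set keeps all candidate HERVs for an EST, so the ambiguity in the matching is left for the model to resolve rather than being decided at this stage.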

Our method estimates the relative activities of the HERVs in each tissue. Several HERV sequences exhibit tissue-specific expression. The expression patterns of these HERVs can later be verified with laboratory methods; by contrast, an exhaustive search for active individual HERVs with laboratory methods would be too expensive.

[1] Merja Oja, Jaakko Peltonen, Jonas Blomberg, and Samuel Kaski. Methods for estimating human endogenous retrovirus activities from EST databases. BMC Bioinformatics 2007, 8(Suppl 2):S11.