Phylogenetic Inference from Conserved Sites Alignments

William Noble Grundy
Gavin J. P. Naylor

Journal of Experimental Zoology. 285(2):128-139, 1999.


Molecular sequences provide a rich source of data for inferring the phylogenetic relationships among species. However, recent work indicates that even an accurate multiple alignment of a large sequence set may yield an incorrect phylogeny, and that the quality of the phylogenetic tree improves when the input consists only of the highly conserved motif regions of the alignment. This work introduces two methods of producing multiple alignments that include only the conserved regions of the initial alignment. The first method retains conserved motifs, whereas the second retains individual conserved sites in the initial alignment. Using parsimony analysis on a mitochondrial data set containing nineteen species whose phylogenetic relationships are widely accepted, both conserved alignment methods produce better phylogenetic trees than the complete alignment. Unlike any of the nineteen inference methods used previously to analyze these data, both methods produce trees that are completely consistent with the known phylogeny. Of the two, the motif-based method employs far fewer alignment sites for comparable error rates. For a larger data set containing mitochondrial sequences from 39 species, the site-based method produces a phylogenetic tree that is largely consistent with known phylogenetic relationships and that suggests several novel placements.
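
The abstract does not say how a site is judged to be conserved, so the following Python sketch only illustrates the general idea of a conserved sites alignment: keep the columns of an existing multiple alignment that pass a conservation test and discard the rest. The criterion used here (the most common non-gap residue must appear in at least `min_fraction` of the sequences), the default threshold, and the function names `conserved_sites` and `filter_alignment` are all assumptions made for illustration, not the procedure used in the paper.

```python
from collections import Counter


def conserved_sites(alignment, min_fraction=0.8, gap="-"):
    """Return the indices of alignment columns judged conserved.

    Assumed criterion (not taken from the paper): a column is conserved
    when its most common non-gap residue occurs in at least
    min_fraction of all sequences.
    """
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")

    keep = []
    for col in range(length):
        # Count residues in this column, ignoring gap characters.
        counts = Counter(seq[col] for seq in alignment if seq[col] != gap)
        if not counts:
            continue  # column consists only of gaps
        top_count = counts.most_common(1)[0][1]
        if top_count / len(alignment) >= min_fraction:
            keep.append(col)
    return keep


def filter_alignment(alignment, columns):
    """Build a reduced alignment containing only the given columns."""
    return ["".join(seq[c] for c in columns) for seq in alignment]


if __name__ == "__main__":
    # Toy alignment, invented for illustration only.
    toy = ["ACGT-", "ACGA-", "ACTT-", "ACGTA"]
    sites = conserved_sites(toy, min_fraction=0.75)
    print(sites)
    print(filter_alignment(toy, sites))
```

The reduced alignment would then be passed to a tree-building program in place of the complete alignment. Raising `min_fraction` keeps fewer, more strongly conserved sites.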