DNA-protein alignment

John Trueman trueman at RSBS.ANU.EDU.AU
Sat Oct 16 18:09:31 CDT 1999

Dear Taxacomers,

Back to TAXonomic COMputation for one moment if I may.

It often happens that I have a PROTEIN alignment that I'm reasonably confident
makes functional sense (helices in the right place, etc.) but I want to
apply parsimony and/or likelihood models to the ALIGNED DNA, same
alignment, to better estimate the phylogeny/functional convergence.

In principle it is easy to align a DNA sequence to its protein alignment:
expand the protein character data from one position per aa to three positions
per aa, then slide the DNA sequence along until what appears under each aa
is one of the triplets that code for that aa.  However, the genetic codes
contain redundancy and there are three possible reading frames, so this is
not quite a trivial problem.
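
To make the "expand and slide" idea concrete, here is a minimal sketch in
Python. It assumes the standard genetic code, a coding sequence that is
already in frame with introns removed, and "-" as the gap character; the
name thread_codons is hypothetical and not taken from any existing package.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def thread_codons(aligned_protein, cds, gap="-"):
    """Expand each aligned residue to its codon and each gap to three gaps.

    Assumes `cds` starts at the first codon and contains no introns.
    """
    out = []
    pos = 0  # current position in the ungapped coding sequence
    for aa in aligned_protein:
        if aa == gap:
            out.append(gap * 3)
            continue
        codon = cds[pos:pos + 3].upper()
        # Check that the triplet under this residue really codes for it.
        if CODON_TABLE.get(codon) != aa.upper():
            raise ValueError(f"codon {codon!r} at {pos} does not code for {aa!r}")
        out.append(codon)
        pos += 3
    return "".join(out)


print(thread_codons("M-KL", "ATGAAACTG"))
```

Because each residue is checked against the triplet placed under it, a
wrong reading frame or a mismatched sequence fails loudly rather than
producing a silently shifted alignment.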

I've been aligning small numbers of sequences by hand, but it is a
cumbersome process. To save me re-inventing the wheel, does anyone have or
know of a utility which will
1. input a protein alignment in one of the commonly used formats (e.g.,
Nexus, Fasta, Clustal),
2. input the corresponding DNA sequences, possibly along with leading and
trailing non-coding regions and introns, and
3. align the second to the first?
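
For step 3, the reading-frame part of the problem can be sketched as
follows, again in Python and again under assumptions: the standard genetic
code, "-" as the gap character, and flanking non-coding sequence only. The
sketch does NOT handle introns, and locate_cds is a hypothetical name.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate complete codons from position 0; unknown codons become X."""
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X")
                   for i in range(0, len(dna) - 2, 3))


def locate_cds(dna, aligned_protein, gap="-"):
    """Return the stretch of `dna` that codes for the ungapped protein.

    Tries each of the three forward reading frames in turn. Introns are
    not handled: the coding region must be contiguous.
    """
    protein = aligned_protein.replace(gap, "").upper()
    dna = dna.upper()
    for frame in range(3):
        hit = translate(dna[frame:]).find(protein)
        if hit != -1:
            start = frame + 3 * hit  # nucleotide index of the first codon
            return dna[start:start + 3 * len(protein)]
    raise ValueError("protein not found in any forward reading frame")


print(locate_cds("ccATGAAACTGTAAgg", "M-KL"))
```

The coding stretch returned here is in frame, so it can then be laid under
the protein alignment codon by codon.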


John Trueman

John Trueman
Faculties Research Fellow
Bioinformatics Group
Research School of Biological Sciences
Australian National University
Canberra, ACT 0200,  AUSTRALIA

ph: +61 2 6249 4840
fax: +61 2 6279 8525
email: trueman at rsbs.anu.edu.au

Reason is a tool. Try to remember where you left it.
