[Taxacom] Morphology vs Molecular

bti at dsmz.de
Wed Aug 19 01:47:27 CDT 2009


Bob,
With increasing numbers of prokaryote genomes becoming available, it is  
becoming apparent both what one can do with such data sets and where  
the problems lie.

The question of indels is interesting because it depends on your  
approach. If you base your evaluation on similarity values you have to  
work out whether you include these regions in the analysis - one often  
removes them. However, these could be major evolutionary events with  
much greater effects than single base changes. Does treating the whole  
of the indel as a fifth base do justice to its "significance"?
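
To make the difference concrete, here is a minimal sketch in Python of how a pairwise similarity value shifts with the treatment of gap columns. The function name `similarity`, the three mode names and the toy sequences are all hypothetical and chosen only for illustration. It assumes two sequences that are already aligned, with "-" marking gaps. The "event" mode, which counts each contiguous indel once, is just one simple option and not a standard recommendation.

```python
GAP = "-"


def similarity(seq_a, seq_b, gaps="ignore"):
    """Fraction of identical columns between two aligned sequences.

    gaps="ignore": drop every column in which either sequence has a gap.
    gaps="fifth":  keep gap columns; '-' is compared like a fifth base.
    gaps="event":  as "fifth", but a contiguous run of columns with the
                   same gap pattern is counted as one column (one indel).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")

    columns = list(zip(seq_a.upper(), seq_b.upper()))

    if gaps == "ignore":
        columns = [(a, b) for a, b in columns if GAP not in (a, b)]
    elif gaps == "event":
        collapsed = []
        previous = None  # gap pattern of the indel currently being read
        for a, b in columns:
            pattern = (a == GAP, b == GAP)
            is_gap_column = any(pattern)
            if is_gap_column and pattern == previous:
                continue  # same indel continues; already counted once
            collapsed.append((a, b))
            previous = pattern if is_gap_column else None
        columns = collapsed
    elif gaps != "fifth":
        raise ValueError("gaps must be 'ignore', 'fifth' or 'event'")

    if not columns:
        raise ValueError("no columns left to compare")

    matches = sum(a == b for a, b in columns)
    return matches / len(columns)


# Toy alignment (made up): a single four-base deletion in seq_b.
seq_a = "ACGTACGTAC"
seq_b = "ACG----TAC"

for mode in ("ignore", "fifth", "event"):
    print(mode, round(similarity(seq_a, seq_b, mode), 3))
```

For this toy pair the three treatments give 1.0, 0.6 and roughly 0.857. Removing the region makes the sequences look identical, the per-column fifth state counts one deletion four times, and the per-event count sits in between. None of the three says anything about how much biological weight the indel deserves.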


Brian

Quoting Bob Mesibov <mesibov at southcom.com.au>:

> Another thing to be clear about is the meaning of a molecular  
> character when looking at raw sequence data, as opposed to looking  
> at well-understood fragments, whole genes or other higher-category  
> entities.
>
> If you have a widespread sequence of, say, 20 bases with no known  
> indels, you can be very confident that the characters are the  
> *positions* in that sequence, 1-20. That is, the *positions* are  
> homologous. At each position the character state is a base, so for  
> DNA will be A, T, G or C.
>
> If you have indels, which are very common in most of the widely  
> used, longer sequences, two issues arise wrt identifying characters.  
> The first is how you align sequences from different sources, because  
> different multiple sequence alignment procedures (whether carried  
> out first, or as part of direct optimisation) can give you different  
> positional homologies. [It was interesting to see in that  
> staphylinid paper recommended by Stephen Thorpe that the authors did  
> separate analyses based on Clustal and MAFFT alignments. AFAIK this  
> kind of catholic approach to alignment is rare. Most labs seem to  
> pick their MSA method and stick with it.]
>
> The second issue is how you treat gaps in your analysis after  
> alignment. You can ignore them entirely, and this amounts to  
> character weighting because an indel is an evolutionary novelty.  
> Alternatively, you can treat a gap as a fifth character state.  
> Someone more familiar with the molecular phylogeny literature than I  
> am may be able to say how often analyses are done both ways, and the  
> results compared.
>
> 'Ignore third codon' weighting for coding sequences can be avoided  
> by doing an analysis of the amino acid sequence in its entirety. I'm  
> not sure whether enough proteins are known yet to allow AA analyses  
> to be useful at all taxonomic levels. There are also wonderful  
> surprises lurking in the 'proteome'. I used to think (as a  
> non-molecular taxonomist) that histone H3 was a very highly  
> conserved nuclear protein with wonderful base-level variety. A few  
> weeks ago I learned that H3 paralogy is ...um ... a problem.
> --
> Dr Robert Mesibov
> Honorary Research Associate
> Queen Victoria Museum and Art Gallery, and
> School of Zoology, University of Tasmania
> Home contact: PO Box 101, Penguin, Tasmania, Australia 7316
> (03) 64371195; 61 3 64371195
> Website: http://www.qvmag.tas.gov.au/mesibov.html



Dr.B.J.Tindall
DSMZ-Deutsche Sammlung von Mikro-
organismen und Zellkulturen GmbH
Inhoffenstraße 7B
38124 Braunschweig
Germany
Tel. ++49 531-2616-224
Fax  ++49 531-2616-418
http://www.dsmz.de
Director: Prof. Dr. Erko Stackebrandt
Local court: Braunschweig HRB 2570
Chairman of the management board: MR Dr. Axel Kollatschny

DSMZ - A member of the Leibniz Association (WGL)




