[Taxacom] Morphology vs Molecular

Wed Aug 19 15:41:53 CDT 2009

Alignment seems to be a procedure that imposes overall similarity: the
homology of an individual base is determined as a byproduct of the
overall compromise between the theorized significance of the number of
gaps and the number of substitutions. I think this is one aspect of
molecular analysis where primitive retention cannot be empirically
excluded from the data.
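
To make the gap-versus-substitution compromise concrete, here is a minimal
sketch of a global (Needleman-Wunsch style) alignment with a linear gap
cost. The function name, the two toy sequences and the cost values are all
made up for illustration; real alignment programs use more elaborate
scoring schemes.

```python
def global_align(a, b, mismatch, gap):
    """Minimum-cost global alignment; a match costs 0, gaps cost `gap` each."""
    n, m = len(a), len(b)
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i * gap
    for j in range(1, m + 1):
        cost[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = cost[i - 1][j - 1] + (0 if a[i - 1] == b[j - 1] else mismatch)
            cost[i][j] = min(sub, cost[i - 1][j] + gap, cost[i][j - 1] + gap)

    # Trace back from the corner to recover one optimal alignment.
    i, j = n, m
    top, bottom = [], []
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            step = 0 if a[i - 1] == b[j - 1] else mismatch
            if cost[i][j] == cost[i - 1][j - 1] + step:
                top.append(a[i - 1])
                bottom.append(b[j - 1])
                i -= 1
                j -= 1
                continue
        if i > 0 and cost[i][j] == cost[i - 1][j] + gap:
            top.append(a[i - 1])
            bottom.append("-")
            i -= 1
        else:
            top.append("-")
            bottom.append(b[j - 1])
            j -= 1
    return "".join(reversed(top)), "".join(reversed(bottom)), cost[n][m]


# Hypothetical toy sequences: the second is the first shifted by one base.
seq_a = "ACGTACGT"
seq_b = "CGTACGTA"

# Gaps expensive relative to substitutions: no gaps, eight substitutions.
expensive_gaps = global_align(seq_a, seq_b, mismatch=1, gap=5)
# Gaps as cheap as substitutions: two gaps, no substitutions.
cheap_gaps = global_align(seq_a, seq_b, mismatch=1, gap=1)

print(expensive_gaps)
print(cheap_gaps)
```

The same two sequences give either eight "substitutions" at eight
"homologous" positions or seven identical positions flanked by two gaps,
depending only on the costs chosen beforehand.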

John Grehan 

> -----Original Message-----
> From: taxacom-bounces at mailman.nhm.ku.edu [mailto:taxacom-
> bounces at mailman.nhm.ku.edu] On Behalf Of Bob Mesibov
> Sent: Tuesday, August 18, 2009 7:23 PM
> To: TAXACOM
> Cc: Richard Zander
> Subject: Re: [Taxacom] Morphology vs Molecular
> 
> Another thing to be clear about is the meaning of a molecular character
> when looking at raw sequence data, as opposed to looking at
> well-understood fragments, whole genes or other higher-category entities.
> 
> If you have a widespread sequence of, say, 20 bases with no known indels,
> you can be very confident that the characters are the *positions* in that
> sequence, 1-20. That is, the *positions* are homologous. At each position
> the character state is a base, so for DNA will be A, T, G or C.
> 
> If you have indels, which are very common in most of the widely used,
> longer sequences, two issues arise wrt identifying characters. The first
> is how you align sequences from different sources, because different
> multiple sequence alignment procedures (whether carried out first, or as
> part of direct optimisation) can give you different positional homologies.
> [It was interesting to see in that staphylinid paper recommended by
> Stephen Thorpe that the authors did separate analyses based on Clustal and
> MAFFT alignments. AFAIK this kind of catholic approach to alignment is
> rare. Most labs seem to pick their MSA method and stick with it.]
> 
> The second issue is how you treat gaps in your analysis after alignment.
> You can ignore them entirely, and this amounts to character weighting
> because an indel is an evolutionary novelty. Alternatively, you can treat
> a gap as a fifth character state. Someone more familiar with the molecular
> phylogeny literature than I am may be able to say how often analyses are
> done both ways, and the results compared.
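
The two gap treatments described above can be compared on a toy alignment.
This is only a sketch: the function name and the three aligned sequences
are hypothetical, and a simple count of pairwise differences stands in for
a full phylogenetic analysis.

```python
def count_differences(x, y, gap_as_fifth_state):
    """Count differing positions between two aligned sequences.

    If gap_as_fifth_state is False, any position with a gap in either
    sequence is skipped (gaps ignored). If True, '-' is compared like
    any other character state.
    """
    if len(x) != len(y):
        raise ValueError("aligned sequences must have equal length")
    differences = 0
    for p, q in zip(x, y):
        if not gap_as_fifth_state and "-" in (p, q):
            continue
        if p != q:
            differences += 1
    return differences


# Hypothetical alignment: taxa 1 and 3 share a two-base gap.
taxon1 = "ACG--TA"
taxon2 = "ACGTTTA"
taxon3 = "ACG--TC"

ignored = {
    "1-2": count_differences(taxon1, taxon2, gap_as_fifth_state=False),
    "1-3": count_differences(taxon1, taxon3, gap_as_fifth_state=False),
    "2-3": count_differences(taxon2, taxon3, gap_as_fifth_state=False),
}
fifth_state = {
    "1-2": count_differences(taxon1, taxon2, gap_as_fifth_state=True),
    "1-3": count_differences(taxon1, taxon3, gap_as_fifth_state=True),
    "2-3": count_differences(taxon2, taxon3, gap_as_fifth_state=True),
}

print(ignored)
print(fifth_state)
```

With gaps ignored, taxon 1 is identical to taxon 2 and one step from
taxon 3; with gaps as a fifth state, taxon 1 is two steps from taxon 2 and
still one step from taxon 3. Note that in this naive fifth-state count a
two-base indel is scored as two separate changes.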
> 
> 'Ignore third codon' weighting for coding sequences can be avoided by
> doing an analysis of the amino acid sequence in its entirety. I'm not sure
> whether enough proteins are known yet to allow AA analyses to be useful at
> all taxonomic levels. There are also wonderful surprises lurking in the
> 'proteome'. I used to think (as a non-molecular taxonomist) that histone
> H3 was a very highly conserved nuclear protein with wonderful base-level
> variety. A few weeks ago I learned that H3 paralogy is ... um ... a
> problem.
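
As a small illustration of the amino acid approach, the sketch below
translates two coding sequences with the standard genetic code and compares
them at both levels. The two nine-base sequences are invented for the
example, and the helper names are not from any particular package.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order; '*' marks stops.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate an in-frame DNA coding sequence to one-letter amino acids."""
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def differing_positions(x, y):
    """Return the 1-based positions at which two equal-length strings differ."""
    return [i + 1 for i, (p, q) in enumerate(zip(x, y)) if p != q]


# Hypothetical sequences that differ only at third codon positions.
dna1 = "GCTCTGAAA"
dna2 = "GCCCTAAAG"

protein1 = translate(dna1)
protein2 = translate(dna2)
base_differences = differing_positions(dna1, dna2)
aa_differences = differing_positions(protein1, protein2)

print(protein1, protein2)
print(base_differences, aa_differences)
```

Here all three base differences fall on third codon positions and are
synonymous, so they disappear at the amino acid level without any explicit
down-weighting of third positions.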
> --
> Dr Robert Mesibov
> Honorary Research Associate
> Queen Victoria Museum and Art Gallery, and
> School of Zoology, University of Tasmania
> Home contact: PO Box 101, Penguin, Tasmania, Australia 7316
> (03) 64371195; 61 3 64371195
> Website: http://www.qvmag.tas.gov.au/mesibov.html
> 
> _______________________________________________
> 
> Taxacom Mailing List
> Taxacom at mailman.nhm.ku.edu
> http://mailman.nhm.ku.edu/mailman/listinfo/taxacom
> 
> The Taxacom archive going back to 1992 may be searched with either of
> these methods:
> 
> (1) http://taxacom.markmail.org
> 
> Or (2) a Google search specified as:
> site:mailman.nhm.ku.edu/pipermail/taxacom  your search terms here