More on the 'cladistics' of sequences

Mon Jun 7 15:18:42 CDT 2004

Herb Jacobson wrote:

> I don't see "easier" as objective and I don't see how "degrees of
> closeness" is objective, let alone "feel confident."
>
> "This sometimes works, sometimes doesn't - depends on what outgroup
> sequences you have at your disposal."
>
> "Sometimes it works"? Come on, what are your objective criteria?

You're right, Herb, I didn't make that sound very objective at all. It
has crossed my mind over the last few weeks that 'thinking
theoretically about' and 'actually doing' phylogenetic analyses
arrive at the same sorts of understanding about phylogeny
reconstruction, but from opposite directions. Hence, in doing a manual
sequence alignment (or correcting an obviously misaligned set of
sequences produced by a program like Clustal) over and over again, one
always asks oneself "Am I doing this objectively?" - especially when
people who have never done it themselves are leaning over one's shoulder
claiming that one is being subjective and asking "How can you possibly
know whether a base goes 'here' or 'there'?" The fact that a great many
phylogenetics papers state in their methods that sequences were 'aligned
using ClustalX and then manually adjusted' gives me confidence that
humans can align sequences (make hypotheses of homology) in a way that
seems objective to them - otherwise it would be a dirty secret just
waiting to be exposed by evil dissenters from (semi)modern phylogenetic
methods ;) - which I don't think it is, of course...

In writing about 'degree of closeness' and 'sometimes it works', what I
really meant was that I can recognise patterns in my alignment whereby
outgroup taxa share many nucleotides at the same positions in the
alignment as ingroup taxa. There will almost always be taxa in the
ingroup that are more similar to the outgroup (i.e. closer to the
outgroup on an unrooted tree), so I take an iterative approach to
alignment:

(i) do a quick alignment first using a computer program like ClustalX;
(ii) construct a phylogeny (any method will do) based on this alignment;
(iii) re-order the taxa in my alignment based on the branching order in
the phylogeny;
(iv) use my own innate ability to recognise patterns (analogous in a way
to group recognition in ordination) to adjust base positions in the
alignment so as to minimise character state changes (i.e. if there is
more than one place to put a base, put it in the place where it will be
parsimony-uninformative).
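
To make the rule in step (iv) concrete, here is a minimal Python sketch
of how one might test whether an alignment column is
parsimony-informative, using the usual definition (at least two
different states, each present in at least two taxa). The function
names, the toy sequences and the choice to treat '-', '?' and 'N' as
missing data are my own assumptions for illustration, not part of any
particular program.

```python
from collections import Counter

def is_parsimony_informative(column, missing="-?N"):
    """True if at least two states each occur in at least two taxa.

    Assumption: gaps and ambiguity codes listed in `missing` are ignored.
    """
    counts = Counter(b.upper() for b in column if b.upper() not in missing)
    return sum(1 for n in counts.values() if n >= 2) >= 2

def informative_sites(alignment):
    """Return 0-based indices of parsimony-informative columns.

    `alignment` is a list of equal-length aligned sequences (strings).
    """
    return [i for i, col in enumerate(zip(*alignment))
            if is_parsimony_informative(col)]

# Two ways of aligning the short sequence GAC against GATC (toy data).
gap_at_T = ["GATC", "GATC", "GA-C", "GA-C"]
gap_at_A = ["GATC", "GATC", "G-AC", "G-AC"]

print(informative_sites(gap_at_T))  # []
print(informative_sites(gap_at_A))  # [2]
```

Under the rule above, the first placement would be preferred, because
the second manufactures a shared T/A difference at column 2 that is only
an artefact of where the gap was put.
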

Then I re-do the analysis, re-order the taxa again based on the new
tree, and repeat this process until the minor changes I make to the
alignment no longer change the tree I get. By this time, I am looking
for shared characters in the alignment that might turn out to be
synapomorphies when the tree is generated and rooted. I ask the
question "Is there a way that I can re-align this character so that it
is no longer shared by the taxa that have it?". If the answer is "No"
then I'd say that was a good character and I'd get all excited about
it. If the answer is "Yes" then I'd fiddle around some more until I was
either happy that the alignment was as conservative as I could get it,
or decided to flag that region as 'ambiguous' and exclude it from the
analysis.

Is all this 'fiddling' necessary? Definitely -
anyone who has seen the output of a crude Clustal alignment will know
that even two base-identical sequences can get aligned differently -
especially if one is a different length to the other. So Clustal makes
glaring misalignments, which are easy (and quite therapeutic,
actually!) to 'fix'. In the end, a phylogeny is only as good as the
alignment it was made from.

In making these iterative changes to the alignment, one finds that some
groups of taxa stay together all the time in the resulting phylogenies,
while other taxa move around easily. The taxa that stay together are
generally the ones that are going to get bootstrap support, and the
ones that move around are the ones that don't, and these are often
separated from their neighbours by short (or long) branches. Sometimes
it is the outgroup taxa that move around (relatively, that is) on the
tree, other times it is some ingroup taxa. One can't rely much on the
relationships of the labile taxa, only the stable ones, so there is
generally no conflict in my mind about how 'good' my phylogeny is - it
is as good as the branches that are well supported. My measure of
objectivity is how stable those branches are in the phylogeny, and this
is generally reflected in either the conflict (or lack of it) among
equally parsimonious trees, or in the amount of branch support they
get, or both.
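
As a rough illustration of what I mean by 'stable', the sketch below
tallies how often each group of taxa turns up across the trees from
successive rounds of re-alignment. It is only a sketch under stated
assumptions: trees are written as nested Python tuples of taxon names,
groups are read off as rooted clades (for unrooted trees one would
compare bipartitions instead), and the function names and toy trees are
made up for the example. It is not a substitute for bootstrap or other
branch support measures.

```python
from collections import Counter

def clades(tree):
    """Return the set of non-trivial clades in a nested-tuple tree.

    Each clade is a frozenset of taxon names; leaves are plain strings.
    """
    found = set()

    def walk(node):
        if isinstance(node, str):
            return frozenset([node])
        leaves = frozenset().union(*(walk(child) for child in node))
        found.add(leaves)
        return leaves

    all_taxa = walk(tree)
    found.discard(all_taxa)  # the clade holding every taxon is uninformative
    return found

def clade_stability(trees):
    """Fraction of the given trees in which each clade appears."""
    counts = Counter()
    for tree in trees:
        counts.update(clades(tree))
    return {clade: n / len(trees) for clade, n in counts.items()}

# Toy trees from three hypothetical rounds of alignment adjustment.
rounds = [
    ((("A", "B"), "C"), ("D", "E")),
    ((("A", "B"), "D"), ("C", "E")),
    ((("A", "B"), "C"), ("D", "E")),
]

for clade, freq in sorted(clade_stability(rounds).items(),
                          key=lambda kv: (-kv[1], sorted(kv[0]))):
    print(sorted(clade), round(freq, 2))
```

Here the pair A+B appears in every tree, while C, D and E swap places
between rounds, so only A+B would count as stable in the sense above.
The same `clades` function gives a simple stopping test for the
iteration: stop when two successive trees yield identical clade sets.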

Cheers, David.

Dr David Orlovich,
Senior Lecturer in Botany.

Department of Botany,
University of Otago,
P.O. Box 56,
(Courier: 464 Great King Street)
Dunedin,
New Zealand.

Phone: (03) 479 9060
Fax: (03) 479 7583

Web: http://www.botany.otago.ac.nz/

Ecology, Conservation and Biodiversity Research Group:
http://www.otago.ac.nz/erg/

Botanical Society of Otago: http://www.botany.otago.ac.nz/bso/