Assumptions and reliability of solutions

Mon Mar 14 16:20:34 CST 2005

The difference between a cladist and a non-cladist can be summarized in the
following manner:

We want modern cladistic and statistical phylogenetic methods using
molecular data to solve problems discovered during traditional studies.

Many phylogenetic studies assert they have solved problems with new, copious
data and new ways of dealing with them.

But there are unaccounted-for assumptions associated with analyses using
maximum parsimony or likelihood, or Bayesian Markov chain Monte Carlo
software.

I've gathered together, for an article I'm writing, about 35 assumptions
commonly ignored in modern analyses. Molecular studies are correct and
reliable only if the sequence alignment is unambiguous, the results are
robust to all reasonable evolutionary models, all software implementations
agree, the results do not change with different outgroups or ingroups or
genes or amounts of data, there is no sample error, the identifications are
all correct, the reagents are not contaminated, the gap costs are fully
justified, and, oh, lots more. Some assumptions are obvious and occasionally
addressed in the literature, some are inapplicable for some genes, some are
cryptic and poorly understood even as problems, and some are only minor
problems.

We have only 5% probability to work with if we accept a 95% probability as
good enough to assert that a branch arrangement is reliable. The probability
that the arrangement is correct must be multiplied by the probability that
each and every assumption is true.

Suppose that, of the 35 assumptions, 10 are relevant (or less than totally
correct) and each is 0.999 correct: there is one chance in 1000 that a
branch arrangement of interest (as some solution to a taxonomic or
evolutionary problem) will be different if the assumption is not true. The
reliability measure given in the cladogram must then be reduced by about
1 percent (10 times 1/1000; the exact product, 0.999 to the tenth power, is
about 0.990).
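
The arithmetic in the paragraph above can be checked with a minimal sketch.
It assumes the assumptions are independent, so that their probabilities
simply multiply; that independence, and the function name
combined_reliability, are illustrative choices made here and are not taken
from any phylogenetics package.

```python
def combined_reliability(support, p_assumption, n_assumptions):
    """Reported branch support multiplied by the probability that all of
    n assumptions hold (hypothetical helper; assumes independence)."""
    return support * p_assumption ** n_assumptions

# Ten assumptions, each 0.999 likely to be harmless, branch support of 1.0.
exact = combined_reliability(1.0, 0.999, 10)

# The "10 times 1/1000" shortcut used in the text.
shortcut = 1.0 - 10 * (1 / 1000)

print(f"exact product:   {exact:.4f}")
print(f"linear shortcut: {shortcut:.4f}")
```

The two figures agree to about three decimal places, which is why adding
the small error rates is a fair shortcut when each one is tiny.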

Suppose you have a branch arrangement presented by the software with 100
percent reliability (say, given 100% Bayesian posterior probability). Up to
ten assumptions, each with a 1/200 chance of being incorrect, reduce the
reliability to as low as about 95 percent. More than 10 assumptions at
1/200, or 10 assumptions each with more than a 1/200 chance of being
incorrect, reduce the reliability to less than 95%.
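
As a rough check on that threshold, the sketch below counts how many
assumptions a branch can carry before its reliability falls under 95
percent. It assumes the assumptions are independent and equally risky, and
the name max_assumptions is hypothetical, introduced only for this
illustration.

```python
def max_assumptions(support, p_assumption, threshold=0.95):
    """Largest number of independent assumptions, each true with
    probability p_assumption, that keeps support >= threshold.
    Hypothetical helper; independence is itself an assumption."""
    if not 0.0 < p_assumption < 1.0:
        raise ValueError("p_assumption must lie strictly between 0 and 1")
    n = 0
    # Keep adding assumptions while the product stays at or above threshold.
    while support * p_assumption ** (n + 1) >= threshold:
        n += 1
    return n

# A branch reported at 100% support, each assumption wrong 1 time in 200.
limit = max_assumptions(1.0, 1 - 1 / 200)
print(limit)  # 10

# A branch reported at exactly 95% has no room for any such assumption.
print(max_assumptions(0.95, 1 - 1 / 200))  # 0
```

With a 1/200 risk per assumption the limit comes out at ten, matching the
figure in the text; an eleventh assumption pushes the product under 95
percent.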

The difference between the cladist and the non-cladist is an apprehension of
the former that the product of the probabilities of all assumptions being
correct is always more than 95 percent (more towards 100 percent), and the
decision of the latter that it is less than 95 percent.

If the product of the probabilities of all assumptions being correct is
generally less than 95 percent, then, generally, statistical phylogenetics
has never reliably solved a problem.

In my opinion, doubtless problems have been solved, but I'd like to see such
solutions explained with reference to all relevant assumptions, otherwise we
don't know which problems have been solved and which solutions are reliable.

Assumptions have always been the bane of scholars, and traditional
systematists struggle with their own set of assumptions, yet the assumptions
of modern methods of phylogenetic analysis seem to be so easily addressed
and so critical that it is amazing that they are sloughed off with the
phrase "conditional on the data and model."

I think the anger of non-cladists is that the onus of dealing with
problematic assumptions is laid on them.

______________________
Richard H. Zander
Bryology Group, Missouri Botanical Garden
PO Box 299, St. Louis, MO 63166-0299 USA
richard.zander at mobot.org
Voice: 314-577-5180;  Fax: 314-577-9595
Websites
Bryophyte Volumes of Flora of North America:
http://www.mobot.org/plantscience/bfna/bfnamenu.htm
Res Botanica:
http://www.mobot.org/plantscience/resbot/index.htm
Shipping address for UPS, etc.:
Missouri Botanical Garden
4344 Shaw Blvd.
St. Louis, MO 63110 USA

-----Original Message-----
From: Thomas Lammers [mailto:lammers at UWOSH.EDU]
Sent: Saturday, March 12, 2005 9:22 PM
To: TAXACOM at LISTSERV.NHM.KU.EDU
Subject: Re: [TAXACOM] Latin names versus scientific names [was: So much
for nomenclatural stability]

----- Original Message -----
From: Karl Magnacca <kmagnacca at WESLEYAN.EDU>
> What is it about cladistics (and phylogenetics more broadly) that
turns people into savage beasts?<

Do you mean, what gets non-cladists so sore at cladists?  I'll tell you:
their insistence that unless you do things their way, you are not doing
science.  That is the most awful imprecation one can hurl at a scientist --
that what you are doing is Not Science.  Nothing else could cut to the quick
so terribly.

Tom Lammers