[Taxacom] cladistics (was: clique analysis in textbooks)
Richard Zander
Richard.Zander at mobot.org
Sat Aug 20 12:36:30 CDT 2011
I think taxacomers who lack thorough training in phenetic analysis, which is most of us, figure that clustering means grouping from a data matrix of taxa by variables, to which some similarity algorithm is then applied. Thus, Sergio is correct that an instant similarity or distance tree is different from a parsimony tree, in terms of what we have been told, i.e., that phenetics and parsimony are different.
On the other hand, I took a tutorial course (3 days) in clustering techniques (didn't learn much, of course) at a meeting of the Classification Society, taught by the then president of the Society and Pierre Legendre. I asked, ahem, if parsimony was a clustering technique. The two glanced at each other furtively, then opined that indeed parsimony is a clustering technique. Thus, authority says it is.
Yes, parsimony does calculate a bunch of distance trees and selects recursively (I think) the shortest tree, because finding the shortest tree is NP-hard, i.e., no known algorithm is guaranteed to find an exact solution in polynomial time. So...does the fact that we have to do heuristic sampling to get any sort of tree make parsimony not clustering? I think this is what this thread is about.
Surely the product is a distance tree based on shortest transformation set?
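To make "shortest transformation set" concrete, here is a minimal sketch of how the length of one candidate tree can be counted with Fitch's small-parsimony procedure. The four-taxon matrix, the taxon names and the function names are all hypothetical, and the three rooted shapes stand in for the three possible unrooted four-taxon topologies. It is an illustration under those assumptions, not how any particular parsimony program is implemented.

```python
def fitch_length(tree, states):
    """Return (possible ancestral states, number of changes) for one character.

    tree   : a leaf name (str) or a pair (left_subtree, right_subtree)
    states : dict mapping leaf name -> observed state for this character
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left_set, left_cost = fitch_length(tree[0], states)
    right_set, right_cost = fitch_length(tree[1], states)
    common = left_set & right_set
    if common:
        # Children agree on at least one state: no extra change needed here.
        return common, left_cost + right_cost
    # Children disagree: one state change is required on this node.
    return left_set | right_set, left_cost + right_cost + 1


def tree_length(tree, matrix):
    """Sum the Fitch change counts over all characters in the matrix."""
    n_chars = len(next(iter(matrix.values())))
    return sum(
        fitch_length(tree, {taxon: row[i] for taxon, row in matrix.items()})[1]
        for i in range(n_chars)
    )


# Hypothetical data: four taxa scored for four binary characters.
matrix = {"A": "0011", "B": "0010", "C": "1100", "D": "1101"}

# The three possible groupings of four taxa, each evaluated as a whole tree.
candidates = [
    (("A", "B"), ("C", "D")),
    (("A", "C"), ("B", "D")),
    (("A", "D"), ("B", "C")),
]
lengths = {tree: tree_length(tree, matrix) for tree in candidates}
best = min(lengths, key=lengths.get)
print(best, lengths[best])
```

Scoring a single tree like this is fast; the expensive part is that the number of candidate topologies grows so quickly with the number of taxa that only a sample of them can be scored.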
* * * * * * * * * * * *
Richard H. Zander
Missouri Botanical Garden, PO Box 299, St. Louis, MO 63166-0299 USA
Web sites: http://www.mobot.org/plantscience/resbot/ and http://www.mobot.org/plantscience/bfna/bfnamenu.htm
Modern Evolutionary Systematics Web site: http://www.mobot.org/plantscience/resbot/21EvSy.htm
-----Original Message-----
From: taxacom-bounces at mailman.nhm.ku.edu [mailto:taxacom-bounces at mailman.nhm.ku.edu] On Behalf Of Bob Morris
Sent: Friday, August 19, 2011 10:53 PM
To: Sergio Vargas
Cc: taxacom at mailman.nhm.ku.edu
Subject: Re: [Taxacom] cladistics (was: clique analysis in textbooks)
On Fri, Aug 19, 2011 at 2:32 PM, Sergio Vargas <sevragorgia at gmail.com> wrote:
"...because clustering can be done (computationally) efficiently
whereas searching for an optimal tree using phylogenetic methods
cannot."
It's fair enough that some or even all biologists might have a usage
of "clustering" that meets all of your explanation, and perhaps even
that this should be agreed to by all of the readership of taxacom. I
wouldn't know. But in statistical pattern recognition and data mining,
not everything called clustering can be done computationally
efficiently. Many techniques those disciplines call clustering are
intractable in the sense that they are NP-hard. Informally, this means
that (with presently understood computational complexity theory)
no known algorithm solves them exactly in time polynomial in the size
of the data, and none exists unless P = NP, just as for optimal tree
induction problems. So I can only understand your text as meaning
"...because clustering as meant by all practicing phylogeneticists can
be done (computationally) efficiently...", and that is why you are
prepared to subsequently say that the rest of your explanation "[...]
is so basic I cannot believe I am explaining it".
I do wonder a little whether in fact all practicing phylogeneticist
readers of taxacom understand by "clustering" only tractable
algorithms.
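As a rough illustration of why exhaustive search is off the table in both settings, the sketch below counts the candidates an exact method would have to consider: the ways to split n items into k non-empty clusters (Stirling numbers of the second kind) and the unrooted binary tree topologies on n labelled taxa, (2n-5)!!. The function names are made up for this example. A large search space does not by itself prove NP-hardness; the counts only show the scale that heuristics have to cope with.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def partitions_into_k(n_items, k):
    """Number of ways to split n_items labelled items into k non-empty clusters."""
    if n_items == k:
        return 1
    if k == 0 or k > n_items:
        return 0
    # The last item either joins one of the k existing clusters
    # or forms a new cluster on its own.
    return k * partitions_into_k(n_items - 1, k) + partitions_into_k(n_items - 1, k - 1)


def unrooted_binary_trees(n_taxa):
    """Number of unrooted binary tree topologies on n_taxa labelled taxa: (2n-5)!!"""
    total = 1
    for odd in range(3, 2 * n_taxa - 4, 2):
        total *= odd
    return total


for n in (5, 10, 20):
    print(n, partitions_into_k(n, 3), unrooted_binary_trees(n))
```

Both counts outrun any feasible enumeration well before n reaches the size of a typical data set, which is why exact versions of either problem are attempted only for small inputs.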
Bob Morris
Robert A. Morris
Emeritus Professor of Computer Science
UMASS-Boston
100 Morrissey Blvd
Boston, MA 02125-3390
IT Staff
Filtered Push Project
Harvard University Herbaria
email: morris.bob at gmail.com
web: http://efg.cs.umb.edu/
web: http://etaxonomy.org/mw/FilteredPush
http://www.cs.umb.edu/~ram
On Fri, Aug 19, 2011 at 2:32 PM, Sergio Vargas <sevragorgia at gmail.com> wrote:
> Hi,
>
> >Clustering is clustering is clustering. Group some things together and
> > you are clustering - however it is done.
>
> No, you are not. Grouping is not clustering; there are many ways to group
> things together that do not involve clustering. Maximum parsimony, maximum
> likelihood and Bayesian analysis are not clustering. It is simply
> incorrect to call these methods clustering. When you run any of
> the above analyses you are not clustering, despite the result being
> something similar to a cluster. If you could reduce phylogenetic
> inference to clustering everything would be so easy (computationally
> speaking) because clustering can be done (computationally) efficiently
> whereas searching for an optimal tree using phylogenetic methods cannot.
> Taxa are only "clustered" (randomly or sequentially) together to build
> the first tree, afterwards entire topologies are evaluated, taxa are not
> clustered. This is so basic I cannot believe I am explaining it.
>
> sergio
>
> --
> Sergio Vargas R., M.Sc.
> Dept. of Earth & Environmental Sciences
> Palaeontology & Geobiology
> Ludwig-Maximilians-Universität München
> Richard-Wagner-Str. 10
> 80333 München
> Germany
> tel. +49 89 2180 17929
> s.vargas at lrz.uni-muenchen.de
> sevra at marinemolecularevolution.org
>
> check my webpage:
> http://www.marinemolecularevolution.org
>
> check my research ID:
> http://www.researcherid.com/rid/A-5678-2011
>
>
> _______________________________________________
>
> Taxacom Mailing List
> Taxacom at mailman.nhm.ku.edu
> http://mailman.nhm.ku.edu/mailman/listinfo/taxacom
>
> The Taxacom archive going back to 1992 may be searched with either of these methods:
>
> (1) by visiting http://taxacom.markmail.org
>
> (2) a Google search specified as: site:mailman.nhm.ku.edu/pipermail/taxacom your search terms here
>