[Taxacom] cladistics (was: clique analysis in textbooks)

Thu Aug 18 20:40:16 CDT 2011

On 17/08/2011 14:41, John Grehan wrote:
> Pierre speaks, of course, authoritatively on such matters,

John, are you kidding?
"cladists" generally insisted that authoritarianism is not acceptable in 
science
I'd like not to be considered an "authority", this is no argument
I can swear I don't consider you an "authority" of any kind (...I 
couldn't resist this one)

> so it was interesting to find myself in some agreement even though I am pretty much an amateur when it comes to the theory of clustering algorithms.

I am now certain that we disagree about the definition of "amateur"
"amateur", at least in French, describes people eager to learn, 
typically reading and re-reading textbooks (they regularly collect 
them), and proud to use the dedicated jargon the standard way, so that 
they can communicate efficiently among themselves and also with 
professionals whenever they get the opportunity

>> it seems to me that the discussions could be clarified by restricting
>>   the use of "cladist" to name the nomenclatural procedure while using
>> "parsimony" for phylogenetic analysis, the more when "cladistics" as a
>> method covers a range of diverse procedures (unweighted or weighted
>> parsimony with possibly different weighting schemes, compatibility
>> "clique" analysis with possibly different thresholds of selection...)
> Brilliant

not brilliant, basic: it's in textbooks and relevant literature (I mean 
really, I am not just pretending that some possible textbooks... 
blah-blah...)
what is brilliant in my view is the recommendation by Peter Hovenkamp 
that people interested in parsimony analysis should read the relevant 
literature first (you called it "crazy", but I presume it means 
"brilliant" in grehanian special jargon - or "kangaroo" maybe, who 
knows? esoterism is unfathomable...)

>> e.g. the debate about Grehanian methods would be formulated in terms
>>   dispensing with the use of the rather vague term "cladistics" -
>> grehanistic algorithm =
>> 1) perform prior character selection by applying a strict compatibility
>> criterion: characters homoplastic in outgroups play no role in the
>> ingroup analysis, only the clique of compatible characters are retained
>> prior to ingroup detailed analysis [except for e.g. thick enamel because
>> it's just so]
> 'Compatible' here is defined as those character states that are compatible with the criterion of being uniquely shared within the ingroup?????

"compatible" is not here defined, it's defined in textbooks and relevant 
literature, it's basic "cladistic" jargon for "compatibility analysis", 
also known as "clique analysis"
your approach is not really "clique analysis" by any standards, because 
you are not analysing your complete data set (all taxa, all characters) 
to find the clades supported by the largest clique of mutually 
compatible and homoplasy-free characters (CI = 1 for the largest clique 
on this topology); rather, you are pre-defining ingroup versus outgroups 
and then you select the clique of characters fitting this elementary 
outgroup/ingroup topology,
so you have selected the series ("clique") of mutually compatible 
characters _possibly_ uniquely derived in your selected ingroup - but 
they are not necessarily "uniquely derived" in the details (see below), 
because they can still appear as homoplastic inside your ingroup, and 
this you will know only after you have performed your ingroup 
analysis using surviving characters, when you observe a suboptimal CI < 
1 (Sergio Vargas also explained at length how enlarging your ingroup 
automatically changes your retained data set; hence possibly the 
topology connecting the taxa of your previous ingroup inside this now 
enlarged ingroup)
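
To make the contrast concrete, here is a minimal sketch of clique analysis run on a complete matrix. It assumes binary characters, for which pairwise compatibility can be checked with the four-gamete test. The matrix and the function names (column, compatible, largest_clique) are hypothetical illustrations, not taken from any program, and the exhaustive search is only practical for a handful of characters.

```python
from itertools import combinations

# Hypothetical binary matrix: rows are taxa, columns are characters 0..4.
MATRIX = {
    "A": "00000",
    "B": "10010",
    "C": "11000",
    "D": "11111",
    "E": "01101",
}


def column(matrix, j):
    """States of character j, in taxon order."""
    return [row[j] for row in matrix.values()]


def compatible(col_a, col_b):
    """Four-gamete test: two binary characters conflict only when
    all four state pairs (00, 01, 10, 11) occur among the taxa."""
    return len(set(zip(col_a, col_b))) < 4


def largest_clique(matrix):
    """Largest set of mutually compatible characters, by exhaustive search.
    For binary characters, pairwise compatibility is sufficient."""
    n_chars = len(next(iter(matrix.values())))
    cols = [column(matrix, j) for j in range(n_chars)]
    for size in range(n_chars, 0, -1):
        for subset in combinations(range(n_chars), size):
            if all(compatible(cols[i], cols[j])
                   for i, j in combinations(subset, 2)):
                return list(subset)
    return []


print(largest_clique(MATRIX))
```

With this toy matrix, characters 1, 2 and 4 form the largest clique, and characters 0 and 3 conflict with its members. The search runs over all taxa and all characters at once, with no prior ingroup/outgroup split.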

> Please explain the 'just so' rejection of thick enamel.

I mean that it's just your opinion (I bet we agree on the meaning of 
"opinion", you very frequently use this term)
your opinion is that thick enamel is good despite being homoplastic in 
outgroups, hence it does not fit your general "potentially 
homoplasy-free" selection criterion, but you retain this character 
anyway - just so (maybe it's what Richard Zander called the part of 
"intuition" in your "method")
you acknowledged in a recent post that you could reject this character 
for this reason, so it seems that at times you are conscious of this 
internal incoherence in your analysis (besides the other incoherence due 
to your character selection criterion depending on the range of your 
ingroup, as noted by Sergio, and above, and below)

>> 2) perform ingroup analysis by applying unweighted parsimony [not
>> compatibility analysis] so that now homoplastic characters can play
>> their role in defining subclades inside the ingroup, while this is
>> denied for outgroups [according to the goose / gander principle].
> What homoplastic characters?

those characters that may finally appear to be not mutually compatible 
and homoplasy-free inside your ingroup, they may show some homoplasy in 
your ingroup, so that you have no possible topological resolution with 
an optimal CI = 1 (despite your pre-selection of characters "restricted 
to the ingroup"); any possible topology inside your ingroup requires 
extra steps, hence some homoplasy (should I define extra steps?... it's 
in all textbooks)
are you OK with "homoplasy"? (convergences, parallelisms, reversals...)
is your model of character evolution rejecting the possibility of 
reversals? (this could explain your character selection procedure, but 
not your internal ingroup analysis procedure: I call this incoherence, 
just like Sergio)
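
For readers who want the "extra steps" spelled out, here is a minimal sketch that counts steps with Fitch optimization on a fixed rooted binary tree and derives the ensemble consistency index, CI = minimum possible steps / observed steps. The tree, the matrix and the function names (fitch_steps, consistency_index) are hypothetical illustrations, and characters are assumed to be unordered.

```python
def fitch_steps(tree, states):
    """Count state changes for one unordered character on a rooted
    binary tree given as nested 2-tuples of taxon names (Fitch)."""
    steps = 0

    def visit(node):
        nonlocal steps
        if isinstance(node, str):          # leaf: its observed state
            return {states[node]}
        left, right = (visit(child) for child in node)
        if left & right:                   # shared state: no change needed
            return left & right
        steps += 1                         # disjoint sets: one extra change
        return left | right

    visit(tree)
    return steps


def consistency_index(tree, matrix):
    """Ensemble CI: sum of minimum steps over sum of observed steps,
    skipping invariant characters."""
    n_chars = len(next(iter(matrix.values())))
    minimum = observed = 0
    for j in range(n_chars):
        states = {taxon: row[j] for taxon, row in matrix.items()}
        n_states = len(set(states.values()))
        if n_states < 2:
            continue
        minimum += n_states - 1
        observed += fitch_steps(tree, states)
    return minimum / observed


# Hypothetical data: outgroup O and ingroup taxa P, Q, R.
MATRIX = {"O": "000", "P": "111", "Q": "110", "R": "101"}
TREE = ((("P", "Q"), "R"), "O")

print(consistency_index(TREE, MATRIX))
```

In this toy example every character has its derived state restricted to the ingroup, yet characters 1 and 2 conflict with each other (one groups P with Q, the other P with R), so one of them needs two steps on any topology and CI stays below 1.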

> I am not tied to any particular analysis. I've used parsimony because it's widely accepted readily accessible and presents a straightforward clustering procedure (with caveats).

congratulations for using the term "parsimony"
now, in jargon, "clustering" is reserved for overall similarity 
analysis, i.e. some phenetic clustering (read Sergio, and textbooks)

now you wrote: "Yes I am asserting that 'my' form of cladistics is 
necessarily
correct - or at least more correct or better than some others." (down below)
this contradicts "I am not tied to any particular analysis" 
(above)
...deliberate play of moving targets and shell games?... sheer 
confusion?... you should know better than me

if you are not tied to a particular analysis, this means that there is 
no single "straightforward" procedure (see my supposedly "brilliant" 
comment above), but strangely enough you are using none of the usual 
known procedures (by the way, what do you mean by "straightforward", 
algorithmically speaking?... not to question what "caveats" precisely...)

I suggest that you use one of the classic procedures (because in the end 
you don't care, if I understand correctly), the most universally used for 
morphology being by far unweighted parsimony - i.e. treating your 
outgroups as you treat your ingroup: no more incoherence

just put all characters in the data matrix, find the optimal unweighted 
parsimonious topology, root between outgroups and putative ingroup (if 
possible, so you are testing the coherence of your ingroup by the way: 
read the PAUP user manual as an elementary textbook for this)

> In principle I would be happy to use three-item analysis, although right now I do not have access to a program to do that.

amazing... nay, fascinating...
may I respectfully suggest that you read the relevant literature this 
time and try to make up your own mind before you do that?
e.g. in "Cladistics": Deleporte, De Laet (better, shot it dead), Farris 
and Kluge, and quite recently Farris again (highly recommendable in my 
view, against pure classificatory formalism and fancy history of science)
this could save you from professional suicide...
(a variant of 3ia appeared as an involuntary reinvention of phenetics, 
great fun - I mean true phenetics = clustering whatever character data 
set on the basis of overall similarity among objects, not grehanian 
esoteric jargon concerning character selection)

>> in summary: grehanistics = compatibility analysis for character
>> selection outside the ingroup [with exceptions] and unweighted parsimony
>> analysis inside the ingroup with the immediate consequence that the
>> retained data set, hence possibly the topology of the ingroup, will
>> mechanically change without biological reasons according to the scope of
>> the analysis (larger or narrower ingroup)
> Don't understand what is meant about how the retained data set will mechanically change without biological reasons with respect to larger or narrower ingroup.

I don't understand that you don't understand this elementary logical point
you wrote yourself in a recent post that enlarging your ingroup would 
make you accept more characters, given your own criterion - if you have 
fewer outgroups and a larger ingroup, you will likely reject fewer 
characters for being homoplastic in outgroups;
at the very limit: retain only one species as outgroup, put all other 
species in your ingroup, and you save all the characters you previously 
rejected: do you get the point?
- not brilliant, elementary - should be in all textbooks, provided that 
they would bother at all with exposing your utterly incoherent method
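
The point can be checked on a toy matrix. The sketch below assumes a simplified reading of the selection criterion: state "0" is the state of the most distant outgroup, and a character is retained only if the derived state "1" never occurs in any outgroup taxon. The matrix and the function name (retained) are hypothetical illustrations.

```python
# Hypothetical matrix: O1..O3 are candidate outgroups, I1..I3 the core ingroup.
# State "0" is assumed to be the state of the most distant outgroup, O1.
MATRIX = {
    "O1": "0000",
    "O2": "0100",
    "O3": "0110",
    "I1": "1101",
    "I2": "1011",
    "I3": "1110",
}


def retained(matrix, outgroups):
    """Indices of characters whose derived state "1" is absent
    from every taxon currently treated as an outgroup."""
    n_chars = len(next(iter(matrix.values())))
    return [j for j in range(n_chars)
            if all(matrix[taxon][j] == "0" for taxon in outgroups)]


# Same data throughout; only the ingroup/outgroup boundary moves.
print(retained(MATRIX, ["O1", "O2", "O3"]))  # narrowest ingroup
print(retained(MATRIX, ["O1", "O2"]))        # O3 moved into the ingroup
print(retained(MATRIX, ["O1"]))              # single outgroup left
```

Under these assumptions the retained set grows from characters 0 and 3, to 0, 2 and 3, to all four characters, although not a single cell of the matrix has changed.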

>> as already noted by John Grehan himself in a recent post this logical incoherence can easily be corrected by applying unweighted
>> parsimony throughout, or possibly compatibility analysis throughout (for
>> amateurs only)
> Where this is done to include all characters, whether or not they are restricted to the ingroup, to my mind results in an analysis of overall similarity rather than cladistic derivation.

listen: given the same data (characters and taxa), suppose your ingroup 
is now enlarged so that only one taxon is left as outgroup, then all 
characters are necessarily "restricted to the ingroup" (to adopt your 
esoteric jargon) - and you can also consider all the intermediary 
possibilities: the larger the ingroup, the smaller the outgroup, and the 
more characters are saved from rejection by your own criterion - do you 
get it?
so, enlarging your ingroup would make you mechanically accept more 
characters, and then you should logically be qualifying your own 
approach as "analysis of overall similarity" given your own definition, 
because you would have introduced these supposedly bad "pheneticiating" 
characters into your analysis, hence supposedly making it "phenetic" 
(when it is parsimony analysis in fact: basic unweighted parsimony for 
the ingroup);
of course your character selection procedure has strictly nothing to do 
with "analysis of overall similarity", as already underlined by Sergio, 
and as it was repeatedly explained to you on this list, an incredibly 
high number of times, and since years now... apparently with no effect 
on your capacity to grasp the concept of overall similarity analysis (= 
nothing to do with character selection)

some reflection on your part about your (implicit) model(s) of character 
evolution could certainly help you to better understand this point
what model supports your character selection procedure? (= rejecting 
homoplastic characters from the analysis)
what model supports your ingroup analysis procedure? (= using unweighted 
parsimony, that keeps internally homoplastic characters playing their 
role as possible synapomorphies at lower levels, what you explicitly 
rejected when applying your character selection procedure)

your shift in optimality criteria implies that evolutionary processes 
are not the same in outgroups as in your selected ingroup
then enlarge your ingroup, all things being equal otherwise, and consider 
what happens to your data matrix, and to evolutionary processes for the 
now included taxa and characters... and possibly to your topology, or at 
least to the relative support of your clades...

> I'm ok with everything being analyzed as a supermatrix of all life, although one would still end up 'selecting' character states found in 'life' rather than 'non life', or on reflection maybe according to Pierre's perspective one could include 'non life' features as well (e.g. various structural arrangements).

very well, but here we are not at all dealing with these well known 
limits of outgroup rooting (= what homologies would you retain among 
life and non-life?...), so this argument is a rhetorical diversion
here we are dealing with elementary precautions anybody can take when 
treating a very restricted branch of the tree of life
varying the phylogenetic range ("level") of your analysis is such an 
elementary precaution; with generally little consequence if you are 
using unweighted parsimony throughout your analysis, outgroups included 
(read Sergio Vargas on this, and all textbooks), but with possibly 
drastic consequences when using your so particular method: because the 
range of your ingroup delineation has a direct impact on your procedure of 
character selection

the simplest way seems to be that you give up inventing your own method and 
drowning deeper and deeper in your logical contradictions (so many 
shots, only two feet...), and rather use a standard parsimony method - 
because you really don't care, hey?...

>> grehanistics is an original method, exposed in no "cladist" textbook,
>> John Grehan's claim to be "cladist" is obviously not informative in
>> itself, and the question "is grehanistics cladist or not cladist" has
>> little interest (what is at stake, after all?)
> Some say I am a cladist, others say I am not. No sleep lost.
we certainly agree that this is no problem in itself (who cares...)
we can also agree that you read no textbooks really (I insist that you 
should)
I'd like us to agree that your using the words "cladist" or 
"phenetic" and now "analysis of overall similarity" in a very unusual way 
has detrimental consequences on the intelligibility of your writings for 
the rest of the scientific community, when it's so easy to use 
words just as in textbooks (as real amateurs very wisely like to do)
I thought that TAXACOM was particularly devoted to devising rules for 
unambiguous scientific communication... of course it requires an 
effort: reading and critically analysing the relevant literature, 
understanding other people's points of view - but this stands at the 
heart of the scientific enterprise, at least I consider that this is 
precisely what I'm paid for.

Pierre

> John Grehan
>
> On 17/08/2011 04:49, Kenneth Kinman wrote:
>> Dear All,
>>         Gee, I am a fan of cladistic analysis (if done correctly), but I
>> never thought that ANY form of cladistics was "necessarily" correct (but
>> a lot that seemed better than John's, although admittedly I have seen
>> some that were worse, even at higher taxonomic levels, and thus more
>> detrimental and regrettably sometimes accepted by far too many).
>>          As for some people having used "refuted" as a synonym of
>> "rejected", whoever they might be, I really doubt that they are
>> restricted to users of US language (as opposed to English language as a
>> whole or even other languages).  In any case, I predict an exclusive
>> orangutan-hominid clade will continue to be both refuted and rejected.
>> It has very clearly been "rejected" by the vast majority, but a small
>> minority still insists that it has not been "refuted".  Anyway, I'm not
>> going to lose any sleep over that one, but I am admittedly still
>> bothered by the question of whether chimps clade exclusively with
>> gorillas or with hominids.  Hopefully we will see some more informative
>> papers on that subject in the near future.
>>         ------a user of "US language",
>>                              Ken Kinman
>> --------------------------------------------------------
>> John Grehan wrote:
>>        Yes I am asserting that 'my' form of cladistics is necessarily
>> correct - or at least more correct or better than some others. And I
>> realize that I am sticking my neck out on that and perhaps setting
>> myself up for a fall - in which case the orangutan evidence will no
>> doubt be refuted (and I am not using that term as a synonym of rejected
>> as often occurs in US language).

-- 
Pierre DELEPORTE
UMR6552 EthoS
Université Rennes 1
CNRS
Station Biologique
35380 PAIMPONT
tel (+33) 02 99 61 81 63
fax (+33) 02 99 61 81 88