corroboration
R. Zander
bryo at PARADOX.NET
Tue Aug 25 10:15:23 CDT 1998
John Trueman wrote:
> Dear Richard,
> Gee, that's a good question. I think agreement will be hard to find, and
> this is because some take the view that corroboration has to do with an
> increase in probability while others use corroboration in the sense Popper
> used it. Popper took great pains to explain that corroboration cannot be
> equated with probability -- or in his words with a 'probability calculus'.
Thank you, John, for responding to my query about corroboration. A
re-examination and re-construction of the foundations of systematics was
started by cladistic phylogeneticists and continued by statistical
phylogeneticists. I hope to demonstrate that optimality criteria are improperly
used for reconstruction of historical events.
How about "to make more certain, to strengthen" as a working definition for
corroboration (see dictionary)? Joint probability does this. If a witness to a
historical event is joined by another, that's corroboration. Is probability
involved? Of course. Popper is wrong if he thinks corroboration does not
involve human expectation.
Example of genuine corroboration: Take a 6-sided and a 12-sided die. Have a
confederate select one at random and roll it once; suppose a "1" comes up. You
would guess the six-sided die was the one used since that's the best bet (a 1
comes up 1/6 of the time with that die vs 1/12 of the time with the
dodecahedron). The confederate rolls the same die again. If a 1 comes up again,
you would have a strengthened expectation that this is a cube, because the
probability of 2 "1"s is 1/6 * 1/6 with a cube, and 1/12 * 1/12 with a
dodecahedron. Probability is involved here. But note that there is no
contradictory evidence here against the cube as a hypothesis.
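
The arithmetic of the dice example can be checked with a short sketch. It
assumes, beyond what is stated above, that both dice are fair and that each
die is equally likely to be picked (a prior of 1/2); the function name is
only illustrative.

```python
from fractions import Fraction


def posterior_six_sided(num_ones, prior=Fraction(1, 2)):
    """Posterior probability that the six-sided die was chosen,
    given that num_ones rolls in a row all showed a "1".

    Assumptions (not from the original text): fair dice, and a prior
    probability `prior` that the six-sided die was selected.
    """
    like_6 = Fraction(1, 6) ** num_ones    # P(all ones | six-sided die)
    like_12 = Fraction(1, 12) ** num_ones  # P(all ones | twelve-sided die)
    return prior * like_6 / (prior * like_6 + (1 - prior) * like_12)


after_one_roll = posterior_six_sided(1)   # Fraction(2, 3)
after_two_rolls = posterior_six_sided(2)  # Fraction(4, 5)
```

Under these assumptions the second "1" raises the expectation from 2/3 to
4/5, which is the strengthening the example describes.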
> Ch X of 'The Logic of Scientific Discovery' begins with a discussion of
> this contrast between those who would "describe theories as being neither
> true nor false, but instead more or less probable" with those who seek
> corroboration. "In my view", says Popper, "the whole problem of the
> probability of hypotheses is misconceived. Instead of discussing the
> 'probability' of a hypothesis we should try to assess what tests, what
> trials, it has withstood ... In brief, we should try to assess how far it
> has been 'corroborated'."
Withstanding trials does not strengthen an argument; it remains just an
argument. One uses the best hypothesis because it is easiest to test,
predictively. This is not the case with retrodiction, where expectation of near
certainty is necessary to have a "reconstruction." Abundant evidence and an
absence of reasonable alternative explanations is my measure of phylogenetic
reconstruction. In the literature, few cladistic or statistical studies have
introduced reasonable reconstructions.
>
>
> In my (JT's) few forays into the philosophy (if you like to call it that)
> of phylogenetic reconstruction I have always used 'corroboration' in
> Popper's sense. To my mind, we can never say that a tree hypothesis is
> 'true'. We only can say:
> 1. that it is the best we could come up with using some specified data:
>
> By 'best' I mean according to some tree-comparison criterion, eg, parsimony
> or likelihood. Both these are reasonable goals: other things equal, we
> would prefer the most parsimonious tree, other things equal, we would
> prefer the tree which is most likely.
>
> 2. that *after* the hypothesis was constructed we subjected it to
> such-&-such *critical* tests, and in so far as it did not fail those tests
> it is corroborated.
>
Again, "best" is relative. Suppose you have a dozen (or a hundred) reasonable
phylogenetic hypotheses that remain after parsimony analysis has eliminated all
that are far too many steps long. One of them is best by some criterion, but of
low probability. For what would you use this best hypothesis, given that there
is high probability that some other hypothesis is true? Do we NEED to pay
attention to the best hypothesis?
> Here, the key words are 'after' and 'critical'.
>
> Re After: A hypothesis cannot be corroborated using the very data from
> which it was constructed. In this I disagree with certain well-known
> cladists who see 'corroboration' of a parsimonious tree merely in the fact
> it is parsimonious. These cladists argue that "the most parsimonious tree
> is the least falsified tree", meaning it is the tree least falsified by
> homoplasy. Out of the set of all possible trees it is the tree least in
> need of protection from falsification by the addition of ad-hoc hypotheses.
No, no. I completely disagree. Eliminating reasonable trees because they are
not the shortest is addition of ad-hoc hypotheses that they are unreasonable
because they are not the shortest.
Example: Suppose you have two suspects in a criminal investigation. There are
two witnesses who say Suspect A did it, and one that says Suspect B did it.
Suppose the number of witnesses is directly proportional to evidence of guilt.
Do we hang Suspect A? Remember that Suspect A has minimum falsifiability
(easier to falsify evidence against Suspect B), maximum likelihood (2 out of 3
witnesses), maximum parsimony (simplest hypothesis), and maximum posterior
probability (2/3)/(2/3 + 1/3) = 0.67. Of course we don't hang him. Now suppose
we have more witnesses, 2 more that say he did it and 1 more that says, no,
Suspect B did it. Again the optimum hypothesis is that Suspect A did it and
optima from both data sets are the same! But there is no real change in
probability, since the additional evidence against A is accompanied by
additional evidence against B. There is no corroboration by two or more data
sets given the same "best" hypothesis when evidence remains contradictory. This
is the case in most phylogenetic analyses.
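
The witness arithmetic can be sketched the same way. This uses only the
assumption stated in the example, that support is directly proportional to
the number of witnesses; it is not a model of witness reliability, and the
function name is hypothetical.

```python
from fractions import Fraction


def support_for_a(witnesses_for_a, witnesses_for_b):
    """Share of the testimony that points to Suspect A.

    Assumption (taken from the example): evidence of guilt is directly
    proportional to the number of witnesses on each side.
    """
    total = witnesses_for_a + witnesses_for_b
    return Fraction(witnesses_for_a, total)


first_data_set = support_for_a(2, 1)      # Fraction(2, 3)
both_data_sets = support_for_a(2 + 2, 1 + 1)  # Fraction(2, 3) again
```

The optimum is Suspect A in both cases, but the proportion does not move,
because the added testimony is contradictory in the same ratio as before.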
> From this true statement they derive a false conclusion, "the most
> parsimonious tree, being least falsified, is the best corroborated". They
> ignore that on a scale of corroboration-refutation every other tree stands
> refuted by having had insufficient ad-hoc assumptions assigned to it but
> the parsimonious tree has had assigned only just sufficient ad-hoc
> assumptions to ensure its bare survival. We might say figuratively it is
> more corroborated than the other trees, but the absolute level of its
> corroboration is precisely zero.
>
> My view is this: First, by our analysis of the data we create a hypothesis
> that the taxa are related in some specified way. From that point on this
> tree hypothesis becomes open to corroboration or refutation. Let us now
> genuinely seek to refute our hypothesis, and to the extent we have tried to
> do this but have failed, the hypothesis has gained corroboration.
Failing to be falsified does not corroborate anything. Even if a hypothesis is
strengthened by addition of evidence for it but not against it, it may remain
weak.
>
>
> Re Critical: The least critical test of a tree hypothesis which I can
> imagine is to demonstrate that yes, this tree is the 'best' tree by our
> prescribed tree selection criterion and using our original data.
I agree, but "best" is not a test.
> At best
> this would corroborate the hypothesis "that we indeed found the most
> parsimonious (or most likely) tree". More critical tests can be designed
> by adding new data and showing that the estimate is not changed.
Not so. More contradictory data that support the same optimum hypothesis at the
low level of probability simply support that low level of probability. There is
also more evidence for suboptimum trees.
> These
> 'more data' can be more characters, more taxa, or something in the nature
> of a logical or probabilistic consequence of this tree but not of others.
> For example, If the hypothesised tree would imply 'this species occurs in
> Australia' and the rival tree(s) would imply 'this species does not occur
> in Australia', and we look for the species in Australia and we find it, the
> first tree is corroborated and the others are refuted by this observation
> of the predicted consequence.
We suddenly switched from contradictory data to non-contradicted data. I agree,
we can get corroboration of a hypothesis with data for it that are
unaccompanied by data against it. (Suppose the data for occurrence in Australia
were accompanied by a statement from another expert saying, no, those
identifications were wrong.)
> Of course, in practice all we can expect
> most of the time are probabilistic statements: the probability of the
> species occurring in Australia will differ depending whether tree 1 or tree
> 2 is true. Our test will offer corroboration according to the different
> probabilities of observing the given consequence if one or the other tree
> is true but will not offer corroboration in any absolute sense.
del here
> Do changes to the taxon set change the tree? Darwin's hypothesis tells us
> the 'true' tree should be impervious to the addition or deletion of taxa, a
> false tree may not. We cannot always add new taxa to our analysis but we
> can at least drop taxa sequentially using a taxon-jackknife technique. If
> the estimated tree is the historically correct tree all that should happen
> is each branch gets pruned then put back. We might summarise our results
> into a jackknife consensus tree to show which parts of our tree survive
> this attempt at corroboration.
Remember that jackknifing and bootstrapping in phylogenetic reconstruction mean
(1) you use maximum parsimony to get a shortest tree (or trees), (2) you
resample the data and apply maximum parsimony to see if subclades appear that
are the same as those in the shortest tree. Note that suboptima are ignored
totally. Given that nature is parsimonious, but not optimally so, an adequate
subsampling procedure would be to (1) find the shortest tree and all trees,
say, 2 steps longer (depending on your evaluation of the characters), then (2)
in all resamples, check all subclades in all trees 0-2 steps longer than the
shortest to check support. You should find that subclades that are not in the
shortest tree are well supported. Resampling using optimality criteria simply
is a complex and obfuscatory way to ignore suboptimal hypotheses.
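
A toy sketch of the resampling procedure proposed above, under heavy
simplifying assumptions that are not in the original text: only four taxa
(A, B, C, D), so there are just three unrooted trees and each tree has a
single internal split; characters are equally weighted and scored with
Fitch parsimony; resampling is a plain character bootstrap. All names are
hypothetical.

```python
import random

# The three unrooted four-taxon trees, each given by its internal split
# as two pairs of taxon indices (A=0, B=1, C=2, D=3).
TREES = {
    "AB|CD": ((0, 1), (2, 3)),
    "AC|BD": ((0, 2), (1, 3)),
    "AD|BC": ((0, 3), (1, 2)),
}


def fitch_pair(x, y):
    """Fitch step: intersection costs 0 steps, otherwise union costs 1."""
    common = x & y
    return (common, 0) if common else (x | y, 1)


def char_steps(char, split):
    """Parsimony steps for one character on the tree with this split."""
    (i, j), (k, m) = split
    left, c1 = fitch_pair({char[i]}, {char[j]})
    right, c2 = fitch_pair({char[k]}, {char[m]})
    _, c3 = fitch_pair(left, right)
    return c1 + c2 + c3


def tree_length(matrix, split):
    return sum(char_steps(char, split) for char in matrix)


def near_optimal_support(matrix, extra_steps, replicates, rng):
    """Fraction of bootstrap replicates in which each split occurs in
    some tree no more than extra_steps longer than the shortest."""
    counts = {name: 0 for name in TREES}
    for _ in range(replicates):
        sample = [rng.choice(matrix) for _ in matrix]  # resample characters
        lengths = {n: tree_length(sample, s) for n, s in TREES.items()}
        shortest = min(lengths.values())
        for name, length in lengths.items():
            if length <= shortest + extra_steps:
                counts[name] += 1
    return {name: count / replicates for name, count in counts.items()}


# Contradictory data: three characters favor AB|CD, two favor AC|BD.
conflicting = [(0, 0, 1, 1)] * 3 + [(0, 1, 0, 1)] * 2
optimum_only = near_optimal_support(conflicting, 0, 200, random.Random(0))
within_two = near_optimal_support(conflicting, 2, 200, random.Random(0))
```

Comparing optimum_only with within_two shows how much support for splits
outside the shortest tree is discarded when only the optimum is retained in
each replicate.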
>
>
> Do changes to our character set change the tree? We might try a character
> jackknife. We might try resampling from the available characters as in a
> nonparametric bootstrap: which of the nodes are supported and which have no
> support (which are corroborated and which are refuted) by this test? If we
> have an explicit model we may try a parametric bootstrap technique.
del here
>
>
> There are many ways of corroborating trees. None of them make the tree any
> more or any less probable.
Corroboration requires strengthening a hypothesis. Only additional information
unaccompanied by contradictory information does this, and it makes the tree
more probable. Corroboration alone does not make a reconstruction, only high
probability, usually involving lack of reasonable alternatives, does.
> (PS: Re your comment about Bremer support: The difficulty with using raw
> Bremer support as an index of corroboration is the same as using raw branch
> length. How do we know, for a given case, whether a Bremer support of
> x-steps is impressive? We must have some null model against which to
> compare.)
Good Bremer support is not corroborative of anything since nothing is
strengthened (unless you assign the species group supported a low prior
probability of monophyly). I don't know what a null model might be in this
case...no support? What is parsimoniously null for Mammalia? Asteraceae? Given
that we are now referring to parsimony studies, in which probability is not
assigned numbers but trees merely grouped as not unreasonably long, I figure
that one must use a combination of taxonomic familiarity with the characters
and how contradictory the evidence is to see if one has a potential
reconstruction ("accepted classification") or not. Only relative lack of
reasonable alternative trees is the measure of success in parsimony
reconstruction of phylogeny. Since there are no absolute guidelines in
parsimony analysis, one might choose to say that groups that do not have
reasonable alternative topologies among all clades 0-3 steps longer than the
shortest seem good bets for probabilistic reconstructions in light of
evolution.
--
Richard H. Zander
Curator of Botany, Buffalo Museum of Science
1020 Humboldt Pkwy, Buffalo, NY 14211 USA
bryo at paradox.net voice: 716-896-5200 ext. 351