Positivism vs Realism

Mon Dec 15 10:25:57 CST 1997

On Mon, 15 Dec 1997, Tom DiBenedetto wrote:

JLW:>
> >       No, I haven't - because I don't know what the result would
> >       mean.
>
> I'll tell you what it would mean. It would provide an indication of
> whether morphological datasets are ever, or very often, of
> questionable significance (by your standards); a question that you
> seem constantly to raise.  You ask me if I would put forward a tree
> which is judged to be indistinguishable from one generated by random
> data; as if this were a pressing concern. And I would like to know
> how often you have found it to be a real concern - in morphological
> studies. As you know, morphological studies are done on datasets
> which represent many years of careful study; on homology hypotheses
> which are well tested in the biological realm. As you also know, many
> morphological systematists tend to see these "tests against
> randomness" to be a little beside the point. Now for molecular
> sequences, where there is not much, beyond alignment, to be
> studied, there is a general sense that perhaps these tests have
> meaning. I think these are very interesting questions.

        I recall that randomization tests for signal revealed
        that most morphological data sets had signal - and that
        the explanation offered was that morphologists
        were doing a good job.  There are some data sets
        where the randomization tests pass a particular
        morphological matrix on the yes/no question of
        significant hierarchical character covariation -
        but for which some tree-independent tests report
        distinct sources of a pathological (misleading)
        incongruence.  Nevertheless, if morphologists
        accept your view that their data are worth
        "testing" by parsimony, it is perplexing why
        they should eschew tests based on significance
        testing.  If they have done their job, then
        their data will pass - and the more critical
        tests their data pass, the more confident they can
        be that the data won't mislead phylogenetic inferences.
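
        For anyone who hasn't seen a randomization test for signal,
        here is a minimal sketch in Python.  It is not the procedure
        of any particular program or published test.  The statistic
        (the number of pairwise-compatible binary characters), the
        toy matrix, and every function name are assumptions chosen
        only to illustrate the logic: shuffle each character's states
        among taxa, recompute the statistic, and ask how often chance
        alone does as well as the observed matrix.

```python
import random
from itertools import combinations

# Hypothetical toy matrix: rows are taxa, columns are binary characters.
# Built by hand so that every pair of characters is nested or disjoint.
TOY_MATRIX = [
    [1, 1, 0, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 0, 1, 1],
    [0, 0, 0, 0, 1, 1],
]


def compatible(a, b):
    """Two binary characters conflict only if all four state pairs occur."""
    return len(set(zip(a, b))) < 4


def count_compatible_pairs(columns):
    """Number of character pairs that could fit a common tree without homoplasy."""
    return sum(compatible(a, b) for a, b in combinations(columns, 2))


def permutation_p_value(matrix, n_perm=999, seed=0):
    """Return (observed statistic, one-sided permutation p-value)."""
    rng = random.Random(seed)
    columns = [list(col) for col in zip(*matrix)]
    observed = count_compatible_pairs(columns)
    hits = 0
    for _ in range(n_perm):
        # Shuffle states within each character independently; this keeps
        # state frequencies but destroys covariation among characters.
        shuffled = [rng.sample(col, len(col)) for col in columns]
        if count_compatible_pairs(shuffled) >= observed:
            hits += 1
    # The +1 counts the observed matrix as one member of the null set.
    return observed, (hits + 1) / (n_perm + 1)
```

        Calling permutation_p_value(TOY_MATRIX) returns the observed
        count and the proportion of shuffled matrices that match or
        beat it.  A small proportion answers the yes/no question only;
        it says nothing about where in the matrix any misleading
        incongruence might sit.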

        Funny you should mention alignments, T.  I'm moving
        on to that question...

> >        The test is designed for the specific case, one
> >       matrix at a time.
>
> so what? I am just wondering what proportion of datasets, considered
> one at a time, are found to be presenting insignificant results.

        What would that tell us, really?  If morphological
        data sets were found to be generally wanting, the reaction
        would be "who needs this new stuff?" rather than "oh -
        ok, now I've learned something more about my data than
        I knew previously..."

        If morphological data sets were almost always found to
        be pathology-free, then the reaction would be "we're
        doing fine - who needs this new stuff?".  My point is that such
        a study wouldn't necessarily be informative about the next data set
        constructed, or the next, or the next...  That's the type of
        induction Popper fought so hard against, and I rarely
        lend any weight to unqualified generalizations.  For instance,
        the t-test requires that a certain distributional assumption
        hold (normality) - but it is also appreciated that the test
        tends to be robust to violations of it.  Analysts nevertheless
        insist that the assumption be tested beforehand, because
        the violation can be problematic.  So the general statement
        "the t-test is robust to violations of the normality
        assumption" is not a license to ignore that assumption in
        any particular case.
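
        To make the t-test point concrete, here is a minimal sketch of
        checking the assumption for the sample in hand rather than
        leaning on the general statement.  The choice of the
        Jarque-Bera statistic is mine, made only because it fits in a
        few lines of standard-library Python; it is an asymptotic test
        and weak for small samples, so treat the function names and
        the 0.05 cutoff as illustrative assumptions.

```python
import math


def jarque_bera(sample):
    """Return (JB statistic, asymptotic p-value) for one sample.

    Assumes the sample is not constant (its variance must be nonzero).
    """
    n = len(sample)
    mean = sum(sample) / n
    # Central moments with divisor n, as in the usual JB definition.
    m2 = sum((x - mean) ** 2 for x in sample) / n
    m3 = sum((x - mean) ** 3 for x in sample) / n
    m4 = sum((x - mean) ** 4 for x in sample) / n
    skew = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3.0
    jb = n / 6.0 * (skew ** 2 + excess_kurtosis ** 2 / 4.0)
    # JB is asymptotically chi-square with 2 degrees of freedom, whose
    # upper-tail probability is exp(-x / 2).
    return jb, math.exp(-jb / 2.0)


def normality_not_rejected(sample, alpha=0.05):
    """True if the JB test gives no grounds to reject normality at alpha."""
    _, p_value = jarque_bera(sample)
    return p_value >= alpha
```

        A symmetric sample such as [-2, -1, 0, 1, 2] is not rejected,
        while a sample of nineteen zeros and a single 100 is.  The
        check is run one sample at a time, which is the point: the
        track record of other samples doesn't settle this one.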

> >        I have found published morphological
> >       data sets for which the test reveals sources of incongruence,
> >       including long edges, but I don't find the number of
> >       instances a very interesting question.  It could be high,
> >       or low.
>
> Why not?? It speaks to the efficacy of the procedures. Something you
> certainly seem interested in discussing a lot.

        Which procedures?  Collecting data?  But see my point about
        generalization above...

> >        A question you didn't ask that I find more
> >       interesting is how many morphological data sets can I find
> >       for which the application of the tests improves the
> >       degree of congruence by pinpointing sources of noise -
> >       but that work is underway.
>
> Improve the degree of congruence by pinpointing noise - I wonder what
> that means, other than what parsimony does. Parsimony of course,
> finds congruence. Noise is incongruence. You are imposing some new
> standard here, right? You are finding a reason to dismiss character
> matches even if they are congruent (i.e. even if they are retained by
> parsimony - else how would you be doing anything different).
> Interesting,,,,,,what if they are real though...?

        It's not dismissive - it's pointing to characteristics and
        interactions among the distribution of states among characters
        (relative to that which can be expected by chance) that bear
        a more critical look than that provided by the indication
        of congruence on the parsimony tree.  In short, it throws
        up a red flag to those apparently robust hypotheses of homology
        that require further investigation - and it also sends out
        a warning that the hypotheses of homology involved cannot
        be trusted to behave themselves in the exercise of
        tree-based inquisitions of congruence.
>
> >        We can't expect all regimes of character
> >       and taxon sampling to yield matrices that are not misleading.
>
> Nor can we expect statistical regularities to inform us when
> particular data points are misleading,,,,,right?

        Pardon?  If you are asking whether statistical methods have
        limitations, the answer is a very loud yes - the first
        principle of methods of inference is that ALL methods
        of inference have limitations.  Most of the time, when
        they are made explicit, they can be turned to great
        advantage when approached with a skeptical mind.  Hence
        the recent move to test for cases when the assumptions of
        phylogenetic inference (leading to the standard bifurcating
        tree) are (apparently) violated.  IMHO, fields can stagnate
        when their methods of inference are accepted uncritically
        and their limitations are unrecognized or denied.  An
        example of this is Popper's warning against misplaced
        faith in formalism and convention.  But limitations are rarely
        fatal, especially when they lead to improvements.

        James LW.

_______________________________________________________________________________

  \  /   /    \  /           JAMES LYONS-WEILER           ______________
   \/   /      \/                                        |..............|
    \  /       /                                         |..............|
     \/       /              DOCTORAL PROGRAM IN         |..............|
      \      /               ECOLOGY, EVOLUTION, AND     |...***........|
       \    /                CONSERVATION BIOLOGY        |..*****.......|
        \  /                                             |.******.......|
         \/                  1000 VALLEY ROAD/186        |********......|
    ______________           THE UNIVERSITY OF            --------------
   | will perform |          NEVADA, RENO
   |  statistical |          RENO, NEVADA 89512-0013
   | phylogenetic |
   | analyses for |         "(Biology) is not religion; if it were, we'd
   |    food      |          have a much easier time raising money."
    --------------                       -Leon Lederman

                EXPLORATORY ANALYSIS OF PHYLOGENETIC DATA
                RASA 2.1 SOFTWARE FOR THE MAC
                Download @
                http://loco.biology.unr.edu/archives/rasa/rasa.html
_______________________________________________________________________________