Positivism in evolutionary science

Thu Dec 4 07:15:36 CST 1997

On Thu, 4 Dec 1997, Tom DiBenedetto wrote:

> If there is no signal, there is no pattern to explain. And there are
> plenty of instances of consistent signals shining through lots of
> noise.

        Not so.  There is always pattern as long as there are
        varying states among taxa.  And even if you could list
        1,000,000 "confirming instances" in which what looks like
        signal comes through, that wouldn't be relevant to whether
        the next data set published has phylogenetic signal or not.
        There is no generalization from induction.

> An organism-wide consistent pattern demands explanation. I dont know
> of any which have been proposed beside descent.

        Differential lineage sorting can do the strangest things.

> >      Parsimony
> >     methods can do no more than provide an overall picture of
> >     the APPARENT signal in the distribution of states among
> >     organisms.  Occasionally, under a restricted set of
> >     conditions, MP will summarize true evolutionary
> >     signal - and sometimes they will not.
>
> What criterion do you use to distinguish when it is and when it
> isnt? How do you know if it ever is?

        By seeking patterns that (1) actually represent SIGNIFICANT
        hierarchy, and that also (2) look like those that don't
        mislead MP and other methods.  This doesn't guarantee accuracy,
        but it sure helps - sometimes a great deal.

>
> >     When homoplasy is localized, the hierarchy in the matrix
> >     is distorted.
>
> The hierarchy in the matrix? That is the pattern in the data,
> mahn,,it is not distorted by the various processes which could have
> given rise to it.

        Pardon?
>
> >     When homoplasy is not rare, the hierarchy in
> >     the matrix is displaced noise - which with varying states
> >     MUST assume (by definition) some type of "pattern".
>
> How do you know that? There are cladograms out there that strike me
> as quite believable even though they have lots of homoplasy. How do
> you know they are not true signal?

        My statement didn't involve cladograms so I don't see the
        relevance of your counterpoint.  I "know" that when
        true convergence occurs (i.e., homoplasy), and when
        varying states occur among organisms, the homoplastic
        states will tend to displace signal.
> >     To say that the pattern in the matrix is reflective of evolutionary
> >     relationships is a MAJOR assumption.
>
> I know some preacher types who would agree with you there!  :)

        Maybe, but this point exists independently of them.

> >     The reason why there
> >     will always be some finite degree of "pattern" in discrete
> >     matrices with varying states is that the matrix is finite;
> >     the state must be distributed this way or that.
>
> Look, would you agree that historical descent does pattern the
> distribution of characters? If yes, then the question is whether the
> noise is SO ubiquitous that it can drown out that pattern. I am sure
> that you can do a simulation in which non-descent similarities are
> programmed in, such that few descent similarities remain. If evolution
> really worked that way, then how do you propose that any method could
> be accurate? How can you model a set of processes when you can never
> know which factor, or set of factors are yielding the pattern?

        First, the problem of noise need not be ubiquitous for it
        to concern us.  If it is misleading for MY data, I'm
        concerned.  Second, I think it is worthwhile to determine
        whether, for any and all discrete matrices, it looks as if
        noise is intruding enough to cause concern.  THAT step
        is the first step that then leads to your next question.
        What can be done about it is now under intense research.
        In some cases, it's simply a matter of unlucky taxon sampling.
        There are a dozen or so techniques that I'm working on
        to answer this question, but not in the context of modelling
        as you have suggested.
>
> >     When the pattern or implied hierarchy in a matrix has not
> >     been caused by informative shared genealogical processes,
> >     parsimony will go right on and summarize the major emergent
> >     pattern.
>
> true
>
> >      I have focused on independent means to determine
> >     when and if the patterning in the distribution of states
> >     among organisms looks like it (1) indicates a significant
> >     degree of hierarchy ,
>
> which is not the issue here,,,,
>
> >     and if (2) the amount and type of implied hierarchy is the type that does not
> >     mislead algorithms that lead to trees.
>
> why dont you just take up some morphological studies? :) Seriously,
> the more the field of molecular phylogenetics proceeds, the more
> problematical this system of endlessly cycling states seems. Sequence
> data is inherently unhierarchical, and I sense that it really will
> turn out to be the least useful data source we have.
> Of course I admit that there is lots of it!

        These are the types of generalization that can get us into
        trouble.  Sequence data CAN be hierarchical, and it
        very often is.  Sometimes it isn't.  We should know about
        it either way.  And sometimes adding more data can make
        problems worse, especially with localized homoplasy
        (long edges).
>
> >     Parsimony will reflect the preponderant or major pattern - and
> >     it is hit or miss as to whether that pattern is, in fact,
> >     predominantly reflective of shared genealogical relationships,
> >     or something else.
>
> Hit or miss? Sorry, but that does imply the existence of causal
> factors which can produce patterns consistent on the same scale as
> does descent.

        Yes, I suppose it does.  But we inherited the descent
        with modification paradigm from Darwin, not just the
        "descent paradigm".  The causal factors in contrast
        then are "descent" and "modification".  How much
        of it is the former, and how great the latter, will
        affect the amount of signal that we find in a matrix.
>
> >      That something else includes a mix of signal caused by
> >     shared genealogical,
> >      noise caused by common genealogy (organisms inherit homoplasy for
> >     deeper relationships from their progenitors),
> >      high rates of anagenetic evolution between speciation events,
> >      non-binary fission of species,
> >      non-independent anagenesis among characters in the same lineage,
> >     inheritance of ancestral polymorphism,
> >     concerted evolution (among lineages),
> >     high frequencies of convergence (adaptive or random)
> >     relative to the frequency of non-convergence,
> >     unequal distribution of homoplasy among lineages,
>
> Nice list. Here is not the time or place to argue each one. Some are
> legitimate sources of error, many I would argue are not necessarily
> confounding at all. But in general, unless many of these factors were
> all working in a concerted way, I dont  think you would get false
> patterns on the scale of the descent pattern.

        It's a matter of degree... any one of these, in sufficient
        dose, can inject enough noise to affect even morphological
        patterns.
>
> >     In light of all these possible sources of noise, the central tenet
> >     of phylogenetics should not be that evolution is, in general
> >     or in detail, a parsimonious process.
>
> OOhhh, low blow James. I *know* that you are sufficiently well-read
> in the subject to know that parsimony is not justified by an
> assumption of "parsimonious evolution", and never has been.

        I know that folks say so, but they certainly act as though
        MP is consistent.
>
> >      It is, as I indicated in
> >     the second post in this particular thread, positivistic to presume
> >     that parsimony will result in an accurate accounting of which
> >     characters correctly indicate homology that indicates shared
> >     relationships, rather than the alternative, namely, that
> >     parsimony may be misled by "something else".
>
> Parsimony approaches are not  focussed on "accuracy", for that is
> unknowable. They are focussed on finding the order which is prevalent
> in the data.
> What standard external to pattern analysis (or the process models
> which explain patterns) do you propose to turn to in order to assess
> the accuracy of phylogenetic results?

        There's the rub.  I want to perform pattern analysis,
        just not with MP alone.

James