Linked or associated characters

Fri Jul 8 07:29:57 CDT 2005

Hi Robert,

In the literature on numerical taxonomy (sensu lato - includes cladistics),
users are advised to not use characters that are logically linked and caution is
advised when characters are highly correlated (= associated).  Numerical
correlation/association can be easily checked.
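
As a rough illustration of such a check, here is a minimal sketch in Python. It assumes the characters are binary (or ordered) and coded as numbers; the taxon names, the matrix and the helper `pearson` are made up for the example and are not from any particular package. For unordered multistate characters a contingency-table measure would be more appropriate than a correlation coefficient.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equally long lists of numeric codes."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        # A character that never varies has no defined correlation.
        raise ValueError("constant character")
    return sxy / sqrt(sxx * syy)

# Hypothetical taxon-by-character matrix (rows = taxa, columns = characters).
matrix = {
    "taxon1": [0, 1, 0],
    "taxon2": [0, 1, 1],
    "taxon3": [1, 0, 0],
    "taxon4": [1, 0, 1],
}

# Pull out each character as a column across all taxa.
char_a, char_b, char_c = (list(col) for col in zip(*matrix.values()))

print(pearson(char_a, char_b))  # A and B always take opposite states
print(pearson(char_a, char_c))  # A and C vary independently here
```

In this toy matrix characters A and B are perfectly (negatively) correlated, while A and C are uncorrelated.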

However, a correlation of +/- 1.00 doesn't mean that the characters are
necessarily linked to each other - they may be linked to some other causative
factor which may not be reflected in your data matrix.  Some authors make it a
practice to eliminate one or the other of two (or more) highly correlated
characters to avoid, at least in multivariate analyses, the problem of
over-determination.  But this needs to be considered carefully because, in such
analyses, it is the fact that there are correlations among some variables
that allows us to detect patterns in the data.

I'm not sure how, from a cladistic standpoint, two perfectly associated
characters are dealt with.  My advice is to delete one - it shouldn't change the
tree topology but will affect the summary statistics (e.g., CI, RI).  If two
characters are less than perfectly associated, I would keep both.
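
For what it's worth, perfectly associated pairs can be found mechanically before deciding what to delete. The sketch below is only an illustration: it treats two characters as perfectly associated when their states map one-to-one across all taxa, and the matrix and the function names `perfectly_associated` and `redundant_pairs` are hypothetical. Whether to drop one character of such a pair remains a judgment call.

```python
from itertools import combinations

def perfectly_associated(col_a, col_b):
    """True if every state of A always co-occurs with exactly one state of B,
    and vice versa (a one-to-one mapping between observed states)."""
    pairs = set(zip(col_a, col_b))
    return len(pairs) == len(set(col_a)) == len(set(col_b))

def redundant_pairs(rows):
    """Return index pairs (i, j) of characters that are perfectly associated.
    `rows` is a list of taxa, each a list of character states."""
    columns = list(zip(*rows))
    return [
        (i, j)
        for i, j in combinations(range(len(columns)), 2)
        if perfectly_associated(columns[i], columns[j])
    ]

# Hypothetical matrix: characters A, B, C scored for four taxa.
rows = [
    [1, 2, 2],
    [1, 2, 4],
    [3, 4, 2],
    [3, 4, 4],
]

print(redundant_pairs(rows))  # index pairs that are candidates for pruning
```

Here A=1 always goes with B=2 and A=3 with B=4, so the pair (0, 1) is reported, while C combines freely with both and is not. Note that two invariant characters would also be reported, trivially, as a pair.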

These decisions are largely dependent on the knowledge of the person conducting
the analyses; having a clear understanding of the characters and their
relationships is critical.

Cheers,

Dick

Robert Mesibov wrote:

> One thing that's always puzzled me about character analysis, say in
> cladistics, is how you tell whether two characters are independent, or
> instead are linked or associated.
>
> By "linked" I mean inseparably paired in a developmental program or a
> metabolic pathway, so that when character A is in state 1, character B is
> invariably in state 2. By "associated" I mean that if an organism has A=1,
> it's heavily selected to have B=2. It theoretically could have A=1 and B=4,
> but this would be functionally a bad move and the 1/4 forms get eliminated
> before we see them.
>
> It's easy enough to tell that A and B are independent if you can find
> organisms showing A=1 B=2; A=1, B=4; A=3 B=2; and A=3 B=4. But it seems to
> me that this would be a relatively rare situation, and I sometimes see
> analyses (mainly of invertebrate morphological characters, since those are
> the papers I read) in which the characters listed seem to me to be ones
> which are likely to be linked in development.
>
> What happens when an analysis is based in part on non-independent
> characters? How do the outputs of the different phylogenetic software
> programs vary as you "dilute" the information content of the taxon-character
> matrix by adding more and more non-independent characters?
>
> Please note that I ask these questions as a cladistic and
> numerical-taxonomic ignoramus, so if there's a basic reference I can be
> directed to, I'd be grateful for the citation. (Nothing I've looked at so
> far deals with these questions).
> ---
> Dr Robert Mesibov
> Honorary Research Associate, Queen Victoria Museum and Art Gallery
> and School of Zoology, University of Tasmania
> Home contact: PO Box 101, Penguin, Tasmania, Australia 7316
> (03) 6437 1195
>
> Tasmanian Multipedes
> http://www.qvmag.tas.gov.au/zoology/multipedes/mulintro.html
> Spatial data basics for Tasmania
> http://www.geog.utas.edu.au/censis/locations/index.html
> ---

--
Richard J. Jensen              | tel: 574-284-4674
Department of Biology      | fax: 574-284-4716
Saint Mary's College         | e-mail: rjensen at saintmarys.edu
Notre Dame, IN 46556    | http://www.saintmarys.edu/~rjensen