Dichotomy

Tue Mar 28 15:14:06 CST 1995

Mike Dallwitz wrote:

"A contrast is often made between dichotomous keys
(those using only 2-state characters) and polychotomous keys (those allowing
characters with 3 or more states). In your sense, these are both
dichotomous."

If this is true, then I have never seen a dichotomous key, and
all keys would be polytomous (not "polychotomous").  For example, a key to
Jurinea used by Pankhurst as an example of a dichotomous key includes no fewer
than 4 states for the character "shape of capitula" (subglobose vs hemispherical
at line 9 and cylindrical vs obconical at line 13) and includes other
characters with 3 or more states (basal leaves, achene length, ratio
pappus/achene, etc.).

The next issue raised by Mike is more important: should
we record all the terms that are attached to a particular character, and will
this approach lead to a separate state for every taxon? The second question is
easily answered: no, it doesn't (I have checked). For example, many flowers
are white, which means they share the same character state.

I believe the
answer to the first question is yes, absolutely. We have described 1.4 million
species out of the 10 to 100 million that exist. As we move to record all
these new species (before the year 2000 of course), it is obvious that we are
going to find new states. How do I know which one to record? Easy, I record
them all. How do I know which ones are important? If I am working in a genus
where color is used to differentiate species, and I find a specimen with a
hitherto unrecorded color, obviously this character is important (in fact, one
could argue that any character selected for a traditional identification key
is important and that new states probably point to new taxa). Looking at this
question from the point of view of overall similarity, we need to consider as
many characters as possible, with whatever state they have. Finding a new
state will lower the overall similarity coefficient, so that our unknown will
not be 100% similar to any of the existing species. This does not mean that it is a new
species (the taxonomic decision will have to be made by the human user, of
course), but it still is a very important piece of information.
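
To make this concrete, here is a minimal sketch in Python. It assumes a
simple matching coefficient (the fraction of compared characters whose states
agree); the taxa, the characters and the names `similarity` and `rank` are
invented for illustration and are not taken from any existing program.

```python
def similarity(unknown, taxon):
    """Fraction of the characters recorded for both that have the same state."""
    shared = [c for c in unknown if c in taxon]
    if not shared:
        return 0.0
    matches = sum(1 for c in shared if unknown[c] == taxon[c])
    return matches / len(shared)


def rank(unknown, taxa):
    """Return (score, name) pairs, most similar taxon first."""
    scored = [(similarity(unknown, desc), name) for name, desc in taxa.items()]
    return sorted(scored, reverse=True)


# Hypothetical descriptions: "mauve" is a state not yet recorded for any taxon.
taxa = {
    "species A": {"flower color": "white", "leaf shape": "ovate", "habit": "erect"},
    "species B": {"flower color": "blue", "leaf shape": "ovate", "habit": "prostrate"},
}
unknown = {"flower color": "mauve", "leaf shape": "ovate", "habit": "erect"}

for score, name in rank(unknown, taxa):
    print(f"{name}: {score:.2f}")
```

The new state does not block the identification: the unknown is still ranked
against every taxon, but no score reaches 1.0, and that is the signal left for
the user to interpret.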

All this
illustrates another difference between the **tomous keys, which give THE answer,
and a similarity program, which takes all the available evidence and offers a
list of possible answers for the user to ponder.

It is not necessary to
redefine a character when you are just adding a new state (green, yellow,
mauve). Changing `color' to `hue' and `saturation'; or `whether uniformly
colored', `background (or only) color', `secondary color' is something else.
This is no longer adding a state but modifying the character itself. This was
not what I was talking about. Mike also worries about changes between the
boundaries of states. Certainly, this is a problem with any elimination method
but it does not affect similarity provided the character is defined (and
treated) as an ordinal (instead of a nominal) character.
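
A minimal sketch of the difference, in Python. The size classes and both
function names are hypothetical, and scaling the ordinal difference by the
number of steps between the extreme states is only one possible convention.

```python
def nominal_difference(a, b):
    """Nominal treatment: states are either the same (0) or different (1)."""
    return 0.0 if a == b else 1.0


def ordinal_difference(a, b, ordered_states):
    """Ordinal treatment: distance in rank, scaled to the range 0..1."""
    span = len(ordered_states) - 1
    return abs(ordered_states.index(a) - ordered_states.index(b)) / span


# Hypothetical ordered states for a size character.
SIZES = ["short", "medium", "long", "very long"]

# A specimen near the short/medium boundary, recorded in the neighbouring state:
print(nominal_difference("short", "medium"))         # full mismatch
print(ordinal_difference("short", "medium", SIZES))  # one step out of three
```

With the ordinal treatment, moving a boundary shifts a specimen by one step at
most, so the similarity changes a little instead of flipping from match to
mismatch.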

We seem to agree
about the error tolerance: it is for controlling which taxa remain; it
does not give "graceful" degradation.

Number of differences used by INTKEY as
a dissimilarity measure: why not divide this number by the number of
characters used? Surely, 2 differences when you use 2 characters is not the
same as when you use 20? Also, why use only 0 and 1? There is a lot of room
between "identical" and "completely different".
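
The suggestion can be sketched as follows, in Python. This is not how INTKEY
computes anything; `mean_dissimilarity` is a hypothetical name, and the
per-character values are made up for the example.

```python
def mean_dissimilarity(differences):
    """Average of per-character differences, each between 0 and 1."""
    if not differences:
        raise ValueError("no characters were compared")
    if any(d < 0.0 or d > 1.0 for d in differences):
        raise ValueError("each difference must lie between 0 and 1")
    return sum(differences) / len(differences)


# Two differences out of 2 characters versus two out of 20.
two_of_two = mean_dissimilarity([1.0, 1.0])
two_of_twenty = mean_dissimilarity([1.0, 1.0] + [0.0] * 18)

# Graded values instead of only 0 and 1 (made-up numbers).
graded = mean_dissimilarity([0.0, 0.5, 0.25, 1.0])

print(two_of_two, two_of_twenty, graded)
```

The raw count is 2 in both of the first two cases, but once it is divided by
the number of characters used the first unknown is completely different and
the second is nearly identical.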

How do I calculate the "best"
character? That depends on what you mean by "best": most reliable, most
discriminating, easiest to record, or what?

I fail to see how entering
RELIABILITY COLOR,0, is going to affect the character:
 #6. flowers/
     1. white/
     2. blue/
     3. red/
In other words, how is the
computer going to know that this character is describing a color?

Renaud Fortuner
fortuner at math.u-bordeaux.fr