NATURE to save taxonomy!

Thu Jun 6 14:37:34 CDT 2002

> How is the data of taxonomy any less (or more) accessible than the data of
> ecology or molecular biology?  In any field, one has to be ready to do the
> library work needed to background a research project.

The data of taxonomy is _uniquely_ suited to being included in a centralised
repository, unlike the hodgepodge that makes up ecological publications. The
things most relevant to other disciplines coming out of the sciences of
taxonomy and systematics are the taxonomies themselves, and the
relationships they describe.

No other discipline (short of pure biochemistry) allows for its data to be
easily and quickly disseminated in a standardised form: descriptions of
well-recognised characters, and type specimens stored in a particular
location. Those type specimens are now documented visually on the web,
giving *cheap* access to all who can afford web access, including people who
formerly never would have considered looking at type specimens. That
audience rapidly includes an increasing portion of the world's scientific
community. Compare this to the old system of mailing specimens (limiting
access) and/or visiting, at great cost, herbaria and the like (what are
animal collections called?).

> Taxonomy has been declared a "dead" science not by those who realize its
> worth, but by those who dismiss it as "stamp collecting."  Taxonomy has been
> damaged because we as taxonomists have not insisted on that worth, and have
> not made our case to the public--and hence the funding agencies.

It is incumbent on the discipline as a whole to make its work relevant to
other researchers. What I mean by relevant is that it has to inform their
work *and*, most importantly in our highly computerised world, it has to be
ACCESSIBLE electronically at the drop of a hat.

For example, using Google's search engine (http://www.google.com) I am able
to find, in seconds, information on nearly any subject imaginable that 15
years ago would have taken a person months, or perhaps a lifetime. I can hop
on the web and find a long-lost friend from childhood -- here in Canada,
using canada411.sympatico.ca, I could take their last name and initials
(provided they weren't too common) and find out their phone number and where
they live, then use http://maps.yahoo.ca or http://www.mapquest.com to
visualise a map (down to the individual blocks) of exactly where that is.

And, as an example of real-life detective work, the recent capture (and now
conviction) of a former Black Panther accused of hijacking a plane in the
70s occurred because detectives decided to turn to a web search engine to
see what they could pull up. Within minutes of getting the idea they had the
person's address and place of employment (on a case that had been dead for
decades).

Canada411.sympatico.ca and maps.yahoo.ca have no bearing on taxonomy (unless
you're looking for people's phone #s and addresses and need directions on
how to get to them ;), but they demonstrate how the internet (through HTTP)
is *the* most important tool in our professional lives as purveyors of
information, and its significance can only be expected to grow (even my
mother is now using the world wide web ;). But the internet has a *lot* of
noise, and that can cause you to lose the signal. Central repositories like
GenBank and EMBL (the most famous in the biological community) are
invaluable in keeping the noise away from the signal.

Perhaps the most important question I can pose is this:

What state of advancement would molecular biology, and more specifically
molecular-based taxonomy, be at WITHOUT GenBank? If you want to compare your
sequences to those of other organisms, let's say a plant and an
archaebacterium, all you have to do is go to GenBank, plug in your
parameters, and you get your desired sequences *and* literature references.
                                                     ^^^^^^^^^^^^^^^^^^^^^

I can see people losing some autonomy by having a central repository for
information in a discipline, and being apprehensive of such a loss, BUT the
gains from having centralised, RAPID and copyable access to information,
regardless of your expertise as a taxonomist or a literature sleuth,
outweigh any such losses.

Having multiple servers across the web, and keeping all of the copies in
synch is easy, and has been done for years quite successfully, even before
the advent of HTTP on the world wide web. MIT's software repository for
Macintosh (and all its mirrors, updated nightly) is one such repository,
which I've been using since 1993:
<http://hyperarchive.lcs.mit.edu/HyperArchive.html>.

Sincerely, Eric Dunbar.