data sharing

Hugh Wilson wilson at BIO.TAMU.EDU
Mon Dec 7 09:28:01 CST 1998


On  5 Dec 98 at 23:46, Dave Vieglais <vieglais at UKANS.EDU> wrote:

> 1. Audit trails.
> For publicly accessible information, auditing should not be required, and in
> fact, would be extremely difficult to implement in a reliable manner...

I think the point was made by someone that most specimens carry
annotations that constitute an inherent 'audit trail' archive, at
least for relevant info.  If funding becomes available, the next
addition to the herbarium specimen input system available at:

http://www.csdl.tamu.edu/FLORA/input/inputsys.html

will be an option to create an image of the specimen label and any
annotations present. These images could provide, for clients with an
interest, a direct view of the base data.  This option is, however,
complicated by the 'endangered' problem, in that specific (below
county level) location data would be made available.  Since taxa of
this type (in Texas, 28 of ca. 6k vascular plants) represent a tiny
subset, it is a manageable problem via some sort of pre-processing
step using name as a flag.  Given that there is no problem with
county level mapping for critical species and we - at this point in
time - don't have state level maps based on vouchers, the whole
issue seems like a 'non-problem' to me.
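
A minimal sketch of what such a name-flag pre-processing step might
look like follows.  It is not the code of the input system cited
above; the field names, the placeholder taxon name, and the
"withheld" marker are all assumptions made for illustration.

```python
# Hypothetical list of protected names; a real list would come from the
# relevant state or federal listing, not from this sketch.
PROTECTED_NAMES = {"Examplea rara"}

# Hypothetical names for the fields that hold below-county location data.
FINE_LOCALITY_FIELDS = ("locality", "latitude", "longitude")


def redact_record(record, protected_names=PROTECTED_NAMES):
    """Return a copy of a specimen record, blanking fine locality if flagged.

    The taxon name acts as the flag: records for listed names keep their
    county but lose everything below county level.
    """
    flagged = {name.lower() for name in protected_names}
    redacted = dict(record)  # never modify the caller's record
    if record.get("name", "").strip().lower() in flagged:
        for field in FINE_LOCALITY_FIELDS:
            if field in redacted:
                redacted[field] = "withheld"
    return redacted
```

County-level fields pass through untouched, so county mapping would
still work for flagged taxa.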

> 2. Specimen based information systems.
> Any distributed information system being developed using the WWW as the
> backbone (ie. the http protocol) is likely to fail.

Anyone using AltaVista or any other web-based search engine can
quickly demonstrate that this is not true.  Web technologies are
pre-adapted to process and display distributed data and one aspect of
this - full text indexing - provides a simple, fast, non-structured,
and fully open option.  An experimental application for text and
graphic display of multi-herbarium specimen data is available at:

http://www.csdl.tamu.edu/FLORA/ftc/ftphsb.htm

Development has not required imposition of standards or data
structures beyond creation of a simple data exchange format:

http://www.csdl.tamu.edu/FLORA/ftc/ftcffld4.htm

via consensus among those working to computerize their collections,
i.e., those working with the base data *control* the interaction.
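
To illustrate why full text indexing needs no shared data structure,
here is a minimal sketch of an inverted index over free-text specimen
records.  It is not the software behind the application cited above;
the record identifiers and label text are invented sample data, and
the function names are hypothetical.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Split free text into lowercase alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(records):
    """Map each token to the set of record ids whose text contains it."""
    index = defaultdict(set)
    for rec_id, text in records.items():
        for token in tokenize(text):
            index[token].add(rec_id)
    return index


def search(index, query):
    """Return ids of records that contain every token in the query."""
    tokens = tokenize(query)
    if not tokens:
        return set()
    hits = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        hits &= index.get(token, set())
    return hits


# Invented sample records from two hypothetical herbaria, each in its
# own layout; the index only sees the words.
records = {
    "herbA:1": "Quercus stellata; Texas, Brazos County; sandy upland woods",
    "herbB:7": "TX Travis Co. Quercus fusiformis limestone slope",
}
index = build_index(records)
matches = search(index, "quercus brazos")
```

Because the index is built from words alone, each contributor can keep
its own record layout and still be searched alongside the others.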

> There are however, alternative protocols which have been designed
> specifically for distributed access to information resources using the
> internet.  One solid example is the ANSI/NISO Z39.50 standard for
> information retrieval (www.loc.gov/z3950).  It has been used extensively in
> the bibliographic community for a number of years....

We have established that specimen data development and expression
systems are not comparable to the phone book.  Application of this
ancient 'standard' brings up the library card catalog as a model.
Are biological collections data and - more important - biological
collections and staff structured in a way that will allow full
adoption (large and small facilities) of this standard and its
associated (pre-web) infrastructure?  Since this involves
machine-to-machine interactions that must be conducted across a
network that is becoming more congested every day, one wonders if
this potential solution - which certainly appears to be popular with
the U.S. Federal Government (NBII, ITIS, NSF, etc.) - will function to
link data resources (biodiversity collections) that differ, in many
fundamental ways, from the established model (libraries).

> In summary, the technology for the distributed access to specimen data is
> available and will be publicly available early 1999.  Of course, one may
> only access information that is stored in electronic format and so it is
> strongly suggested that efforts are directed towards the far greater problem
> of capturing voucher data and storing it in reliable database systems.  Once
> the data is captured, it can be remotely accessed.  Easily.
>
No argument here, except to suggest that fully web-based technology
is available *now* in the public domain, and has been for some time,
and also that it has been *employed* to create functional prototype
systems that could be scaled to include national and international
participants.

Hugh D. Wilson
Texas A&M University - Biology
h-wilson at tamu.edu (409-845-3354)
http://www.csdl.tamu.edu/FLORA/Wilson/homepage.html



