data sharing
Una Smith
una.smith at YALE.EDU
Fri Dec 4 15:38:13 CST 1998
I (Una Smith) wrote:
>> As I understand the term, "audit trail" has nothing to do with policy
>> decisions,
Peter Rauch wrote:
>On the other hand, I would argue that to make important (environmental,
>for example) policy decisions based on data for which the reliability or
>history is unknown and/or suspect, can be risky, and that the existence
>of audit trails, to allow for some degree of enhanced confidence that
>the data used are (still) valid (or at least, unchanged), is useful. To
>that degree, audit trails and policy _are_ related.
No argument there. I should have said that an audit trail is not related
to policy decisions in the way that Hugh Wilson suggested (in alluding to
Web transaction logs). The essence of an audit trail is the documentation
of provenance of every iota of data in the record. I expect the data I
work with to have fully documented provenance, and I work hard to document
the provenance of the specimens and data that I collect.
A casual user has little need to see the audit trail, might draw the wrong
conclusions from it or misuse it, and in most cases could not make much use of it.
A research user, however, is in effect a curator. And anyone who curates
data (and whose integrity has been established) should have access to the
audit trail.
Perhaps I'm just a busy-body, but I want to have access to the entire
collections records of specimens I study, and to be able to curate those
records to improve the value of my own research. I don't expect to be
permitted to alter the database personally, only to submit corrections
and have them reflected in the database. Unfortunately, sometimes the
impulse to "hide the mess" is so over-developed that the only people who
ever see the entire database record are the persons who enter the data
and the database administrator. Even worse, sometimes the database
consists of nothing beyond "clean" records that are the result of decades
of undocumented editing. Most of us maintain little or no audit trail
on our own databases (beyond a paper trail for the source data), but when
we're curating legacy databases that we expect to persist indefinitely,
the standards of responsible curation are significantly higher.
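To make the idea concrete, here is a minimal sketch of a record that keeps its audit trail: corrections are appended rather than overwriting earlier values, and the "clean" value is simply the most recent entry. The class names, field names, and sample values are all hypothetical, invented for illustration, and not taken from any existing collections database.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class Change:
    """One audit-trail entry for a single field (hypothetical layout)."""
    field_name: str
    new_value: str
    who: str      # person responsible for this value
    when: date
    basis: str    # evidence or source supporting the value


@dataclass
class SpecimenRecord:
    """Hypothetical record whose current values are derived from its history."""
    catalog_id: str
    trail: List[Change] = field(default_factory=list)

    def record(self, change: Change) -> None:
        # Append only: earlier entries are never edited or deleted.
        self.trail.append(change)

    def current(self, field_name: str) -> Optional[str]:
        # The most recent entry for a field is its current ("clean") value.
        for change in reversed(self.trail):
            if change.field_name == field_name:
                return change.new_value
        return None

    def history(self, field_name: str) -> List[Change]:
        # Everything ever recorded for the field, oldest first.
        return [c for c in self.trail if c.field_name == field_name]


# Made-up example data, for illustration only.
rec = SpecimenRecord("EXAMPLE-0001")
rec.record(Change("locality", "Example Creek", "data entry staff",
                  date(1990, 1, 1), "transcribed from specimen label"))
rec.record(Change("locality", "Example Creek, north bank", "visiting researcher",
                  date(1998, 12, 4), "correction submitted from field notes"))

print(rec.current("locality"))       # latest value only
print(len(rec.history("locality")))  # number of entries in the trail
```

A casual user would only ever call something like `current()`, while a research user acting as curator would read `history()` and submit further `Change` entries for review.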
>> and there is a VAST amount of data to audit.
>Indeed.
It is not uncommon for the audit trail to be longer than the "data" part
of a record. For instance, consider the business of assigning an age to
a fossil locality. The official record often says just "76 Ma" or
"Cretaceous", but a carefully written paragraph at the least, and sometimes a
page or more, is needed to document the stratigraphic evidence and inferences
used in assigning this age.
Transcription of legacy data from written records to computer databases
is NOTORIOUS for discarding precious contextual information, such as the
color of ink or type of handwriting. You might not know who wrote some
comment, but you can tell what other comments that same person wrote, by
the handwriting and/or instrument used, or by the type of paper used, or
some other clue.
Una Smith
Department of Ecology and Evolutionary Biology
Yale University
New Haven, CT 06520-8106
una.smith at yale.edu