[Taxacom] taxonomic names databases
Stephen Thorpe
stephen_thorpe at yahoo.co.nz
Thu Sep 1 18:38:25 CDT 2016
I suggest that a crucial issue here is whether or not it is a good idea for these databases to be based on ANYBODY's expertise! They ought simply, I suggest, to be tracking the primary literature using defined protocols which are constant across all taxa. They ought, I suggest, to be designed to be verifiable against the primary literature, not simply "taken on trust". As soon as you involve an active taxonomist, the database becomes a potential platform for them to favour, promote and protect their own taxonomic opinions outside of a peer reviewed context. What we want are experts at tracking and making sense of primary taxonomic literature, whatever groups are involved.
Stephen
--------------------------------------------
On Fri, 2/9/16, Tony Rees <tonyrees49 at gmail.com> wrote:
Subject: Re: [Taxacom] taxonomic names databases
To: "Nico Franz" <nico.franz at asu.edu>, "taxacom" <taxacom at mailman.nhm.ku.edu>
Received: Friday, 2 September, 2016, 11:22 AM
Hi Nico, all,
I have to take issue with Nico's main point here, which seems to be that a database with a higher level of residual errors that can be corrected by "anybody" may be preferable to one with a lower level that is under the control of a "gatekeeper", so to speak, who has sole editing rights. In my experience, at least for the major systems with a track record of scientific scrutiny and continuous effort to improve, the latter tend to be much more reliable than the former: for example, why would I not defer to Bill Eschmeyer's expertise for information on extant fishes, Paul Kirk's on the fungi, Geoff Read's for Annelida, and so on? If I find errors or inconsistencies in their systems' content I simply alert them and, nine times out of ten, receive a prompt and courteous reply and relevant action, as well as appreciation for spotting the error. I do not want carte blanche to edit their systems, and they would probably not appreciate it either!
In any event, no such system is ever perfect, and one would be wise to separately verify any data item considered "crucial" to a planned publication etc. All databases have disclaimers about potential residual errors; one simply has to make a judgement about which are more or less trustworthy or fit for a particular intended use, and where to set the bar beneath which it is simply better to ignore a particular data system as a source of sufficiently trusted information. In reality, most "aggregators" of such data take the best sources they can (a subjective decision) and then hopefully either have a proactive policy of detecting inherited errors - such as inter-dataset comparisons and investigation of discrepancies as revealed, going back to the original literature, and numerous internal data integrity checks - or are at least reactive to improvements as suggested by others. At least that is what I aspire to, and I recognise that it will never be perfect, but it is hopefully still a lot better than no equivalent product (hence "Interim" as the first word in the name of my project, IRMNG).
Just my 2 cents, as ever,
Best - Tony
Tony Rees, New South Wales, Australia
https://about.me/TonyRees
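
As a rough illustration of the kind of proactive inter-dataset comparison described above, the sketch below (Python) loads two hypothetical CSV exports of name lists and flags records whose presence, authorship or family placement disagree, so that each discrepancy can be checked back against the primary literature. File and column names are assumptions made for the sake of the example, not those of any real system.

import csv

def load_names(path):
    """Read a CSV export into a dict keyed by scientific name."""
    with open(path, newline="", encoding="utf-8") as f:
        return {row["scientific_name"].strip(): row for row in csv.DictReader(f)}

def compare(dataset_a, dataset_b, fields=("authorship", "family")):
    """Yield (name, field, value_in_a, value_in_b) for every disagreement."""
    for name, rec_a in dataset_a.items():
        rec_b = dataset_b.get(name)
        if rec_b is None:
            yield (name, "presence", "present in A", "missing from B")
            continue
        for fld in fields:
            if rec_a.get(fld, "").strip() != rec_b.get(fld, "").strip():
                yield (name, fld, rec_a.get(fld, ""), rec_b.get(fld, ""))

if __name__ == "__main__":
    a = load_names("dataset_a.csv")    # hypothetical exports, one row per name
    b = load_names("dataset_b.csv")
    for discrepancy in compare(a, b):
        print(*discrepancy, sep="\t")  # each line is a case to check against the literature
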
On 2 September 2016 at 03:52, Nico Franz <nico.franz at asu.edu> wrote:
> Not all of this discussion is adequately captured if we do not make some qualitative or relative distinction between data quality and trust in data. These two are clearly related but can nevertheless have different pathways in our data environments and point to different means for resolution.
>
> My sense is that in the following situation, many of us will not have to hesitate for long to decide which option is preferable.
>
> 1. A dataset with 99 records that are "good", and 1 that is "bad" (needs "repair"), and to which I have no direct editing access *in the system*, where that system is designed to give me that access and editing power and -credit.
>
> 2. A dataset with 80 records that are "good", and 20 that are "bad", but where the system design is such that I have the right to access, repair, have that action stored permanently (provenance), and accredited to me.
>
> The first dataset is of better quality, but the design tells me that it is unfixable by me. Do I feel comfortable publishing on the 100 records? Actually, not really. Is the act of someone with access fixing that 1 record for me a genuine solution? Also not really, because "good" (quality) is often a function of time, and with time certain aspects of good quality data are bound to deteriorate, and so the one-time fix does not operate at the problem's root.
>
> The second dataset is of worse quality, but in some sense it just tells me what I already know about my specimen-level science, i.e. that if I am lazy or not available to oversee the quality, then there might be issues. I may decide to fix them, or not, depending on the level of quality that I need for a particular intended set of inferences I wish to make. In either case, that is my call, and I will get it to the point where I do feel comfortable publishing. The design of the second system facilitates that, and *that* is why I trust more, not because it has better data.
>
> So then, at the surface this may sometimes look like a discussion about data quality only. It is not. Too many aggregating systems are systemically mis-designed to (not) empower individual experts while preserving a record of individual contributions and diversity of views. Acceptance of a classificatory system, for instance, tends to be a localized phenomenon, even in a regional community of multiple herbaria. Nobody in particular believes in a single backbone. This failure to design appropriately primarily affects trust, and secondarily quality, more so over time. A great range of sound biological inferences are still possible. But so are better designs.
>
> Cheers, Nico
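
Nico's design point (repairs stored permanently, with provenance and credit to the editor, rather than silently overwriting a record) can be sketched as a minimal data structure. The example below, in Python, is illustrative only; the class, field and editor names are assumptions and do not describe any existing aggregator.

from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List

@dataclass
class Edit:
    editor: str          # who gets credit for the repair
    field_name: str
    old_value: str
    new_value: str
    reason: str          # e.g. a citation to the primary literature
    timestamp: datetime = field(default_factory=lambda: datetime.now(timezone.utc))

@dataclass
class Record:
    name: str
    data: dict
    history: List[Edit] = field(default_factory=list)

    def repair(self, editor: str, field_name: str, new_value: str, reason: str) -> None:
        """Apply a correction while keeping a permanent, attributed edit trail."""
        old = self.data.get(field_name, "")
        self.history.append(Edit(editor, field_name, old, new_value, reason))
        self.data[field_name] = new_value

# Hypothetical usage: the record is corrected and the credit is retained.
rec = Record("Aus bus Smith, 1900", {"family": "Xidae"})
rec.repair("A. Curator", "family", "Yidae", "placement per the original description")
print(rec.data["family"], rec.history[0].editor)
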
_______________________________________________
Taxacom Mailing List
Taxacom at mailman.nhm.ku.edu
http://mailman.nhm.ku.edu/cgi-bin/mailman/listinfo/taxacom
The Taxacom Archive back to 1992 may be searched at: http://taxacom.markmail.org
Injecting Intellectual Liquidity for 29 years.