[Taxacom] a looming data conflict crisis in bioinformatics?
Stephen Thorpe
stephen_thorpe at yahoo.co.nz
Sat Nov 20 16:43:59 CST 2010
Orville, you will never EVER convince the world that a flying machine is
even remotely possible ...
>if 90% of the raw data in question are incorrect or fraudulent
well, 90% is a bit of an exaggeration, and it is not so much "incorrect or
fraudulent", but just unverifiable and based on trust (and not easily flagged as
problematic in closed edit databases)
anyway, the wiki point I made was rather secondary, the main point being that
for checklists and things based on trust, it is just too easy to come up with
implausible ad hoc ways of reinterpreting bad data as if it were good data
for example, in a certain current checklist which will be a data provider for
serious "official" databases and feed into GBIF, etc., etc., the author lists
Didymocantha flavopicta McKeown, 1948, without comment, as an endemic N.Z.
species, and fails to mention the endemic D. picta Bates, 1874. Actually,
Didymocantha flavopicta McKeown, 1949 was a replacement name for the Australian
species D. picta McKeown, 1948, which had never previously been reported from
N.Z. These are the facts of the case, but what are we to conclude? It is obvious
to me that the author stuffed up and thought D. flavopicta was a replacement
name for the N.Z. species, but can I prove it absolutely?? No. There are always
some remotely possible ad hoc ways to save the author here ... maybe the type of
D. picta Bates is in fact a different N.Z. cerambycid from what it has been thought
until now to be (and a junior synonym thereof), and the species hitherto called
D. picta Bates in N.Z. is in fact the species now called D. flavopicta, whose
type is a mislabelled N.Z. specimen, not from Australia at all, and the
Australian D. flavopicta sensu all previous authors is something else entirely!
An extreme example perhaps, but illustrative of a general point. For primary
taxonomy, it may never be possible to require verifiability in every case, but
secondary checklists lacking verifiability are extremely problematic and
unnecessary, yet they keep comin' ...
Stephen
________________________________
From: Doug Yanega <dyanega at ucr.edu>
To: TAXACOM at MAILMAN.NHM.KU.EDU
Sent: Sat, 20 November, 2010 10:34:28 PM
Subject: Re: [Taxacom] a looming data conflict crisis in bioinformatics?
Paul Kirk wrote:
>Stephen,
>
>You will never, ever, convince anyone that the future of
>biodiversity information management is by using the 'wiki system' -
>nothing more that a digital equivalent of a piece of paper available
>on the internet. If you need convincing, listen to the inventor of
>the web at the TED
>http://www.ted.com/talks/tim_berners_lee_on_the_next_web.html and
>let us all know why you think he is wrong this time.
I think I can anticipate Stephen's response here, and the point is simple:
Insisting that all we need is more raw data is meaningless if 90% of
the raw data in question are incorrect or fraudulent. The end result
is going to be awfully, awfully confusing.
"The trouble with the world is not that people know too little, but
that they know so many things that ain't so." - Mark Twain
Just consider the battle of two memes:
"Obama is a Muslim" gets 1,090,000 Google hits, but
"Obama is not a Muslim" gets only 89,000.
When the truth is swallowed up by lies, letting some computer
algorithm tell you what to believe on the internet is just asking for
trouble. I'm not so sure Tim is thinking clearly here, unless he can
devise an algorithm that can infallibly detect lies. And, much as you
might hate to admit it, Wikis are very good at filtering out liars,
ignoramuses, and crackpots - and the more people that contribute, the
better that filtering becomes. If you don't believe that, and think
that wikis are "nothing more that [sic] a digital equivalent of a
piece of paper" then you really, truly do NOT understand how wikis
work.
Sincerely,
--
Doug Yanega Dept. of Entomology Entomology Research Museum
Univ. of California, Riverside, CA 92521-0314 skype: dyanega
phone: (951) 827-4315 (standard disclaimer: opinions are mine, not UCR's)
http://cache.ucr.edu/~heraty/yanega.html
"There are some enterprises in which a careful disorderliness
is the true method" - Herman Melville, Moby Dick, Chap. 82