[Taxacom] the hurdle for all biodiv informatics initiatives

Stephen Thorpe s.thorpe at auckland.ac.nz
Wed Feb 17 18:03:41 CST 2010


I support any initiative that actually adds useful content on a regular basis, but my enthusiasm is dampened if it is just one of several similar (and expensive) initiatives out there, and if Wikispecies can do the same thing. If I can take Index Fungorum as a "preview" of a "fully populated GNUB", then I don't see any real advantage to the end user over Wikispecies, except that it might take less work to put data in, if that is somehow done automatically. Index Fungorum is a relatively OK sort of thing to have IMHO, because it has useful content. My real criticism is of other (expensive) initiatives that would simply "integrate" Index Fungorum's data without adding any content of their own, or would simply create thousands of skeleton pages from an automatic "harvest" of IF's names and then claim 60% (or whatever) taxonomic coverage - you know who you are! And I still don't see why we need meaningless identifiers for names (which are already identifiers for taxa).

________________________________
From: Paul Kirk [p.kirk at cabi.org]
Sent: Thursday, 18 February 2010 12:22 p.m.
To: Stephen Thorpe; Wolfgang Lorenz; taxacom at mailman.nhm.ku.edu
Subject: RE: [Taxacom] the hurdle for all biodiv informatics initiatives

but a fully populated GNUB will ... so will you support this initiative?

Paul

________________________________
From: taxacom-bounces at mailman.nhm.ku.edu on behalf of Stephen Thorpe
Sent: Wed 17/02/2010 21:46
To: Wolfgang Lorenz; taxacom at mailman.nhm.ku.edu
Subject: Re: [Taxacom] the hurdle for all biodiv informatics initiatives


Wolfgang Lorenz seems to be making the same point I was trying to make about identifiers for names. Like Wolfgang, I don't see why they have to be meaningless strings of numbers, when we could just normalise the name string, making it a perfectly good identifier for the (nominal) taxon. For example:

Feronia sodalis
urn:lsid:ubio.org:namebank:6755945
Feronia sodalis LeConte 1848
urn:lsid:ubio.org:namebank:6755946
Feronia sodalis
urn:lsid:organismnames.com:name:475805

The only identifier here should be the normalised name string: Feronia sodalis LeConte 1848

If the numeric identifier somehow encoded a taxonomic/nomenclatural history, linking it to other combinations, then that would be an advantage, but it doesn't, does it?
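
To make the point concrete, here is a minimal Python sketch of treating the name string itself as the key. The three records are the ones listed above; the function names and the normalisation rule (collapse whitespace, keep only genus and epithet, drop authorship) are assumptions made for illustration, and they ignore subgenera, subspecies and homonyms.

```python
from collections import defaultdict

# (name string, LSID) pairs copied from the example above.
RECORDS = [
    ("Feronia sodalis", "urn:lsid:ubio.org:namebank:6755945"),
    ("Feronia sodalis LeConte 1848", "urn:lsid:ubio.org:namebank:6755946"),
    ("Feronia sodalis", "urn:lsid:organismnames.com:name:475805"),
]


def binomen_key(name_string):
    """Hypothetical normalisation: keep genus + epithet, drop authorship."""
    tokens = name_string.split()  # split() also collapses repeated whitespace
    if len(tokens) < 2:
        raise ValueError(f"not a binomen: {name_string!r}")
    genus, epithet = tokens[0], tokens[1]
    return f"{genus.capitalize()} {epithet.lower()}"


def group_by_binomen(records):
    """Map each normalised binomen to the LSIDs that were issued for it."""
    groups = defaultdict(list)
    for name_string, lsid in records:
        groups[binomen_key(name_string)].append(lsid)
    return dict(groups)


if __name__ == "__main__":
    for binomen, lsids in group_by_binomen(RECORDS).items():
        print(binomen, len(lsids))
```

Under that assumed rule, all three LSIDs fall under the single key "Feronia sodalis"; the author and year would then be stored alongside the key rather than minted as a separate record.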

________________________________________
From: taxacom-bounces at mailman.nhm.ku.edu [taxacom-bounces at mailman.nhm.ku.edu] On Behalf Of Wolfgang Lorenz [faunaplan at googlemail.com]
Sent: Wednesday, 17 February 2010 9:46 p.m.
To: taxacom at mailman.nhm.ku.edu
Subject: [Taxacom] the hurdle for all biodiv informatics initiatives

Dear taxacomers,
all those initiatives, Wikispecies no less than the alliance of GBIF, EoL,
CoL, etc., are still in preparation and not yet in full operation, so it
seems we should be patient, especially since they take very different
approaches and roles.
What we can make out, at this early stage, is that the taxonomic names problem
is probably THE major hurdle for all those projects! Roderic Page's defense of
DOIs for publications is quite understandable to me, but I have doubts when
he writes:
> Lastly, imagine if we had similar services for the other things we care
> about, such as taxonomic names ..

Taxonomic names are not just things like other things, but even if we take
them as abstracted "name objects" only, can we not study plenty of examples
out there to see the massive problem?
And isn't this a major reason why many taxonomists are so skeptical about
what biodiversity informatics has been doing so far?
Take the Nearctic beetle Cyclotrachelus sodalis (LeConte 1848) as just one
of many examples: six different generic combinations (objective synonyms)
have been used for it. These six names are listed as separate species in the
following name-aggregators, with a total of 18 (!!) different LSIDs so far:

1.) Original binomen:
Feronia sodalis
urn:lsid:ubio.org:namebank:6755945
Feronia sodalis LeConte 1848
urn:lsid:ubio.org:namebank:6755946
Feronia sodalis
urn:lsid:organismnames.com:name:475805

Subsequent generic combinations:
2.)
Eumolops sodalis Le Conte, 1848
urn:lsid:ubio.org:namebank:1044312
Eumolops sodalis
urn:lsid:ubio.org:namebank:2936230

3.)
Evarthrus sodalis
urn:lsid:ubio.org:namebank:7006280
Evarthrus sodalis sodalis
urn:lsid:ubio.org:namebank:7053291
Evarthrus sodalis
urn:lsid:organismnames.com:name:137618
Evarthrus sodalis LeConte 1848
urn:lsid:organismnames.com:name:4284542

4.)
Pterostichus sodalis
urn:lsid:ubio.org:namebank:5814327
Pterostichus sodalis LeConte 1848
urn:lsid:ubio.org:namebank:5814328
Pterostichus sodalis Lec.
urn:lsid:organismnames.com:name:2319489
Pterostichus sodalis LeConte 1848
urn:lsid:organismnames.com:name:1424845

5.)
Cyclotrachelus sodalis
urn:lsid:ubio.org:namebank:1666855
Cyclotrachelus sodalis
urn:lsid:organismnames.com:name:1437096

6.)
Abax sodalis Leconte 1848
urn:lsid:ubio.org:namebank:548823
Abax sodalis
urn:lsid:ubio.org:namebank:2725681
Abax sodalis Leconte 1848
urn:lsid:catalogueoflife.org:taxon:de56c742-29c1-102b-9a4a-00304854f820:ac2009

Why do we need such identifiers, and who can take control of them???
Instead of machine-only-readable identifiers, which are obviously "out of
human control" in so many examples, we could have perfectly stable, unique
and readable Name Strings for each available name, registered and resolvable
in a future ZooBank:

ZS-Feronia_sodalis
ZS-Feronia_sodalis/Eumolops_sodalis
ZS-Feronia_sodalis/Evarthrus_sodalis
ZS-Feronia_sodalis/Pterostichus_sodalis
ZS-Feronia_sodalis/Cyclotrachelus_sodalis
ZS-Feronia_sodalis/Abax_sodalis

Together with such standardized name strings, ZooBank should store
information that belongs to a name but does not form part of it, like
author+date, page numbers, type information, grammar, etc.
And the idea is that projects like GBIF, EoL, etc. could build upon these
human-readable identifiers by just adding something like a usage instance
number. E.g., when GBIF has parsed occurrence records for
"Cyclotrachelus sodalis Lec." and "Abax sodalis", it can assign
human-readable ID-strings, perhaps in the following format:

ZS-Feronia_sodalis/Cyclotrachelus_sodalis#2345
ZS-Feronia_sodalis/Abax_sodalis#324

With such standardized strings it should be much less of a problem for
humans AND computers to know what's in those names. This is not the solution
for ALL problems, of course, but a solid nomenclatural basis could bring us
a huge step forward, IMHO ... or am I missing something?
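
The proposed string format can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the "ZS-" prefix, the "/" and "#" separators and the underscore come from the examples above, while the function names, the rule that an original binomen gets no "/" part, and the parser are hypothetical and not part of any existing ZooBank or GBIF service.

```python
def zs_token(binomen):
    """Turn 'Genus epithet' into 'Genus_epithet' (assumes exactly two words)."""
    genus, epithet = binomen.split()
    return f"{genus}_{epithet}"


def zs_name_id(original, combination=None):
    """Build the name string: original binomen, then the later combination."""
    base = "ZS-" + zs_token(original)
    if combination is None or combination == original:
        return base
    return f"{base}/{zs_token(combination)}"


def zs_usage_id(original, combination, instance):
    """Append a usage instance number, as an aggregator might do."""
    return f"{zs_name_id(original, combination)}#{instance}"


def parse_zs_id(identifier):
    """Split an ID string back into its readable parts."""
    if not identifier.startswith("ZS-"):
        raise ValueError(f"not a ZS identifier: {identifier!r}")
    body, _, instance = identifier[3:].partition("#")
    original, _, combination = body.partition("/")
    return {
        "original": original.replace("_", " "),
        "combination": (combination or original).replace("_", " "),
        "instance": int(instance) if instance else None,
    }


if __name__ == "__main__":
    print(zs_name_id("Feronia sodalis"))
    print(zs_usage_id("Feronia sodalis", "Cyclotrachelus sodalis", 2345))
    print(parse_zs_id("ZS-Feronia_sodalis/Abax_sodalis#324"))
```

Because the original binomen is always the first part of the string, every later combination can be traced back to it by reading the identifier alone, without a lookup.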

Best regards,
Wolfgang
-----------------------------------------
Wolfgang Lorenz, Tutzing, Germany
_______________________________________________

Taxacom Mailing List
Taxacom at mailman.nhm.ku.edu
http://mailman.nhm.ku.edu/mailman/listinfo/taxacom

The Taxacom archive going back to 1992 may be searched with either of these methods:

(1) http://taxacom.markmail.org

Or (2) a Google search specified as:  site:mailman.nhm.ku.edu/pipermail/taxacom  your search terms here
