[Taxacom] ZooBank Progress

Paul Kirk P.Kirk at kew.org
Sun Apr 28 01:59:09 CDT 2013


See below for my comments, each marked with a leading "...".

________________________________________
From: taxacom-bounces at mailman.nhm.ku.edu [taxacom-bounces at mailman.nhm.ku.edu] On Behalf Of Francisco Welter-Schultes [fwelter at gwdg.de]
Sent: 28 April 2013 00:25
To: Richard Pyle
Cc: 'TAXACOM'; Robert Whitton
Subject: Re: [Taxacom] ZooBank Progress

Thanks a lot once again Rich, for putting so much detailed knowledge and
skills into this project.

Let me select some of your onlist questions.

Link(s) to the original source from ZooBank:

> Should it include all pages that the
> name appears on, or just ones that have nomenclatural relevance?

Only those that have nomenclatural relevance. And only the first page(s).

> In the
> latter case, how would "nomenclatural relevance" be defined?

If one page: Only the first page of the very occasion where the name was
mentioned and equipped with something that made this name available.

... at that point where it is clear that the author intended to formally introduce a new scientific name (i.e. a Code governed event); this would exclude names in the title, abstract, introduction, a tree etc

If several pages: these other pages must bear high nomenclatural relevance
(the identity or spelling of the name would be different if that other
page would not exist).

... the spelling can be correct (Code compliant) on the page identified above or correctable - it matters not imho in identifying the page

We have special cases where a name was mentioned, and on another page in
the same work the description was given and the name was NOT repeated -
and where that other page was published at a later date. In that case the
name was mentioned at the first date and made available at the second
date. Here both pages should be given.

... in most cases the description or diagnosis (as required by the Code) immediately follows the name but it may be on the next page. If the description or diagnosis is on a discontinuous page and the name is not mentioned again, the page with the name on is still the relevant page, but if the name is mentioned again (with the description or diagnosis) I would determine that this is where the author intended to formally introduce the name. If the name and description or diagnosis were published on different dates (even if these are in the same journal/book) these are separate publications and should be treated as such, Code compliance being determined based on this.
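
One possible reading of the comment above, written out as a small decision procedure. This is only an illustrative sketch: the `Occurrence` record and the `relevant_page` function are hypothetical names invented here, not part of ZooBank or of the Code, and the rule is simplified to the three situations the comment mentions.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Occurrence:
    """Hypothetical record of where a new name and its description appear."""
    name_page: int                 # page where the name is first given
    name_date: date                # publication date of that page
    description_page: int          # page carrying the description or diagnosis
    description_date: date         # publication date of that page
    name_repeated_with_description: bool = False


def relevant_page(occ: Occurrence) -> Optional[int]:
    """Return the single page to link, or None for separate publications."""
    if occ.name_date != occ.description_date:
        # Different dates: treated as separate publications, each to be
        # assessed on its own, so no single page is chosen here.
        return None
    if occ.name_repeated_with_description:
        # The name is given again together with the description: take that
        # as the place where the author intended to introduce the name.
        return occ.description_page
    # Otherwise the page bearing the name stays the relevant page, whether
    # the description follows immediately or sits on a discontinuous page.
    return occ.name_page
```

For example, a name on page 10 whose description follows on page 11 at the same date gives page 10, while the same name repeated with its description on page 50 gives page 50.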

> Here's a question for people on this list:  In such cases, should both
> pages be displayed as thumbnails?

I would say yes, this would certainly be useful.

... I would say no, as I am of the opinion that it is always possible to 'fix' one page as where the name (or the first letter of the name) was intended by the author to be formally introduced. OK, it may in some cases be somewhat arbitrary but if we all agree that's OK ... isn't it.

> Is there an upper limit to how many pages can
> be displayed for a single name?

My personal experience points to an upper limit of 3 pages.
In Philippe Bouchet's Index Rocroi of 30,000 generic names of molluscs I
saw some instances with 4 pages given. But in those cases which I verified
I came to the conclusion that not all citations were necessary. I never
saw a case with 5 or more necessary pages.

... never say never :-)

> Is there a clear policy for what situations call for linking to multiple
> pages?

A clear policy would be useful. In the AnimalBase team we checked about
50,000 names manually at the original sources and developed some internal
policies or guidelines for this purpose.

- the usual case: If the new name was mentioned, and on the same page the
description was given, or a bibliographical reference to one, or a link to
a figure was given, then cross-link to that page.

- Cross-link only to the first page (also if the description extends on
various pages, with or without interruptions).

- If the description started several pages previously and the name was
mentioned only at the end of the description, then do not cross-link to
the first page, but to the page where the name was mentioned, at the end
of the description.

- If you cross-link to one single page, whichever page you select: the
name must have been printed on that page. Never cross-link to one single
page if the name was not mentioned on it.

- If the name was only mentioned in the title of the article and nowhere
else in the article, then cross-link to the title page.

- If the name was mentioned in the title and inside the text, then
cross-link to the page inside the text.

- If the name was mentioned at one occasion and it is unclear to the
non-insider where the description was (or bibliographical reference or
figure), then cross-link to both pages.

- In cases of genera, if the name was established with a description at
one instance, and if it is unclear to the non-insider where the species
were included elsewhere in the text, then cross-link to both pages
(description of genus, and included species). If other species were
included once again surprisingly on another page, then add also that third
page (Swainson 1840 presented such a chaos).

- If the name was mentioned in the text without description, and if the
name was also printed on a plate with a figure, then cross-link only to
the plate, not to the text page.

- If the name was given in two different spellings on two different pages,
then cross-link to those two different pages. If three different spellings
were given, then cross-link to those three pages.

- If a new name was mentioned on both a text page with description, and on
a plate with a figure, then cross-link to the text page. You can also add
the plate, particularly if the plate was not mentioned in the text.

- Avoid cross-linking to plates if this is not necessary.

- Avoid cross-linking to plates if no names were mentioned on them.

- If you cross-link to two separate pages (a plate is a page in this
sense), then NEVER cross-link to pages that were published at different
dates if the name was made available at the first date.

- If the name was mentioned on one page with a reference to a description
or figure that was published in the same work on another page at a later
date, then cross-link to both pages, and explain this in a comment.
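
A few of the single-page guidelines listed above can be sketched as code. This is a minimal illustration under stated assumptions, not AnimalBase's actual implementation: the `Page` record, its fields and the function `pages_to_link` are hypothetical, and only the single-page cases are covered. Cases that need two or three links (several spellings, scattered descriptions, different publication dates) are deliberately left for manual treatment.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Page:
    """Hypothetical description of one page (a plate counts as a page)."""
    label: str                     # e.g. "408" or "pl. 3"
    kind: str                      # "title", "text" or "plate"
    has_name: bool                 # the new name is printed on this page
    has_description: bool = False  # description, reference to one, or figure link
    has_figure: bool = False       # a figure of the taxon (for plates)


def pages_to_link(pages: List[Page]) -> List[str]:
    """Apply a subset of the guidelines; `pages` is in order of appearance."""
    # Never link to a single page on which the name was not printed.
    named = [p for p in pages if p.has_name]
    text = [p for p in named if p.kind == "text"]
    plates = [p for p in named if p.kind == "plate" and p.has_figure]

    # Usual case: name and description on the same text page; first page only.
    for p in text:
        if p.has_description:
            return [p.label]
    # Name in the text without description, and also on a plate with a figure.
    if text and plates:
        return [plates[0].label]
    # Name inside the text (preferred over the title page, and also the case
    # where the name stands at the end of a longer description). A second
    # page may still have to be added by hand.
    if text:
        return [text[0].label]
    # Name only in the title of the article.
    titles = [p for p in named if p.kind == "title"]
    if titles:
        return [titles[0].label]
    # Everything else is left for manual treatment with a comment.
    return []
```

Returning a list rather than a single page keeps room for the multi-page cases, which would need additional fields (spelling, publication date) that this sketch does not model.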

In AnimalBase we have the option to add verbal comments, this makes it
easier to handle weird and complex cases.

Maybe we have some more internal rules, but I think these are more or less
the most important ones.

The first rule covers more than 95 % of the cases.
My experience is that the rare ugly and difficult cases accumulate among
those where researchers tend to search for help. So for us, it is worth
keeping an eye on them.

All the best
Francisco



> Many thanks, Francisco!
>
>> In 2003 we applied to the German Science Foundation to link the Linnean
>> names with the original sources on page level. They allowed us to create
>> AnimalBase, but denied our request for a page level linking, arguing that
>> this old literature is not of any interest any more today...
>> 10 years later you are doing exactly this and the results look great! I am
>> really happy that you finally did it. This is very useful.
>
> Good to know!  The intention of GNA is to build these cross-links very
> broadly, and in a way that's very easy to access -- either by downloading a
> copy of the cross-link index, or by using simple services to access it in
> real time.  Back in the earliest days of ZooBank I wanted to establish a few
> examples of this sort of cross-linking functionality, and one of the
> cross-links I established was with AnimalBase. For example, if you go to the
> record for Linnaeus 1758 (which is a bit slow to load, because there are
> nearly 5000 names that are loaded for this publication), you can see the
> list of icons under the ZooBank LSID for all the cross-links:
> http://zoobank.org/References/2C6327E1-5560-4DB4-B9CA-76A0FA03D975
> One of them is the record for AnimalBase.
>
> What I would *love* to do is establish all the cross-links between
> AnimalBase Authors, literature and names (I think you and I discussed this
> once -- I believe we were on the Tube in London at the time....)  That way,
> any link to BHL that is made via ZooBank, would simultaneously be made for
> AnimalBase.  Thus, you could either have a copy of the cross-linked index on
> your server that allows a page-link to BHL to appear on AnimalBase whenever
> someone establishes one on ZooBank; or we could make a simple service on
> GNUB (where the cross-links actually reside) such that your page could call
> the service for a given record of yours, and get back all the cross-links to
> other databases that GNUB knows about (including BHL, and others).  For
> example, if you passed in your AnimalBase reference identifier "4" to this
> service (as well as an indication that the "4" applies to an AnimalBase
> Reference identifier), then the service would give you back something that
> looks like this:
>
> ------------------------------------------
> Domain:       Amphibian Species of the World References
> Identifier:   2794
> Link:
> http://research.amnh.org/vz/herpetology/amphibia/?action=bib&id=2794
> ------------------------------------------
> Domain:       AnimalBase References
> Identifier:   4
> Link:
> http://www.animalbase.uni-goettingen.de/zooweb/servlet/AnimalBase/home/reference?id=4
> ------------------------------------------
> Domain:       Biodiversity Heritage Library Title
> Identifier:   542
> Link:         http://www.biodiversitylibrary.org/bibliography/542
> ------------------------------------------
> Domain:       Botanicus
> Identifier:   b12066783
> Link:         http://www.botanicus.org/bibliography/b12066783
> ------------------------------------------
> Domain:       Digital Object Identifier
> Identifier:   10.5962/bhl.title.542
> Link:         http://dx.doi.org/10.5962/bhl.title.542
> ------------------------------------------
> Domain:       Eschmeyer References
> Identifier:   2787
> Link:
> http://researcharchive.calacademy.org/research/ichthyology/catalog/getref.asp?id=2787
> ------------------------------------------
> Domain:       FishBase - References
> Identifier:   1652
> Link:         http://www.fishbase.org/References/FBRefSummary.php?id=1652
> ------------------------------------------
> Domain:       Gallica
> Identifier:   ark:/12148/bpt6k99004c
> Link:         http://gallica.bnf.fr/ark:/12148/bpt6k99004c
> ------------------------------------------
> Domain:       Google Books
> Identifier:   yZM5AAAAcAAJ
> Link:         http://books.google.com/books?id=yZM5AAAAcAAJ
> ------------------------------------------
> Domain:       Hymenoptera Online - References
> Identifier:   978
> Link:         http://hol.osu.edu/reference-full.html?id=978
> ------------------------------------------
> Domain:       International Plant Names Index - Publications
> Identifier:   14210-2
> Link:         http://www.ipni.org/ipni/idPublicationSearch.do?id=14210-2
> ------------------------------------------
> Domain:       Internet Archive Text Records
> Identifier:   mobot31753000798865
> Link:         http://archive.org/details/mobot31753000798865
> ------------------------------------------
> Domain:       Library of Congress Control Number
> Identifier:   6017147
> Link:         http://lccn.loc.gov/06017147
> ------------------------------------------
> Domain:       Tropicos Reference Records
> Identifier:   1254
> Link:         http://www.tropicos.org/Publication/1254
> ------------------------------------------
> Domain:       World Registry of Marine Species - Reference
> Identifier:   8
> Link:         http://www.marinespecies.org/aphia.php?p=sourcedetails&id=8
> ------------------------------------------
> Domain:       ZooBank Publication
> Identifier:
> urn:lsid:zoobank.org:pub:2C6327E1-5560-4DB4-B9CA-76A0FA03D975
> Link:
> http://zoobank.org/urn:lsid:zoobank.org:pub:2C6327E1-5560-4DB4-B9CA-76A0FA03D975
> ------------------------------------------
>
> Obviously, we could include a whole bunch of other metadata as well (such as
> a link to the icon, more information about the Domain, etc.)
>
> The nice thing is that there is no limit to the number of cross-links we can
> establish using this system.  It's just a matter of building those links --
> which involves a process that can be partially automated, but also involves
> a lot of manual proofing.  This is where the "crowd sourcing" concept comes
> into play.
>
> In any case, Francisco -- if you're interested in building these crosslinks
> for AnimalBase, please contact me off-list and I can describe the process
> for doing this.  We have some online tools that are designed to make this
> process as painless as possible (for example, Donat Agosti cross-linked all
> 800 journals from Hymenoptera Name Server in a few days using these tools).
> This, of course, applies to anyone on Taxacom who has a dataset with
> resolvable identifiers that would like to be cross-linked.  Even if you
> don't have resolvable identifiers, it's still worth building the cross-links
> so that your system can cross-link your records to all these other records
> and build links to them from your web pages.
>
>> 1 - Sometimes you have incorrect names in your database. How can we
>> correct them?
>
> ZooBank already allows for editing records.  There was a ZooBank Users
> Policy that was circulated to this list a while ago that explains who can
> edit what sorts of content. That Policy is in the final stages of
> ratification, but in brief, people with self-created (unverified) user
> accounts can edit their own records; verified users (someone has confirmed
> that you are who you say you are) can also edit records that he or she is an
> author of (even if that person didn't create the record); "Editors" can edit
> all content.
>
>> 2 - Sometimes the original source is not 1 page but 2 pages, which need
>> 2 separate page links. In rare cases even 3 (I know some of those cases
>> in Swainson 1840).
>
> Yes, I've given this a lot of thought.  Of course, there is absolutely
> nothing preventing us from linking to multiple pages -- that's easy.
> However, there seems to be a general consensus among the GNA-folk that one
> page should be selected as the specific page on which a name comes into
> existence.  Paul Kirk: this is your cue to chime in on this topic.
>
>> An example with both cases at once is here:
>>
>> http://zoobank.org/NomenclaturalActs/668885BB-153E-4A55-A65E-BA4EFA302618
>>
>> ZooBank says Buprestis gnita, correct would be Buprestis ignita Linnæus,
>> 1758.
>> Page numbers: 408, 824, both are important and should be linked.
>
> Ah!  Good catch, and good point!  Question to Paul Kirk:  How would you deal
> with this sort of situation?
>
> For now, I added the second page link, I corrected the spelling of the name,
> and I made a note about the pagination (go to the link above to see how I
> dealt with it).  For now, I think we'll keep the ZooBank webpage to only
> allow one page link to be made, and then I'll manage these exceptional cases
> for now.  We can always update the web page interface later, if we decide
> this is something that happens more often.
>
> Here's a question for people on this list:  In such cases, should both pages
> be displayed as thumbnails?  Is there an upper limit to how many pages can
> be displayed for a single name?  Is there a clear policy for what situations
> call for linking to multiple pages?  Should it include all pages that the
> name appears on, or just ones that have nomenclatural relevance?  In the
> latter case, how would "nomenclatural relevance" be defined?  These are the
> kinds of questions that the community should address.  My hope/dream/vision
> is that GNUB will become a resource of the taxonomic community, owned and
> maintained by the taxonomic community, and perpetuated for the taxonomic
> community (to paraphrase one of the more well-known former U.S. presidents).
> Thus, the taxonomic community should be the ones discussing these sorts of
> questions.
>
> In any case, I'll try to find the time to make similar corrections to all
> the names that appear on page 824.
>
>> Here you can see a list with human-corrected Linnean 1758 new names:
>>
>> http://www.animalbase.uni-goettingen.de/zooweb/servlet/AnimalBase/list/taxa?from_reference=4
>
> Now, that's a link that I (and ZooBank) already knew about! (see above)
>
> Thanks again for your very helpful feedback!
>
> Aloha,
> Rich
>
>
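
The cross-link listing in the quoted message above has a simple shape: blocks of Domain, Identifier and Link separated by dashed lines, with long values sometimes placed on the following line. The sketch below shows how a client might read such a listing into records. It is only an illustration under stated assumptions: the message presents this output as "something that looks like this", so the format is not a documented GNUB response, and `parse_crosslinks` is a hypothetical helper name. Quote markers (`> `) are assumed to have been stripped already.

```python
from typing import Dict, List, Optional

FIELDS = ("Domain", "Identifier", "Link")


def parse_crosslinks(text: str) -> List[Dict[str, str]]:
    """Parse 'Domain / Identifier / Link' blocks separated by dashed lines."""
    records: List[Dict[str, str]] = []
    current: Dict[str, str] = {}
    field: Optional[str] = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("-----"):
            # A dashed line closes the block collected so far, if any.
            if current:
                records.append(current)
            current, field = {}, None
            continue
        if not line:
            continue
        key, sep, value = line.partition(":")
        if sep and key in FIELDS:
            field = key
            current[field] = value.strip()
        elif field is not None:
            # Continuation line: a long value placed on the next line.
            current[field] += line
    if current:
        records.append(current)
    return records


sample = """\
------------------------------------------
Domain:       AnimalBase References
Identifier:   4
Link:
http://www.animalbase.uni-goettingen.de/zooweb/servlet/AnimalBase/home/reference?id=4
------------------------------------------
Domain:       Biodiversity Heritage Library Title
Identifier:   542
Link:         http://www.biodiversitylibrary.org/bibliography/542
------------------------------------------
"""

records = parse_crosslinks(sample)
links = {r["Domain"]: r["Link"] for r in records}
```

With the records keyed by domain, a page for AnimalBase reference 4 could look up `links["Biodiversity Heritage Library Title"]` to show the corresponding BHL link.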


Francisco Welter-Schultes
Zoologisches Institut, Berliner Str. 28, D-37073 Goettingen
Phone +49 551 395536
http://www.animalbase.org





