[Taxacom] ZooBank Progress

Richard Pyle deepreef at bishopmuseum.org
Sat Apr 27 20:02:33 CDT 2013


Thanks, Francisco, for this very well-thought-out and detailed reply!  I
will digest it a bit, then come back to you again early next week with some
follow-up questions.

This is really good stuff.

Aloha,
Rich

> -----Original Message-----
> From: Francisco Welter-Schultes [mailto:fwelter at gwdg.de]
> Sent: Saturday, April 27, 2013 1:26 PM
> To: Richard Pyle
> Cc: 'Francisco Welter-Schultes'; 'Roderic Page'; 'TAXACOM'; Robert Whitton
> Subject: RE: [Taxacom] ZooBank Progress
> 
> Thanks a lot once again Rich, for putting so much detailed knowledge and
> skills into this project.
> 
> Let me select some of your onlist questions.
> 
> Link(s) to the original source from ZooBank:
> 
> > Should it include all pages that the
> > name appears on, or just ones that have nomenclatural relevance?
> 
> Only those that have nomenclatural relevance. And only the first page(s).
> 
> > In the
> > latter case, how would "nomenclatural relevance" be defined?
> 
> If one page: Only the first page of the very occasion where the name was
> mentioned and equipped with something that made this name available.
> 
> If several pages: these other pages must bear high nomenclatural relevance
> (the identity or spelling of the name would be different if that other page
> did not exist).
> 
> We have special cases where a name was mentioned, and on another page
> in the same work the description was given and the name was NOT repeated
> - and where that other page was published at a later date. In that case the
> name was mentioned at the first date and made available at the second
> date. Here both pages should be given.
> 
> 
> > Here's a question for people on this list:  In such cases, should both
> > pages be displayed as thumbnails?
> 
> I would say yes, this would certainly be useful.
> 
> > Is there an upper limit to how many pages can be displayed for a
> > single name?
> 
> My personal experience points to an upper limit of 3 pages.
> In Philippe Bouchet's Index Rocroi of 30,000 generic names of molluscs I saw
> some instances with 4 pages given. But in those cases which I verified I came
> to the conclusion that not all citations were necessary. I never saw a case
> with 5 or more necessary pages.
> 
> > Is there a clear policy for what situations
> > call for linking to multiple pages?
> 
> A clear policy would be useful. In the AnimalBase team we checked about
> 50,000 names manually at the original sources and developed some internal
> policies or guidelines for this purpose.
> 
> - The usual case: If the new name was mentioned, and on the same page the
> description was given, or a bibliographical reference to one, or a link to a
> figure was given, then cross-link to that page.
> 
> - Cross-link only to the first page (also if the description extends over
> various pages, with or without interruptions).
> 
> - If the description started several pages previously and the name was
> mentioned only at the end of the description, then do not cross-link to the
> first page, but to the page where the name was mentioned, at the end of
> the description.
> 
> - If you cross-link to one single page, whichever page you select: the name
> must have been printed on that page. Never cross-link to one single page if
> the name was not mentioned on it.
> 
> - If the name was only mentioned in the title of the article and nowhere else
> in the article, then cross-link to the title page.
> 
> - If the name was mentioned in the title and inside the text, then cross-link
> to the page inside the text.
> 
> - If the name was mentioned on one occasion and it is unclear to the
> non-insider where the description was (or bibliographical reference or
> figure), then cross-link to both pages.
> 
> - In cases of genera, if the name was established with a description at one
> instance, and if it is unclear to the non-insider where the species were
> included elsewhere in the text, then cross-link to both pages (description of
> genus, and included species). If other species were, surprisingly, included
> once again on another page, then also add that third page (Swainson 1840
> presented such a chaos).
> 
> - If the name was mentioned in the text without description, and if the name
> was also printed on a plate with a figure, then cross-link only to the plate,
> not to the text page.
> 
> - If the name was given in two different spellings on two different pages,
> then cross-link to those two different pages. If three different spellings
> were given, then cross-link to those three pages.
> 
> - If a new name was mentioned on both a text page with description, and on
> a plate with a figure, then cross-link to the text page. You can also add the
> plate, particularly if the plate was not mentioned in the text.
> 
> - Avoid cross-linking to plates if this is not necessary.
> 
> - Avoid cross-linking to plates if no names were mentioned on them.
> 
> - If you cross-link to two separate pages (a plate is a page in this sense),
> then NEVER cross-link to pages that were published at different dates if the
> name was made available at the first date.
> 
> - If the name was mentioned on one page with a reference to a description
> or figure that was published in the same work on another page at a later
> date, then cross-link to both pages, and explain this in a comment.
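
A few of the guidelines above can be sketched in code. This is only an illustrative sketch, not AnimalBase's or ZooBank's actual logic: the `Occurrence` record and the `pages_to_link` function are hypothetical names, and only a subset of the rules is covered (different spellings, title page versus text page, the usual case, and a name separated from its description). Plates and publication dates are deliberately left out.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Occurrence:
    """A page relevant to a new name (hypothetical model, listed in page order)."""
    page: str                      # page label as printed
    spelling: str = ""             # name as printed on this page; "" if absent
    makes_available: bool = False  # description, bibliographic reference or figure link
    is_title: bool = False         # True for the title page of the article


def pages_to_link(occurrences: List[Occurrence]) -> List[str]:
    """Return the page labels to cross-link, following a subset of the guidelines."""
    named = [o for o in occurrences if o.spelling]
    if not named:
        # Never cross-link to a single page on which the name is not printed.
        return []

    # Different spellings on different pages: link the first page of each spelling.
    first_by_spelling = {}
    for o in named:
        first_by_spelling.setdefault(o.spelling, o.page)
    if len(first_by_spelling) > 1:
        return list(first_by_spelling.values())

    # Prefer a page inside the text; use the title page only if nothing else exists.
    candidates = [o for o in named if not o.is_title] or named

    # Usual case: the first page where the name is printed together with
    # something that made it available.
    for o in candidates:
        if o.makes_available:
            return [o.page]

    # The name is on one page and the description (or reference, or figure)
    # is on another page that does not repeat the name: link both.
    for o in occurrences:
        if o.makes_available and not o.spelling:
            return [candidates[0].page, o.page]
    return [candidates[0].page]


# Made-up example: name on page 10, description on page 40 without the name.
print(pages_to_link([Occurrence("10", "Aus bus"),
                     Occurrence("40", makes_available=True)]))
```

The rare, difficult cases would still need a free-text comment, as noted in the message; a function like this could at most propose a default for a human to confirm.
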
> 
> In AnimalBase we have the option to add verbal comments, this makes it
> easier to handle weird and complex cases.
> 
> Maybe we have some more internal rules, but I think these are more or less
> the most important ones.
> 
> The first rule covers more than 95 % of the cases.
> My experience is that the rare ugly and difficult cases accumulate among
> those where researchers tend to search for help. So for us, it's worth keeping
> an eye on them.
> 
> All the best
> Francisco
> 
> 
> 
> > Many thanks, Francisco!
> >
> >> In 2003 we applied to the German Science Foundation to link, at page
> >> level, the Linnean names with the original sources. They allowed us to
> >> create AnimalBase, but denied our request for page-level linking, arguing
> >> that this old literature is not of any interest any more today...
> >> 10 years later you are doing exactly this and the results look great! I am
> >> really happy that you finally did it. This is very useful.
> >
> > Good to know!  The intention of GNA is to build these cross-links very
> > broadly, and in a way that's very easy to access -- either by
> > downloading a copy of the cross-link index, or by using simple
> > services to access it in real time.  Back in the earliest days of
> > ZooBank I wanted to establish a few examples of this sort of
> > cross-linking functionality, and one of the cross-links I established
> > was with AnimalBase. For example, if you go to the record for Linnaeus
> > 1758 (which is a bit slow to load, because there are nearly 5000 names
> > that are loaded for this publication), you can see the list of icons
> > under the ZooBank LSID for all the cross-links:
> > http://zoobank.org/References/2C6327E1-5560-4DB4-B9CA-76A0FA03D975
> > One of them is the record for AnimalBase.
> >
> > What I would *love* to do is establish all the cross-links between
> > AnimalBase Authors, literature and names (I think you and I discussed
> > this once -- I believe we were on the Tube in London at the time....)
> > That way, any link to BHL that is made via ZooBank, would
> > simultaneously be made for AnimalBase.  Thus, you could either have a
> > copy of the cross-linked index on your server that allows a page-link
> > to BHL to appear on AnimalBase whenever someone establishes one on
> > ZooBank; or we could make a simple service on GNUB (where the
> > cross-links actually reside) such that your page could call the
> > service for a given record of yours, and get back all the cross-links
> > to other databases that GNUB knows about (including BHL, and others).
> > For example, if you passed in your AnimalBase reference identifier "4"
> > to this service (as well as an indication that the "4" applies to an
> > AnimalBase Reference identifier), then the service would give you back
> > something that looks like this:
> >
> > ------------------------------------------
> > Domain:	Amphibian Species of the World References
> > Identifier:	2794
> > Link:		http://research.amnh.org/vz/herpetology/amphibia/?action=bib&id=2794
> > ------------------------------------------
> > Domain:	AnimalBase References
> > Identifier:	4
> > Link:		http://www.animalbase.uni-goettingen.de/zooweb/servlet/AnimalBase/home/reference?id=4
> > ------------------------------------------
> > Domain:	Biodiversity Heritage Library Title
> > Identifier:	542
> > Link:		http://www.biodiversitylibrary.org/bibliography/542
> > ------------------------------------------
> > Domain:	Botanicus
> > Identifier:	b12066783
> > Link:		http://www.botanicus.org/bibliography/b12066783
> > ------------------------------------------
> > Domain:	Digital Object Identifier
> > Identifier:	10.5962/bhl.title.542
> > Link:		http://dx.doi.org/10.5962/bhl.title.542
> > ------------------------------------------
> > Domain:	Eschmeyer References
> > Identifier:	2787
> > Link:		http://researcharchive.calacademy.org/research/ichthyology/catalog/getref.asp?id=2787
> > ------------------------------------------
> > Domain:	FishBase - References
> > Identifier:	1652
> > Link:		http://www.fishbase.org/References/FBRefSummary.php?id=1652
> > ------------------------------------------
> > Domain:	Gallica
> > Identifier:	ark:/12148/bpt6k99004c
> > Link:		http://gallica.bnf.fr/ark:/12148/bpt6k99004c
> > ------------------------------------------
> > Domain:	Google Books
> > Identifier:	yZM5AAAAcAAJ
> > Link:		http://books.google.com/books?id=yZM5AAAAcAAJ
> > ------------------------------------------
> > Domain:	Hymenoptera Online - References
> > Identifier:	978
> > Link:		http://hol.osu.edu/reference-full.html?id=978
> > ------------------------------------------
> > Domain:	International Plant Names Index - Publications
> > Identifier:	14210-2
> > Link:		http://www.ipni.org/ipni/idPublicationSearch.do?id=14210-2
> > ------------------------------------------
> > Domain:	Internet Archive Text Records
> > Identifier:	mobot31753000798865
> > Link:		http://archive.org/details/mobot31753000798865
> > ------------------------------------------
> > Domain:	Library of Congress Control Number
> > Identifier:	6017147
> > Link:		http://lccn.loc.gov/06017147
> > ------------------------------------------
> > Domain:	Tropicos Reference Records
> > Identifier:	1254
> > Link:		http://www.tropicos.org/Publication/1254
> > ------------------------------------------
> > Domain:	World Registry of Marine Species - Reference
> > Identifier:	8
> > Link:		http://www.marinespecies.org/aphia.php?p=sourcedetails&id=8
> > ------------------------------------------
> > Domain:	ZooBank Publication
> > Identifier:	urn:lsid:zoobank.org:pub:2C6327E1-5560-4DB4-B9CA-76A0FA03D975
> > Link:		http://zoobank.org/urn:lsid:zoobank.org:pub:2C6327E1-5560-4DB4-B9CA-76A0FA03D975
> > ------------------------------------------
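
A client could turn a listing like the sample above into structured records. The following is a minimal sketch only: the service described here was a proposal, so the text layout (dashed separators and `Domain:` / `Identifier:` / `Link:` fields, with the quote markers stripped) is an assumption taken from the sample, and `parse_crosslinks` is a hypothetical helper, not part of any GNUB or ZooBank API.

```python
from typing import Dict, List

FIELDS = ("Domain", "Identifier", "Link")


def parse_crosslinks(text: str) -> List[Dict[str, str]]:
    """Parse dash-separated 'Field: value' blocks into a list of dicts."""
    records: List[Dict[str, str]] = []
    current: Dict[str, str] = {}
    key = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if set(line) == {"-"}:
            # A row of dashes closes the current record, if there is one.
            if current:
                records.append(current)
            current, key = {}, None
            continue
        head, sep, rest = line.partition(":")
        if sep and head in FIELDS:
            key = head
            current[key] = rest.strip()
        elif key is not None:
            # A value wrapped onto the next line is appended to the open field.
            current[key] += line
    if current:
        records.append(current)
    return records


sample = """
------------------------------------------
Domain:     Biodiversity Heritage Library Title
Identifier: 542
Link:       http://www.biodiversitylibrary.org/bibliography/542
------------------------------------------
Domain:     ZooBank Publication
Identifier:
urn:lsid:zoobank.org:pub:2C6327E1-5560-4DB4-B9CA-76A0FA03D975
Link:
http://zoobank.org/urn:lsid:zoobank.org:pub:2C6327E1-5560-4DB4-B9CA-76A0FA03D975
------------------------------------------
"""

for record in parse_crosslinks(sample):
    print(record["Domain"], "->", record["Link"])
```

Matching field names explicitly, rather than splitting on every colon, keeps identifiers such as `urn:lsid:...` and URLs intact when they wrap onto their own line.
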
> >
> > Obviously, we could include a whole bunch of other metadata as well
> > (such as a link to the icon, more information about the Domain, etc.)
> >
> > The nice thing is that there is no limit to the number of cross-links
> > we can establish using this system.  It's just a matter of building
> > those links -- which involves a process that can be partially automated,
> > but also involves a lot of manual proofing.  This is where the "crowd
> > sourcing" concept comes into play.
> >
> > In any case, Francisco -- if you're interested in building these
> > cross-links for AnimalBase, please contact me off-list and I can
> > describe the process for doing this.  We have some online tools that
> > are designed to make this process as painless as possible (for
> > example, Donat Agosti cross-linked all 800 journals from Hymenoptera
> > Name Server in a few days using these tools).
> > This, of course, applies to anyone on Taxacom who has a dataset with
> > resolvable identifiers that would like to be cross-linked.  Even if
> > you don't have resolvable identifiers, it's still worth building the
> > cross-links so that your system can cross-link your records to all
> > these other records and build links to them from your web pages.
> >
> >> 1 - Sometimes you have incorrect names in your database. How can we
> >> correct them?
> >
> > ZooBank already allows for editing records.  There was a ZooBank Users
> > Policy that was circulated to this list a while ago that explains who
> > can edit what sorts of content. That Policy is in the final stages of
> > ratification, but in brief, people with self-created (unverified) user
> > accounts can edit their own records; verified users (someone has
> > confirmed that you are who you say you are) can also edit records that
> > he or she is an author of (even if that person didn't create the
> > record); "Editors" can edit all content.
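
The editing rules summarized above can be written as a small check. This is a sketch of the brief summary only, not the ratified ZooBank Users Policy; the `Role` enum and `can_edit` function are hypothetical names.

```python
from enum import Enum


class Role(Enum):
    UNVERIFIED = "unverified"  # self-created account, identity not confirmed
    VERIFIED = "verified"      # identity confirmed by someone
    EDITOR = "editor"


def can_edit(role: Role, created_record: bool, is_author: bool) -> bool:
    """Return True if a user with this role may edit the record."""
    if role is Role.EDITOR:
        return True   # Editors can edit all content.
    if created_record:
        return True   # Any user can edit records they created themselves.
    # Verified users can also edit records they are an author of,
    # even if someone else created the record.
    return role is Role.VERIFIED and is_author


print(can_edit(Role.UNVERIFIED, created_record=False, is_author=True))
```
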
> >
> >> 2 - Sometimes not just 1 page represents the original source, but 2
> >> pages, which need 2 separate page links. In rare cases even 3 (I know
> >> some of those cases in Swainson 1840).
> >
> > Yes, I've given this a lot of thought.  Of course, there is absolutely
> > nothing preventing us from linking to multiple pages -- that's easy.
> > However, there seems to be a general consensus among the GNA-folk that
> > one page should be selected as the specific page on which a name comes
> > into existence.  Paul Kirk: this is your cue to chime in on this topic.
> >
> >> An example with both cases at once is here:
> >>
> >> http://zoobank.org/NomenclaturalActs/668885BB-153E-4A55-A65E-BA4EFA302618
> >>
> >> ZooBank says Buprestis gnita, correct would be Buprestis ignita
> >> Linnæus, 1758.
> >> Page numbers: 408, 824, both are important and should be linked.
> >
> > Ah!  Good catch, and good point!  Question to Paul Kirk:  How would
> > you deal with this sort of situation?
> >
> > I have added the second page link, corrected the spelling of the
> > name, and made a note about the pagination (go to the link above to
> > see how I dealt with it).  For now, I think we'll keep the ZooBank
> > webpage to only allow one page link to be made, and I'll manage
> > these exceptional cases myself.  We can always update the web page
> > interface later, if we decide this is something that happens more
> > often.
> >
> > Here's a question for people on this list:  In such cases, should both
> > pages be displayed as thumbnails?  Is there an upper limit to how many
> > pages can be displayed for a single name?  Is there a clear policy for
> > what situations call for linking to multiple pages?  Should it include
> > all pages that the name appears on, or just ones that have
> > nomenclatural relevance?  In the latter case, how would "nomenclatural
> > relevance" be defined?  These are the kinds of questions that the
> > community should address.  My hope/dream/vision is that GNUB will
> > become a resource of the taxonomic community, owned and maintained by
> > the taxonomic community, and perpetuated for the taxonomic community
> > (to paraphrase one of the more well-known former U.S. presidents).
> > Thus, the taxonomic community should be the ones discussing these
> > sorts of questions.
> >
> > In any case, I'll try to find the time to make similar corrections to
> > all the names that appear on page 824.
> >
> >> Here you can see a list with human-corrected Linnean 1758 new names:
> >>
> >>
> >> http://www.animalbase.uni-goettingen.de/zooweb/servlet/AnimalBase/list/taxa?from_reference=4
> >
> > Now, that's a link that I (and ZooBank) already knew about! (see
> > above)
> >
> > Thanks again for your very helpful feedback!
> >
> > Aloha,
> > Rich
> >
> >
> 
> 
> Francisco Welter-Schultes
> Zoologisches Institut, Berliner Str. 28, D-37073 Goettingen Phone +49 551
> 395536 http://www.animalbase.org




