[Taxacom] Important note Re: two names online published - one new species

Thu Jan 28 14:56:09 CST 2016

The issue is that we neither know of nor have access to the publications and the names therein. If all articles had to be registered at ZooBank, irrespective of whether they are e-only or not, with a PDF copy available and the names registered at ZooBank, then we would have this problem solved at once.

We have all this in place, and no technology needs to be developed, but we keep bridling at this option and keep discussing things that we will not and cannot control with our system.

Furthermore, if we want taxonomy to play a role in the life sciences, we need to convert to such a system: a system that also allows mining content or, even better, provides the content in a form that third parties can use and link, and thus makes our data part of big data.

Only this openness will raise the value of new research, new data, and the creation of specialists who can make sound taxonomic (scientific) decisions.

Again, this discussion on this listserv is a great disservice to the community, not least because priority is such a minuscule problem in understanding the diversity of life. It just gives the wrong impression of where the priorities of our community are. The problem, the huge, murderous problem, is that even today we do not know what we describe as new species or what they look like, and cannot provide a link from GenBank or BOLD to the respective taxonomic treatment, one that everybody can consult, that links to external resources, and whose data can ultimately be used for their own purposes, one of the most important being to save the diversity of life.

Donat

-----Original Message-----
From: Taxacom [mailto:taxacom-bounces at mailman.nhm.ku.edu] On Behalf Of Richard Pyle
Sent: Thursday, January 28, 2016 7:58 PM
To: 'Laurent Raty' <l.raty at skynet.be>; taxacom at mailman.nhm.ku.edu
Subject: Re: [Taxacom] Important note Re: two names online published - one new species

I agree with everything Laurent says below, but I don't see that as the real problem.

I believe the following scenario is not as rare as some people would believe; and indeed may be increasingly common:

1) Journal issues a provisional electronic edition online, and clearly indicates it as such (no LSID)
2) A revised version, including LSID (and properly registered with archive, etc.) is posted online, and the correct date of publication indicated. Pagination is from 1-20.
3) An important error is discovered, and a revised version is posted online, replacing the previous one, and the website (but not the PDF) indicates that it was revised.  The PDF contains the original date, and Pagination is 1-20.
4) A paper edition is produced, which includes the corrected error, and indicates the correct date of publication for the paper edition. Pagination is 364-384.

Each of the above happens on a different date, in the chronological order indicated.

Most of us would probably agree that #1 is not published in the sense of the Code, based on the missing LSID.  Even if there was an LSID included, we could probably all agree that Art. 9.9 applies, and it's not published in the sense of the Code.

At the time #2 was obtainable (on the date indicated within the work itself), it was intended by the publisher as the "version of record".  There is no evidence in the work itself, or on the website, that it's not the final version. 

So, how do we interpret #3?  Is it the "real" version of record, retroactively making #2 unavailable under Art 9.9?  Is it a distinct published work, establishing a new objective synonym and homonym that we must track?  Assuming both #2 and #3 include the same ZooBank LSID, which version is the LSID "actually" associated with? Does it matter which version is deposited in an archive?  What if neither version is ever deposited in the intended archive?  What if both are?

Or, does it depend on the nature of the error that was corrected?  Examples could include:
- Correction of the word "teh" to "the" in the abstract
- Addition of an accent to a character in an author's name
- Revised or corrected map showing the distribution of the taxon
- Correct spelling of the genus name for a new species-group name
- Altered spelling of the new species-group name itself
- Addition of the location of the collection where the type specimen is to be deposited
- etc., etc.

Some of these have relevance to nomenclature, some do not.  Does that matter in our determination of which edition is the "version of record" that should be considered as part of the public and permanent scientific record, and thereby represent the date of availability for purposes of nomenclatural priority? Do we need an enumeration of all possible changes that do result in a changed "version of record"?

And what about the changed page numbers in the paper edition?  For those who don't like the "metadata" argument, are you suggesting that the paper edition represents a new published work (with objective synonyms and homonyms) simply because the paper edition is not an "exact copy" of the electronic edition?  Even if the page numbers were identical, how does one define "exact copy" in such a way that one physical object consisting of paper pages with ink on them is an "exact copy" of a binary object stored on a computer?

I'm sure we could argue about it enough to come to some sort of consensus on this specific example.  But there are a near-infinite number of possible examples out there, and the scope of possible examples will probably continue to expand in the future. Why?  Because despite what some have argued, electronic dissemination of scientific information is still very much in its infancy. The playing field is constantly evolving.  Electronic publication began as a digital representation of a paper work (e.g., a scanned image of the actual printed pages).  As time goes on, publishers are increasingly exploiting the power of electronic information and its dissemination (and rightly so). As we move closer to a world that resembles the vision of a Semantic Web, the parallels between the old paper-based publication world and modern electronic means of information exchange will evaporate to the point where they are essentially unrecognizable.

This "problem" isn't going away; it's going to get worse. Even God Herself would be challenged to come up with wording in a revised Code that accommodated all conceivable scenarios.

I completely understand why we still cling to the old notions of "publication", where the economics of producing multiple subtly different versions of a work, each printed as thousands of copies on paper, effectively ensured that problems of the sort described above were rare outliers. The new electronic information dissemination model completely changes the cost-effectiveness of producing incrementally altered versions of pseudo-static works.  We could "encourage" publishers to respect our traditional notions of publication, but how effective will that campaign be?  And do we really want to burden the field of taxonomy with additional handicaps? (Even if we could?)

We are tasked with finding a way to maintain nomenclatural stability in the context of this rapidly changing playing field. I find it helpful to step back and remember what, exactly, "stability" means, and how, fundamentally, we attempt to achieve it.
- A system of Latin words universally shared and used as labels for taxa
- A mechanism for unambiguously linking the names to the biological world through type specimens
- A mechanism for unambiguously establishing priority among potentially competing names (subjective synonyms; homonyms)

That's really the essence of nomenclatural stability.  We still need a complex series of rules to deal with legacy names until a complete and universal registry exists (i.e., the uber-LAN).  However, if we continue to try to force-fit the rapidly changing modes of electronic information exchange in science into a model that was fundamentally designed around ink-on-paper documents, these problems will continue to dominate our time and energy.

We can probably maintain the status quo for a few more years; but if we don't get serious about fundamentally adjusting (and future-proofing) our system of nomenclatural availability (and, by extension, stability), then the "problems" we fret about now will seem trivial compared to what's ahead.

Aloha,
Rich

Richard L. Pyle, PhD
Database Coordinator for Natural Sciences | Associate Zoologist in Ichthyology | Dive Safety Officer
Department of Natural Sciences, Bishop Museum, 1525 Bernice St., Honolulu, HI 96817
Ph: (808)848-4115, Fax: (808)847-8252
email: deepreef at bishopmuseum.org
http://hbs.bishopmuseum.org/staff/pylerichard.html

> -----Original Message-----
> From: Taxacom [mailto:taxacom-bounces at mailman.nhm.ku.edu] On Behalf Of 
> Laurent Raty
> Sent: Thursday, January 28, 2016 3:30 AM
> To: taxacom at mailman.nhm.ku.edu
> Subject: Re: [Taxacom] Important note Re: two names online published - 
> one new species
> 
> Producing an "exact copy" (bit-for-bit) of a pdf file is, on the 
> contrary, one of the easiest things to do. Just select the file in 
> your file manager and hit <Ctrl>-C, <Ctrl>-V: done. Of course, in a 
> vanishingly small proportion of the cases, you may get a "mutation", 
> and end up with a corrupt file. However, this is not a real problem, 
> as it is also extremely easy to check that a file is an "exact copy" 
> of another file, using things like hash values / checksums.
> 
> On the other hand, checking whether the non-metadata portion of the 
> content and layout that will be displayed when viewing a pdf file is 
> the same as that which will be displayed when viewing another pdf 
> file, that otherwise differs, is a nightmare. (Most likely plain 
> impossible.) If you adopt any "copy" concept that departs from the 
> "exact", bit-for-bit copy, you basically accept, knowingly, never to 
> be able to check for the integrity of a work in pdf format.
> 
> The problem (?) is that some publishers NEVER produce pdf files that 
> are "exact copies". If you download the same work twice from, say, 
> http://onlinelibrary.wiley.com/ , the two files that you get will 
> be "exact copies" of each other. But if you do the same from, e.g., 
> http://www.tandfonline.com , the files will differ: each downloaded 
> "copy" is in fact a *new* pdf file, generated on demand by the 
> website, with each page "tagged" in the margin with your IP and the 
> time of download. If "copy" means "exact copy", this method does not 
> produce "copies" of a single work at all, it produces a unique file 
> at each download, and nothing is published (Art. 8.1.3.2 not 
> satisfied).
> 
> Cheers, Laurent -
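
[Editorial illustration, not part of the original messages] Laurent's remark that an "exact copy" can be checked with hash values / checksums could be sketched roughly as below. This is only a minimal sketch under stated assumptions: the choice of SHA-256 and the function names sha256_of and is_exact_copy are hypothetical and do not come from the thread or from the Code.

```python
import hashlib


def sha256_of(path, chunk_size=65536):
    """Return the SHA-256 hex digest of the file at `path` (hypothetical helper)."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in fixed-size chunks so large PDF files need not fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_exact_copy(path_a, path_b):
    """True when both files hash to the same digest, i.e. are bit-for-bit
    identical (barring a hash collision, which is assumed negligible here)."""
    return sha256_of(path_a) == sha256_of(path_b)
```

Under this sketch, two downloads that differ only in a margin stamp carrying the IP and time of download would produce different digests and so would not count as "exact copies", which is the situation described in the quoted message.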

_______________________________________________
Taxacom Mailing List
Taxacom at mailman.nhm.ku.edu
http://mailman.nhm.ku.edu/cgi-bin/mailman/listinfo/taxacom
The Taxacom Archive back to 1992 may be searched at: http://taxacom.markmail.org

Channeling Intellectual Exuberance for 29 years in 2016.