[Taxacom] IPBES: a new challenge (not for cynics)

Fri Jan 14 05:22:31 CST 2011

To get back to the beginning of this thread:

I need textual content - introduction, methods, results, conclusions, 
references - and image content. If you asked me about the "data" 
I could extract from such a paper, I would answer "please, the whole 
text and all the images".

Reading this, I understand that I cannot take a specimen record straight
off the publication and assume that it belongs to a particular species. If
this is so a priori, then why do we publish?
I am aware that there are mistakes, similar to what is happening in GenBank.
But with access to more and more of the data and tools to detect strange
things, such as a sequence that must belong to a different taxon, we can
reduce the risk of wrong identification.
I am rather on the side that believes a priori that things are correct, and
I think striving for this ought to be part of the ethic of science.

Otherwise, your scenario makes taxonomy completely obsolete - what does it
say about our environment? Clouds of thoughts

Donat

-----Original Message-----
From: taxacom-bounces at mailman.nhm.ku.edu
[mailto:taxacom-bounces at mailman.nhm.ku.edu] On Behalf Of Francisco
Welter-Schultes
Sent: Wednesday, January 12, 2011 4:12 PM
To: taxacom at mailman.nhm.ku.edu
Subject: Re: [Taxacom] IPBES: a new challenge (not for cynics)

Donat,
I expected you would object to my statement. But I stand by it. This one:

"If you don't have the copyright you cannot scan and extract data in 
a way that we could really work with it"

This is from my experience as a taxonomist, from what I know I need in 
my field (terrestrial malacology), things that I would need to 
facilitate my work. What kind of "data" would I need to extract from 
a 1970 paper? What do I need for my taxonomic work from such a 
publication?
I need textual content - introduction, methods, results, conclusions, 
references - and image content. If you asked me about the "data" 
I could extract from such a paper, I would answer "please, the whole 
text and all the images". 

> The descriptions or treatments are not protected by
> copyright law. So you can extract and reuse them. 

Not only descriptions are needed. I also need to know how scientists 
in the 1970s came to their conclusions. This is different than 
working with papers of the 1860s, where I only need the original 
descriptions of species and usually not much more. Here I need full 
access to the complete texts of discussions. I also need to know on 
what basis a study was done. I need the full text of the 
introduction. Only then will I be able to judge the value 
of such a paper, its shortcomings, its strengths. And, very important 
in my field, I need images in high quality. Those papers are often 
equipped with high-quality photos, and it is important to see them in 
high quality.

Full textual content is copyrighted, and images are even more 
strictly copyrighted. Public libraries are not allowed to scan 
copyrighted publications, and they strictly don't do it. Our 
library will not scan a book published after 1899 unless you give 
them written permission from the copyright holder. This is why 
literature after 1920 is largely missing in BHL and related projects.

Libraries are traditionally allowed to hold copyrighted publications 
and allow library users to read them. This is a traditional right 
that they would probably not obtain under today's copyright laws 
and conventions had they not already held it for centuries. It is 
in this sense that they are not allowed to scan these works and so 
allow online library users to read a book online. This is not 
allowed, and this is exactly the point where some international 
pressure would be useful to change the situation.

I partly share your wish to encourage scientists to publish in 
open access, but since there are economic constraints behind this 
issue I do not think this is a promising approach to solving the 
problem. This looks like something that can only be solved by a shift 
in the legal conditions under which we are working.

And once again, access to papers published after 2000 is not the main 
problem for my work. I don't see big problems in the near future, 
in contrast to Cristian's concerns. If you ask me why my taxonomic 
work is so slow, I would definitely not answer that it is because I 
have difficulty accessing post-2000 works (and even less that it is 
because the quality of some post-2000 publications is low, since they 
were not peer reviewed and reflect untenable personal views). The 
biggest obstacle is currently the lack of electronic resources from 
the 1920-1999 period.

I continuously publish papers that are not open access, and I don't 
see a big problem. As the author I get a PDF, I can send this PDF to 
anyone who is interested, they can forward it, that's okay with 
me.

> But the future, and the one where we might have a place, is when we
> publish right from the beginning in a way that machines can read and
> reason over what we publish. This is when a connection between other
> fields and ours is made. Not when we have to read through a pdf and
> extract data by hand, one pdf after another. 

Either there is an illusion behind this statement, or I 
misunderstand you. In my field I cannot imagine any method by which a 
machine would be able to judge the value of a scientific publication 
and put it in relation to others (much less am I able to imagine a 
way to publish my ideas so that a machine could fully 
understand them and work with them automatically). It is 
indispensable that skilled and experienced human beings read papers 
written by human beings, understand them and extract information.

If some day a machine is invented that is able to 
translate German correctly into Spanish or English and vice versa, I 
might think this point over again.

Francisco

University of Goettingen, Germany
www.animalbase.org

_______________________________________________

Taxacom Mailing List
Taxacom at mailman.nhm.ku.edu
http://mailman.nhm.ku.edu/mailman/listinfo/taxacom

The Taxacom archive going back to 1992 may be searched with either of these
methods:

(1) http://taxacom.markmail.org

Or (2) a Google search specified as:
site:mailman.nhm.ku.edu/pipermail/taxacom  your search terms here
