[Taxacom] Sorry, but you are out-of-line

Doug Yanega dyanega at ucr.edu
Mon Nov 15 18:51:46 CST 2010


Steve Gaimari wrote:

>Digital representations are only that - digital representations. I 
>don't worry about these, and I don't worry about their needing to 
>have file format upgrades, or changing with changing technologies, 
>because they are just that - digital representations. They are not 
>the stagnant original form, which paper copies are. Digital 
>representations are not the original - they can change, in whatever 
>way you might imagine, given sufficient time and numerous upgrades 
>and new systems of data management on the horizon over the next 200 
>years. Since taxonomy is a field that relies upon historical 
>documentation, bastardizing that historical information does have a 
>clear effect, especially since we have no way to predict what the 
>future will bring with regards to digital archiving. We just do not 
>know. You seem very sure. I am not. We know it would be secure for 
>now, but we don't know what that security will look like in the 
>future. So to put all of our faith on things turning out as you 
>envision, in my mind, is taking a chance with the future of 
>taxonomy. It is not a chance I am willing to take, since the last 
>250 years are enough of a problem that needs fixing now to get a 
>full grasp of it.

The people out there who really know this stuff - on which not only 
*their* entire livelihoods depend, but all of civilization as well 
(because our governments, our banks, our public communications and 
transportation, everything is at their mercy) - seem quite confident 
that we have the technology to maintain any digital archive 
unaltered, indefinitely, if we want it to remain that way. I'm 
willing to accept that. Even if PDF as a format is 
gone 250 years from now, anything that is placed in a dedicated 
digital archive (such as GenBank) now as a PDF will *still* be there 
in 250 years, in whatever format they are using in 250 years. I'm 
willing to accept that. It's like gambling on whether the sun will 
rise tomorrow. If I lose, it's because the planet has been 
obliterated, in which case the bet hardly matters.

>The meter is based upon calculations from the physical world - it 
>does not rely upon being able to go back and check 150 years later, 
>which taxonomy does rely upon.

I believe that everything in GenBank will still be there in 150 
years, including date-stamped archives. I'd probably say the same 
about Wikipedia. In fact, the edits I've made to Wikipedia will 
probably outlast the paper copies of all of my actual printed 
publications.

>You say it's trivial, I disagree. I don't object in any way to 
>digital versions - whether they DO in fact last into perpetuity is 
>irrelevant to the discussion in 2010 - we have no way of knowing 
>whether they WILL in fact last into perpetuity, and to throw all of 
>our taxonomic eggs into that digital basket IS taking a risk.

I still have this suspicion that when you're talking about a digital 
archive and using phrases like "digital basket", you're thinking of 
some database run on some university computer, where some taxonomist 
is personally running the whole thing on a shoestring, and trying to 
keep abreast of format changes by manually migrating files and such. 
Would you accuse GenBank of engaging in unacceptably risky behavior 
by throwing all of our molecular biological eggs into a digital 
basket? Our data files are *no different* from theirs.

>Suggesting digital is digital does trivialize the need for 
>taxonomists to refer to original descriptions and nomenclatural 
>acts. If you are suggesting a paradigm where this is NOT a need for 
>taxonomists, then I will have to respectfully but strongly disagree 
>on that point.

No, I'm suggesting that we can make and refer to PDFs of Linnaeus' 
work now, and those same files will be around - in one format or 
another - centuries from now. I'm not saying the paper will be gone, 
but the digital versions will also still be here (and accessible to 
more people than actual copies of Linnaeus) if we create and maintain 
an archive like GenBank for them. If we create some second-rate, 
shoestring budget affair, then we're asking for trouble. I think we 
should be aiming a little higher, though, and that's my starting 
premise. If you tell me we CANNOT ever have a resource like GenBank 
at our disposal, then I think we're totally up the proverbial creek. 
My "impassioned" pleas here are in the hope that we WILL get our act 
together, and we WILL do it right, instead of half-baked. If we can't 
do it right, then there really wouldn't be much benefit to doing it 
at all.

>Your references to the ease of data security are purely speculation 
>in my view, since digital data security is a relatively young field, 
>and I don't find it compelling to rely on the need for the perpetual 
>upgrading of outdated file formats to ensure the future of taxonomy.

When the upgrading is automatic, it's not like there's any gamble involved.

>I am willing to get onboard and say it would be highly desirable to 
>have a permanent centralized and distributed archive. This is a 
>pragmatic and useful thing to have available. However, I do NOT 
>think this is the matter that is most important. Archiving is data 
>storage, regardless of whether it is stagnant or dynamically 
>upgraded, in PDF format, in XML, whatever. The data are useful in 
>their own right, and it would be a positive thing for the world to 
>have access to these data. However, this archive should be just that 
>- a historical archive to serve the needs of the scientific 
>community - but NOT serve as THE sole container for the "original". 
>That is the inherent problem with e-only publication. There is this 
>assumption that it will somehow remain "original" into perpetuity, 
>or even to last into perpetuity at all. I just don't see the former 
>as being so, and I have my doubts on the latter.

Data are as secure and immutable as you want them to be; I'm willing 
to trust the techies on this issue, assuming we would have competent 
techies doing the work for us.

>Regarding what Donat had demonstrated - he demonstrated how data can 
>be parsed and machine-read in an equivalent way to GenBank - that's 
>all fine and desirable - but that is decidedly not the same thing as 
>e-only publication, where that's ALL there is to a nomenclatural act.

An e-only publication, at this point (and for the foreseeable future) 
is generally going to exist as a PDF, and you can have a PDF made 
from something that was originally printed on paper - the two are 
precisely equivalent at that point, and anything you can do with a 
PDF of Linnaeus can be done with a PDF of something that was e-only 
at its inception - including keeping it accessible in perpetuity, 
unaltered.

Sincerely,
-- 

Doug Yanega        Dept. of Entomology         Entomology Research Museum
Univ. of California, Riverside, CA 92521-0314        skype: dyanega
phone: (951) 827-4315 (standard disclaimer: opinions are mine, not UCR's)
              http://cache.ucr.edu/~heraty/yanega.html
   "There are some enterprises in which a careful disorderliness
         is the true method" - Herman Melville, Moby Dick, Chap. 82



