[Taxacom] Nomenclature of PDF documents

Doug Yanega dyanega at ucr.edu
Wed May 27 13:00:21 CDT 2009


Fabio Moretzsohn wrote:

>We hear the argument of the advantages of electronic publications and
>that thousands of identical copies are saved in computers across the
>world. But how do you reconcile files with potentially different names
>and dates? I typically rename PDFs I download or create as a short
>citation, with the name of the first author(s), date, and a short title
>that makes sense to me. However, even if I saved the document with the
>same name as downloaded from a journal, when I save it to my computer
>the date that shows is the date it was saved; the date it was originally
>created may be stored somewhere, but it is not what most people see.

This is precisely why a single, central repository is needed. We want
to be able to IGNORE all those thousands of individual copies (not
that their existence has no potential value), and their
idiosyncratic file names. The system works best with just ONE
authoritative, unalterable copy in a proper archive, linked
with all the necessary metadata, so that if someone wants to
find a PDF (or XML doc, or whatever form the archive takes) of paper
X, that archive is the top Google hit, or near the top.
Then we can avoid the whole issue of "document nomenclature".

This relates to what self-proclaimed "luddite" Stuart Fullerton wrote:

>ah yes - and in 15 years all this will be outdated and we will start all
>over again with the latest and newest and "forever" system that will not
>work on anything some of us will still have at that time.  meanwhile i
>find that a good reprint collection and an old fashioned 3 x 5 card
>system works just fine for my needs.

This is the difference between researchers keeping software and files 
ON THEIR OWN COMPUTERS and having a permanent central repository. You 
would never, ever, need your index cards again if you were no longer 
personally the custodian of the *data* on those cards. If someone 
else keeps the data safe and accessible, then it no longer matters 
what computer you own, what software you install, or how outdated 
anything you personally use happens to be: if you have a functional 
browser, then the data is there whenever you want it, be it 15 years 
from now or 150. Having a permanent central repository is the only 
practical way to AVOID things becoming outdated. Realistically, the 
only "tricky" aspect is that a central repository will have only a 
limited set of data formats it can accept or produce; if someone 
stores data on their personal computer in some form that is 
absolutely impossible to translate into a standard form, then it may 
never be possible to archive it. All the more reason for people to 
STOP keeping important work on their own computers.

Peace,
-- 

Doug Yanega        Dept. of Entomology         Entomology Research Museum
Univ. of California, Riverside, CA 92521-0314        skype: dyanega
phone: (951) 827-4315 (standard disclaimer: opinions are mine, not UCR's)
              http://cache.ucr.edu/~heraty/yanega.html
   "There are some enterprises in which a careful disorderliness
         is the true method" - Herman Melville, Moby Dick, Chap. 82



