[Taxacom] Centrally supported electronic archive

Mary Barkworth Mary at biology.usu.edu
Wed May 27 06:56:16 CDT 2009


The OCRing is useful. I am not sure that "discovering the treatments"
is. The point was made that parts of a protologue may be widely
scattered (consisting of several "fragments"), which is why access to
the whole of a work is desirable. Unless by "discovering the treatments"
you mean such things as identifying starts of chapters, articles, or
sections. Am I right in assuming that the part of OCRing that is
time-consuming is verification/proof-reading?

-----Original Message-----
From: Donat Agosti [mailto:agosti at amnh.org] 
Sent: Wednesday, May 27, 2009 5:48 AM
To: Mary Barkworth; taxacom at mailman.nhm.ku.edu
Cc: Peter B. Phillipson; taxacom at mailman.nhm.ku.edu
Subject: Re: [Taxacom] Centrally supported electronic archive


Overall, the scanning itself is the least expensive part. OCR-ing and
extracting the treatments take much more time and expertise, even though
scanning properly is an art in itself...
I think there is nothing insignificant in this process. At the same
time, a huge number of colleagues are scanning their documents
independently rather than sharing this burden. If there were a way to
discover them, so that they could then be used for further processing,
we would have resolved one of the first bottlenecks in the
transformation process.
BHL, if I am right, is looking into this sort of archive - aren't you,
Chris?

Donat


> The problem, at least with articles and books that are not already
> scanned, is surely the cost of scanning, particularly if the work is
> old or rare. That is not insignificant.
>
>
> -----Original Message-----
> From: taxacom-bounces at mailman.nhm.ku.edu
> [mailto:taxacom-bounces at mailman.nhm.ku.edu] On Behalf Of Peter B.
> Phillipson
> Sent: Wednesday, May 27, 2009 3:33 AM
> To: taxacom at mailman.nhm.ku.edu
> Subject: Re: [Taxacom] Centrally supported electronic archive
>
> I do want to deal with the whole article.....
>
> We should all be encouraged to read the entire paper or chapter in
> which a protologue (or any nomenclatural change) is published; there
> is often crucial information about the whereabouts of specimens the
> author has cited, and other valuable information in an introduction,
> illustration or elsewhere in an article, that can aid interpretation
> of the original author's intentions, especially in older publications.
>
> I have often been frustrated in the past by requesting the page
> numbers cited for a particular protologue through inter-library loans,
> only to discover that essential parts of a protologue and its context
> were missing.
> With electronic media it doesn't usually cost more to send or download
> all the pages of an article than just the 1 page that contains the
> bare minimum, so why cut corners?
>
> We should also encourage authors (and databasers) to be as
> comprehensive as possible in citing earlier taxonomic references, so
> that it is easier for future generations to obtain all the relevant
> pages of a publication - citing both the entire publication and the
> specific pages that contain all of the elements of a protologue.
>
> Pete Phillipson
>
> -----Original Message-----
> From: taxacom-bounces at mailman.nhm.ku.edu
> [mailto:taxacom-bounces at mailman.nhm.ku.edu] On Behalf Of Paul van
> Rijckevorsel
> Sent: 27 May 2009 08:44
> To: taxacom at mailman.nhm.ku.edu
> Subject: Re: [Taxacom] Centrally supported electronic archive
>
>
> From: "Jim Croft" <jim.croft at gmail.com>
> Sent: Tuesday, May 26, 2009 11:59 PM
>
>> When
>> someone calls [f]or the protologue, we do not want to send them the
>> whole article.  With limited resources we can not afford to scan
>> an[d] store the whole article when all we want is one page of it...
>
> ***
> Yes, an important issue: if all you want is the protologue, you do not
> want to have to deal with a whole article. However, a complicating
> factor is that from a nomenclatural perspective it is not necessarily
> immediately apparent what the protologue is; in fact it needs to be
> 'circumscribed' from case to case. In the modern literature this will
> (almost always) be straightforward, but the introduction, etc. to a
> book or article may also contain material that belongs to the
> protologue. Say, the Acknowledgements may comment: "we are deeply
> grateful for the hospitality of Mr Przilowsky; in acknowledgement we
> have named our third species in honour of his eldest daughter".
> Theoretically, there may be a separation of hundreds of pages between
> one part of the protologue and another.
>
> ["Protologue ...: everything associated with a name at its valid
> publication, i.e. description or diagnosis, illustrations, references,
> synonymy, geographical data, citation of specimens, discussion, and
> comments."]
>
> It is not required that all the requirements of valid publication are
> met in a single publication; the final 'validating' publication only
> needs to refer to all the required parts, which need to have been
> effectively published earlier. For example the final publication may
> be a few lines only, but refer to a page-filling illustration
> elsewhere. So a protologue can be spread over more than one
> publication. All in all, 'circumscribing' a protologue is not a
> trivial matter. However, if the result goes into an accessible
> database, it need be done only once.
>
> Paul
>
>
> _______________________________________________
>
> Taxacom Mailing List
> Taxacom at mailman.nhm.ku.edu
> http://mailman.nhm.ku.edu/mailman/listinfo/taxacom
>
> The Taxacom archive going back to 1992 may be searched with either of
> these methods:
>
> (1) http://taxacom.markmail.org
>
> Or (2) a Google search specified as:
> site:mailman.nhm.ku.edu/pipermail/taxacom  your search terms here
>
>


-- 
Dr. Donat Agosti
Research Associate, American Museum of Natural History and Smithsonian
Institution

Email: agosti at amnh.org
Web: http://antbase.org
CV:
http://research.amnh.org/entomology/social_insects/agosticv_2003.html

Swiss Residence
Elahieh
Ave. Khazer no. 74
19649 Teheran
Iran

+98-21-2200 8765 (office)
+98-21-2260 6160 (home)
+98-919-489 2744 (mobile)
+1-202-558 0330 (skype-in US)
+41-44-5862911 (skype-in Switzerland)





