[Taxacom] PDF as archive format

Dave Roberts workpackage6 at gmail.com
Tue Nov 16 08:02:30 CST 2010


Dear David,

I suspect that your problems come from PDFs generated from a scanned
image using Adobe's OCR tool (built into Acrobat).  This tool indeed
does not perform very well by modern standards.

PDFs generated from born-digital documents, on the other hand, contain
the text itself rather than an image of it, and so are perfectly
rendered.

Cheers,  Dave
--
On 16 Nov 2010, at 13:34, David Campbell wrote:

> PDF does not seem to be ideal in its relatively poor character
> recognition.  Try to electronically copy something from a PDF as text
> and you generate typos; similar problems affect searches.

-- 
Dr D.McL. Roberts,        Tel: +44 (0)20 7942 5086
European Distributed Institute of Taxonomy Project,
Coordinator WorkPackage 6 (Unifying Revisionary Taxonomy),
Dept. Zoology,
The Natural History Museum,
Cromwell Road,
London        SW7 5BD
Great Britain             Email: dmr at nomencurator dot org
Web page:  http://scratchpads.eu
Web page:  http://www.editwebrevisions.info/
--
"You can't just ask customers what they want and then try and give it  
to them.  By the time you get it built, they'll want something  
new." [Steve Jobs, quoted in The Guardian, Technology Section, 25 June  
09].
--