[Taxacom] GBIF data
Faunaplan at aol.com
Wed Nov 22 15:24:05 CST 2006
Dear all,
GBIF's data on the geographic occurrence of the world's living species are
still highly fragmentary and, in part, rather unreliable, especially for insects.
I wonder why we don't open yet another gate for the large community of data
holders who could support GBIF's mission: one that is more user-friendly,
simpler and much faster than current procedures.
Here are some musings, focussing on data from entomological collections:
Insects make up the lion's share of species diversity, but reliable occurrence
data are still mainly written on hundreds of millions of labels pinned to
specimens in museums and collections, practically inaccessible to most
potential users.
Instead of trying to digitize these data according to a highly complex XML
schema, why not just take the data as they are from these specimen labels and
put them into a simple (flat) data file, say, a pipe-delimited text file, as in
the following example:
ID-LABEL|LOC-LABEL|SAMPLESIZE|COLLECTION
"Amara (Brad.)\ majuscula Chd.\ det. Hieke 1982"|"Schwanheim a.M.\ Feld, 18.6.53\ coll. H.Hesse"|1|"ZSM"
"Amara (s.str.)\ pindica Apf.\ det. Hieke 1969"|"Collection\ Strasser\\ Caucasus\\ Eriwan"|1|"ZSM"
"Harpalus\ Winkleri Schbg.\ det. Dr. E.Schaub."|"Mokra pl.\\ Golesnica pl.\ Macedonia\\ Sammlung\ Apfelbeck"|1|"ZSM"
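To illustrate how little tooling such a flat file needs, here is a minimal Python sketch that reads the pipe-delimited records with the standard csv module. The function name read_labels is hypothetical, and the sketch assumes one record per line with double-quoted text fields. Backslashes inside the labels are kept verbatim, since their meaning is not defined here.

```python
import csv
import io

# Sample records copied from the example, one record per line.
SAMPLE = r'''ID-LABEL|LOC-LABEL|SAMPLESIZE|COLLECTION
"Amara (Brad.)\ majuscula Chd.\ det. Hieke 1982"|"Schwanheim a.M.\ Feld, 18.6.53\ coll. H.Hesse"|1|"ZSM"
"Amara (s.str.)\ pindica Apf.\ det. Hieke 1969"|"Collection\ Strasser\\ Caucasus\\ Eriwan"|1|"ZSM"
"Harpalus\ Winkleri Schbg.\ det. Dr. E.Schaub."|"Mokra pl.\\ Golesnica pl.\ Macedonia\\ Sammlung\ Apfelbeck"|1|"ZSM"
'''


def read_labels(text):
    """Parse pipe-delimited label records into a list of dicts (hypothetical helper)."""
    reader = csv.DictReader(io.StringIO(text), delimiter="|", quotechar='"')
    records = []
    for row in reader:
        # SAMPLESIZE is the only numeric field; everything else stays as text.
        row["SAMPLESIZE"] = int(row["SAMPLESIZE"])
        records.append(row)
    return records


records = read_labels(SAMPLE)
for rec in records:
    print(rec["COLLECTION"], rec["SAMPLESIZE"], rec["ID-LABEL"])
```

The same function would work on a file received by email by passing its contents as text.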
In a second step, we can add a few fields in order to standardize and unlock
the treasure, e.g.:
- COUNTRY (standard names or codes)
- LATLONG (containing a coarse latitude/longitude georeference)
- VALIDNAME (containing the standardized current taxonomic name)
example as above:
COUNTRY|LATLONG|VALIDNAME|ID-LABEL|LOC-LABEL|SAMPLESIZE|COLLECTION
Germany|NE50008|Amara majuscula|"Amara (Brad.)\ majuscula Chd.\ det. Hieke 1982"|"Schwanheim a.M.\ Feld, 18.6.53\ coll. H.Hesse"|1|"ZSM"
Armenia|NE40044|Amara proxima|"Amara (s.str.)\ pindica Apf.\ det. Hieke 1969"|"Collection\ Strasser\\ Caucasus\\ Eriwan"|1|"ZSM"
Macedonia|NE41021|Harpalus xanthopus|"Harpalus\ Winkleri Schbg.\ det. Dr. E.Schaub."|"Mokra pl.\\ Golesnica pl.\ Macedonia\\ Sammlung\ Apfelbeck"|1|"ZSM"
Additional steps, e.g. more detailed georeferencing, could follow later
whenever needed...
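As a sketch of how the LATLONG field could be read by a program, the following Python snippet decodes codes such as NE50008 into whole degrees. The layout is an assumption inferred from the three examples: two hemisphere letters (N or S, then E or W), a two-digit latitude and a three-digit longitude. The name decode_latlong is hypothetical.

```python
import re

# Assumed layout (inferred from the examples, not a published standard):
# hemisphere letters N/S and E/W, 2-digit latitude, 3-digit longitude.
LATLONG_RE = re.compile(r"^([NS])([EW])(\d{2})(\d{3})$")


def decode_latlong(code):
    """Return (lat, lon) in signed whole degrees for a code like 'NE50008'."""
    match = LATLONG_RE.match(code)
    if match is None:
        raise ValueError(f"unrecognized LATLONG code: {code!r}")
    ns, ew = match.group(1), match.group(2)
    lat, lon = int(match.group(3)), int(match.group(4))
    if lat > 90 or lon > 180:
        raise ValueError(f"LATLONG code out of range: {code!r}")
    # Southern latitudes and western longitudes are returned as negative numbers.
    return (lat if ns == "N" else -lat, lon if ew == "E" else -lon)


for code in ("NE50008", "NE40044", "NE41021"):
    print(code, decode_latlong(code))
```

A check like this could also flag malformed codes before a file is shared.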
Such simple files could be produced during routine curatorial or ID work, and
sharing them via email should be very easy. GBIF could use these handy
datasets to create dynamic distribution maps generated by a rather simple
PHP/MySQL application. On the new GBIF data portal prototype, I've already seen
a pixel world map that seems perfectly suited to displaying standardized
(one-degree latlong) occurrence data as accurate 2x2-pixel dots.
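The mapping from a one-degree cell to a 2x2-pixel dot is simple arithmetic. The Python sketch below assumes a 720x360 equirectangular world map (two pixels per degree, origin at the top-left corner) and assumes that the whole-degree values name the south-west corner of the cell; neither detail is confirmed for the GBIF prototype, and cell_to_pixels is a hypothetical name.

```python
PX_PER_DEGREE = 2                       # assumed map resolution
WIDTH = 360 * PX_PER_DEGREE             # 720 pixels across
HEIGHT = 180 * PX_PER_DEGREE            # 360 pixels down


def cell_to_pixels(lat, lon):
    """Return the pixels covering the one-degree cell whose SW corner is (lat, lon)."""
    if not (-90 <= lat < 90 and -180 <= lon < 180):
        raise ValueError(f"cell out of range: {lat}, {lon}")
    x0 = (lon + 180) * PX_PER_DEGREE
    # Pixel rows grow downwards, so the top of the cell (lat + 1) sets the first row.
    y0 = (90 - (lat + 1)) * PX_PER_DEGREE
    return [(x0 + dx, y0 + dy)
            for dy in range(PX_PER_DEGREE)
            for dx in range(PX_PER_DEGREE)]


# The cell at 50 degrees north, 8 degrees east:
print(cell_to_pixels(50, 8))
```

With these assumptions every cell maps to exactly four pixels inside the image, so no projection library is needed.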
In addition to such maps, there could be a display of corresponding
standardized background information (e.g., VALIDNAME, COUNTRY, LATLONG, COLLECTION) in
simple data lists. Both maps and lists could be downloaded by the user with
just a click.
If access to full data details is seen as problematic (see previous
discussions on "sensitive data"), the field LOC-LABEL could be kept under
restricted access while the other essential information remains freely
accessible. Tracing all information back to its sources would be rather easy
anyway.
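Withholding a field could be as simple as filtering each record before it is published. The Python sketch below shows one way this might look; the names RESTRICTED_FIELDS and public_view are hypothetical, and a real portal would of course need proper access control on the stored data as well.

```python
# Fields withheld from the public view (assumption: only LOC-LABEL is sensitive).
RESTRICTED_FIELDS = frozenset({"LOC-LABEL"})


def public_view(record, restricted=RESTRICTED_FIELDS):
    """Return a copy of the record without the restricted fields."""
    return {key: value for key, value in record.items() if key not in restricted}


record = {
    "COUNTRY": "Germany",
    "LATLONG": "NE50008",
    "VALIDNAME": "Amara majuscula",
    "ID-LABEL": r"Amara (Brad.)\ majuscula Chd.\ det. Hieke 1982",
    "LOC-LABEL": r"Schwanheim a.M.\ Feld, 18.6.53\ coll. H.Hesse",
    "SAMPLESIZE": 1,
    "COLLECTION": "ZSM",
}

print(public_view(record))
```

The COLLECTION field stays in the public copy, so a user can still ask the holding institution for the full label data.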
As a side effect, this could also become a central database for collection
inventories. Wouldn't that be more useful than individual inventories on each
museum's website?
"Free and open access to the world's biodiversity data through the
collaborative medium of the Web is an important tool for the sustainable stewardship of
Earth. Unlocking such data will lead to much better policy and
resource-management choices locally, regionally and globally" (cited from an article by Matt
BALL, in GeoWorld, Aug. 2005)
Best wishes,
Wolfgang
---------------------------------------
Wolfgang Lorenz
Faunistics and Environmental Planning
Hoermannstr. 4
D-82327 Tutzing, Germany
(P.S.: I'm currently running two test versions of PHP/MySQL map tools on my
websites; please contact me for details)