geographical coordinates, GeoCoding, Lat/Long...

Dr. Gerald Stinger Guala stinger at FAIRCHILDGARDEN.ORG
Sat Feb 17 15:17:11 CST 2001


Doug,

You make some good points but the following must be pointed out. If I
understand your system, you are essentially using significant figures as
your precision scale. In my experience this works well for the first use of
the data; however, the minute the data are transferred or interpreted
(say, in a Z39.50 interface to a conglomerate database or to a GIS), that
precision is lost in raw form. Your "error" field stays with it, but that brings up the
point of scale ranking. You are still using a ranked code, it just happens
to be deg., min., sec. Therefore, if you have 4 ranks (deg., min., sec.,
nothing) you actually have a rather coarse system. Ours is only five ranks,
but they are five ranks that were designed to be along easily understood
natural breaks in the data (collectors usually record at the political
boundary level). Also, I have found the "Wyoming" designation to be very
useful in several cases because I generally work at a global scale. I do
my mapping mostly in ArcView, where error is easiest to encode post hoc; it
even has a buffer feature that incorporates error directly. With dynamic
mapping on the web I just use the dot-size parameter (e.g. a dot of size
five covers a much bigger area than one of size one), but I admit that
deg., min., sec. would work better for this.
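
To make the transfer problem concrete, here is a minimal Python sketch. It is
not code from either database; the function name dms_to_decimal and the
field layout are hypothetical, and coordinates are assumed to be unsigned
magnitudes with the hemisphere letter stored separately. It shows that a
degree-only record and a record given to the second collapse to the same
number once the separate fields are flattened into decimal degrees, which is
why an explicit error field has to travel with the value.

```python
from typing import Optional


def dms_to_decimal(deg: int,
                   minutes: Optional[int] = None,
                   seconds: Optional[int] = None) -> float:
    """Flatten degree/minute/second fields into one decimal-degree number.

    Blank (None) fields are treated as zero, which is exactly the step
    where the precision information disappears.
    """
    return deg + (minutes or 0) / 60 + (seconds or 0) / 3600


coarse = dms_to_decimal(33)        # "33 N": known only to the nearest degree
exact = dms_to_decimal(33, 0, 0)   # "33 00 00 N": known to the second
print(coarse == exact)  # True: the two records are now indistinguishable
```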

Stinger

Gerald "Stinger" Guala, Ph.D.
Keeper of the Herbarium
Coordinator of the Program in Tropical Plant Systematics
Fairchild Tropical Garden Research Center
11935 Old Cutler Rd.
Coral Gables, FL 33156-4299

www.virtualherbarium.org

-----Original Message-----
From: Taxacom Discussion List [mailto:TAXACOM at USOBI.ORG] On Behalf Of Doug Yanega
Sent: Friday, February 16, 2001 9:25 PM
To: TAXACOM at USOBI.ORG
Subject: Re: geographical coordinates, GeoCoding, Lat/Long...


I was away for a week, but I see some threads have come up while I was gone
that I'd like to contribute to.

First off, the system we use here for error coding in locality data is one
that (obviously) I think is suitably informative without requiring a PhD to
encode, and one I haven't seen anyone propose (most folks seem to use exact
meters, or a "code"). Locations in our database are given to the nearest
degree, minute, or second, as appropriate and available, and errors are
stated in the same units, to distinguish genuine uncertainty from zeroes
(should some outside user need to convert the data in the future). For
example, if a label reads "Palm Canyon", and this geographic feature is
larger than a minute, then it is coded to degree only ("33 N, 116 W") and
the error field reads "degrees". That informs users that the entry does NOT
mean 33 00 00 N, 116 00 00 W. It means "anywhere within the box defined as
one degree of latitude and longitude between 33 and 34 N and 116 and 117
W". This allows for the wandering around of a collector relative to their
base camp, and other such concerns, and gives error at variable scales
without requiring precise metric breakdowns. Yes, it might be nicer to
calculate that Palm Canyon is exactly 6.3 miles long and 1.3 miles wide,
and encode the locality as four or more separate points defining a polygon
(even a circle is less than optimal, after all), but that starts getting
into details that might either not be easy for some hourly data-entry
worker, or not be worth their time (in addition to complicating the
database; after all, if you want to encode locality X as a 35-sided
polygon, you'll need lots of separate fields that will be blank in 99% of
the records in your database). The latter point is the practical reality of
geocoding: if you have a label that says something like "Wyoming", is it
*really* worth worrying about defining a polygon that includes every square
foot of the state? What I do, and tell data-entry people to do, is leave
the geocoding fields blank in a case that vague. If a label says only "Los
Angeles", it's nearly worthless to geocode it. Same reason why I see no
problem with leaving such a huge margin for error by defining "Palm Canyon"
only to the nearest degree - it's so vague to begin with that worrying
about the picky details of the location is probably meaningless. Of
course, should it prove possible to find out by some other means where
exactly that collection event occurred, one can always update the database.
It is in fact normal for our database records to have far more accurate
geocodings than are indicated by the label data alone. This also relates to
the following note by Chris Thompson:

>The Entomological Collections Network years ago stated that retrospective
>data capture from specimens should only be done as part of the revisionary
>work of a specialist, when the identification can be verified and
>specialist can provide the best point estimation (or rectangle) of the
>original collection.

This is essentially what the above system accomplishes: no record is
entered in such a way as to imply greater geocoding accuracy than is
practical to determine, and the system errs very much on the side of being
conservative. This brings me to Neils Snow's comment:

>A colleague in the Dept. of Geography here tells me decimal degrees are
>generally preferred by those engaged in remote sensing, and suspects
>decimal degrees will be used increasingly in the future, especially as
>regards GIS applications.

This is, I think, *extremely* dangerous, because if you see a decimal
coordinate of "33.00000 N, 116.00000 W" in a database, you won't know
whether it's PRECISELY at that point unless you also include an error
value, whereas having degrees, minutes, and seconds as separate numeric
fields allows you to enter the data so it is obvious *at first glance* that
there is significant error in the data. Maybe that's not an issue if no
human ever reads the data, but do all GIS applications know how to
incorporate error values, including those that are given as codes?
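
As a rough sketch of how an application could read the "error stated in the
same units" convention described earlier in this message, the Python below
turns one coordinate plus its error unit into the range it implies. The
function name implied_range, the unit strings, and the assumption that the
range extends upward from the stated value (as in the Palm Canyon example)
are illustrative assumptions, not the actual database schema. Coordinates
are treated as unsigned magnitudes with the N/S/E/W letter kept separately.

```python
# Size of one unit at each precision level, in decimal degrees.
UNIT_SIZE = {"degrees": 1.0, "minutes": 1.0 / 60, "seconds": 1.0 / 3600}


def implied_range(deg, minutes=0, seconds=0, error_unit="seconds"):
    """Return (low, high) in unsigned decimal degrees for one coordinate.

    The stated value is the low edge; the error unit gives the width.
    """
    low = deg + minutes / 60 + seconds / 3600
    return low, low + UNIT_SIZE[error_unit]


# "Palm Canyon" coded as 33 N, 116 W with the error field set to "degrees".
lat_range = implied_range(33, error_unit="degrees")
lon_range = implied_range(116, error_unit="degrees")
print(lat_range, lon_range)  # (33.0, 34.0) (116.0, 117.0)
```

A record with a blank error field could be rejected or flagged rather than
assumed exact; that choice is left open here.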

Peace,


Doug Yanega        Dept. of Entomology         Entomology Research Museum
Univ. of California - Riverside, Riverside, CA 92521
phone: (909) 787-4315 (standard disclaimer: opinions are mine, not UCR's)
           http://entmuseum9.ucr.edu/staff/yanega.html
  "There are some enterprises in which a careful disorderliness
        is the true method" - Herman Melville, Moby Dick, Chap. 82



