geographical coordinates, GeoCoding, Lat/Long...

Doug Yanega dyanega at POP.UCR.EDU
Fri Feb 16 18:25:03 CST 2001


I was away for a week, but I see some threads have come up while I was gone
that I'd like to contribute to.

First off, the system we use here for error coding in locality data is one
that (obviously) I think is suitably informative without requiring a PhD to
encode, and one I haven't seen anyone propose (most folks seem to use exact
meters, or a "code"). Locations in our database are given to the nearest
degree, minute, or second, as appropriate and available, and errors are
stated in the same units, to distinguish genuine uncertainty from zeroes
(should some outside user need to convert the data in the future). For
example, if a label reads "Palm Canyon", and this geographic feature is
larger than a minute, then it is coded to degree only ("33 N, 116 W") and
the error field reads "degrees". That informs users that the entry does NOT
mean 33 00 00 N, 116 00 00 W. It means "anywhere within the box defined as
one degree of latitude and longitude between 33 and 34 N and 116 and 117
W". This allows for the wandering around of a collector relative to their
base camp, and other such concerns, and gives error at variable scales
without requiring precise metric breakdowns. Yes, it might be nicer to
calculate that Palm Canyon is exactly 6.3 miles long and 1.3 miles wide,
and encode the locality as four or more separate points defining a polygon
(even a circle is less than optimal, after all), but that starts getting
into details that might either not be easy for some hourly data-entry
worker, or not be worth their time (in addition to complicating the
database; after all, if you want to encode locality X as a 35-sided
polygon, you'll need lots of separate fields that will be blank in 99% of
the records in your database). The latter point is the practical reality of
geocoding: if you have a label that says something like "Wyoming", is it
*really* worth worrying about defining a polygon that includes every square
foot of the state? What I do, and tell data-entry people to do, is leave
the geocoding fields blank in a case that vague. If a label says only "Los
Angeles", it's nearly worthless to geocode it. For the same reason, I see no
problem with leaving such a huge margin for error by defining "Palm Canyon"
only to the nearest degree - it's so vague to begin with that worrying
about the picky details of the location is probably meaningless. Of
course, should it prove possible to find out by some other means where
exactly that collection event occurred, one can always update the database.
It is in fact normal for our database records to have far more accurate
geocodings than are indicated by the label data alone. This also relates to
the following note by Chris Thompson:

>The Entomological Collections Network years ago stated that retrospective
>data capture from specimens should only be done as part of the revisionary
>work of a specialist, when the identification can be verified and specialist
>can provide the best point estimation (or rectangle) of the original
>collection.

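As a rough illustration of the degree/minute/second coding scheme described
earlier in this message, here is a minimal Python sketch. It rests on several
assumptions that are not in the original: a blank minutes or seconds field is
stored as None, the error unit is derived from which fields are filled in
(rather than typed into its own field), and each coordinate is an unsigned
magnitude with the hemisphere letter kept elsewhere. The names Coordinate,
precision, and interval are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Coordinate:
    """One axis of a locality (latitude or longitude) as separate fields."""
    degrees: int
    minutes: Optional[int] = None  # None means "not recorded", not zero
    seconds: Optional[int] = None  # None means "not recorded", not zero

    def precision(self) -> str:
        """Name the finest unit actually recorded (the 'error' unit)."""
        if self.minutes is None:
            return "degrees"
        if self.seconds is None:
            return "minutes"
        return "seconds"

    def interval(self) -> Tuple[float, float]:
        """Return (low, high) in decimal degrees for the box this entry covers."""
        low = self.degrees + (self.minutes or 0) / 60 + (self.seconds or 0) / 3600
        width = {"degrees": 1.0, "minutes": 1 / 60, "seconds": 1 / 3600}[self.precision()]
        return low, low + width


# "Palm Canyon" coded to degree only: 33 N, 116 W
lat, lon = Coordinate(33), Coordinate(116)
print(lat.precision(), lat.interval(), lon.interval())
```

Under these assumptions, Coordinate(33).interval() gives (33.0, 34.0), i.e.
the whole one-degree box rather than the single point 33 00 00.
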
This is essentially what the above system accomplishes: no record is
entered in such a way as to imply greater geocoding accuracy than is
practical to determine, and the system errs very much on the side of being
conservative. This brings me to Neils Snow's comment:

>A colleague in the Dept. of Geography here tells me decimal degrees are
>generally prefered by those engaged in remote sensing, and suspects
>decimal degrees will be used increasingly in the future, especially as
>regards GIS applications.

This is, I think, *extremely* dangerous, because if you see a decimal
coordinate of "33.00000 N, 116.00000 W" in a database, you won't know
whether it's PRECISELY at that point unless you also include an error
value, whereas having degrees, minutes, and seconds as separate numeric
fields allows you to enter the data so it is obvious *at first glance* that
there is significant error in the data. Maybe that's not an issue if no
human ever reads the data, but do all GIS applications know how to
incorporate error values, including those that are given as codes?
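
To make the ambiguity concrete, here is a small Python sketch of what happens
when separate degree/minute/second fields are collapsed into one decimal
number, and of one possible way to keep the precision alongside it. The
function names are hypothetical, and expressing the error as the width of the
box in decimal degrees is my assumption for illustration, not an established
convention.

```python
def to_decimal(degrees, minutes=None, seconds=None):
    """Collapse D/M/S fields to one decimal number; blanks silently become zeroes."""
    return degrees + (minutes or 0) / 60 + (seconds or 0) / 3600


def to_decimal_with_error(degrees, minutes=None, seconds=None):
    """Return (value, error_in_degrees) so the precision survives conversion."""
    if minutes is None:
        error = 1.0          # only the degree was recorded
    elif seconds is None:
        error = 1 / 60       # recorded to the minute
    else:
        error = 1 / 3600     # recorded to the second
    return to_decimal(degrees, minutes, seconds), error


vague = to_decimal(33)        # degree-only entry, e.g. "Palm Canyon"
exact = to_decimal(33, 0, 0)  # genuinely 33 00 00
print(f"{vague:.5f} {exact:.5f}")  # both print as 33.00000

print(to_decimal_with_error(33))        # error of a full degree
print(to_decimal_with_error(33, 0, 0))  # error of one second
```

The two plain decimal values are indistinguishable once printed; only the
version that carries an explicit error value keeps the two cases apart.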

Peace,


Doug Yanega        Dept. of Entomology         Entomology Research Museum
Univ. of California - Riverside, Riverside, CA 92521
phone: (909) 787-4315 (standard disclaimer: opinions are mine, not UCR's)
           http://entmuseum9.ucr.edu/staff/yanega.html
  "There are some enterprises in which a careful disorderliness
        is the true method" - Herman Melville, Moby Dick, Chap. 82




More information about the Taxacom mailing list