Catching lat lon in wrong country errors

P. Bryan Heidorn pheidorn at UIUC.EDU
Fri Jan 7 10:47:37 CST 2005


From the information science perspective, computing and
telecommunications should be integrated with human work practices,
augmenting the work in just those places where humans have the most
difficulty and where the technology can provide some assistance. This
means we should look beyond a model where the human does one step and
the computer does the next. We do not think of electronic spreadsheets
as the person doing one step and the computer another. Rather, the
person uses the computer in many highly integrated steps to accomplish
a larger task such as planning a budget.

In the georef task, people are prone to make typing errors but are
exceptionally good at visual processes such as reading maps. People are
slow at pulling up paper maps from the brick library. People are a
little faster at getting maps from digital libraries but we know that
can be tedious as well. With the correct technology in place the
computer can predict what might be needed and get it. That retrieval is
not the end of the story but just the beginning of a complex interplay
of operations, some human, some computer, leading to some outcome. Tens
of millions of dollars have gone into electronic spreadsheet development
to bring it to the level where no accountant would imagine doing their
job without one. Fortunately for us, we have much more powerful tools
for system development than the first electronic spreadsheet developers
had. We are just at the very beginning of the process of developing the
equivalent for georeferencing work in biology.

Perhaps we can come up with a list of the basic set of tools that will
be needed to develop a highly productive human-computer environment for
performing this task. Some of the areas that have rightly been pointed
out as problems in this thread are certainly places where we could start.

These include:
- tools to improve the quality of gazetteers
- tools for community verification and reuse of information (there is
  no need for ten people to each create the polygon for a country as it
  stood at one point in time)
- tools for consistency checking (is that lat/lon really in country X?)
- tools for visualization (e.g. people cannot tell which spatial
  coordinate is out of place in a long list of lat/lon numbers but can
  instantly see it on a map)
- tools to quickly exchange data across systems (this allows data reuse
  and also helps to fuel innovation by allowing many approaches to the
  same problem)
...
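
As a rough illustration of the consistency-checking item (is that
lat/lon really in country X?), here is a minimal Python sketch. It is
not taken from any existing tool. The names BORDERS, in_bounding_box,
in_polygon and coordinates_match_country are hypothetical, and the
L-shaped "Country X" outline is made-up toy data, not a real border. It
assumes planar geometry on (lon, lat) pairs in decimal degrees, ignores
the antimeridian and the poles, and does not treat points exactly on a
border specially.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]  # (lon, lat) in decimal degrees

# Hypothetical toy outline: an L-shaped "Country X". Not real border data.
BORDERS: Dict[str, List[Point]] = {
    "Country X": [(0, 0), (4, 0), (4, 2), (2, 2), (2, 4), (0, 4)],
}


def in_bounding_box(point: Point, polygon: List[Point]) -> bool:
    """Cheap first test: is the point inside the polygon's bounding box?"""
    lons = [p[0] for p in polygon]
    lats = [p[1] for p in polygon]
    return min(lons) <= point[0] <= max(lons) and min(lats) <= point[1] <= max(lats)


def in_polygon(point: Point, polygon: List[Point]) -> bool:
    """Ray-casting test: count how many edges a ray going east crosses."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the point's latitude can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def coordinates_match_country(lat: float, lon: float, country: str) -> bool:
    """Return True if (lat, lon) falls inside the stored outline for country."""
    polygon = BORDERS[country]
    point = (lon, lat)
    # The bounding box rejects gross errors quickly; the polygon test
    # catches points that are inside the box but outside the border.
    return in_bounding_box(point, polygon) and in_polygon(point, polygon)


if __name__ == "__main__":
    for lat, lon in [(1, 1), (3, 3), (10, 10)]:
        ok = coordinates_match_country(lat, lon, "Country X")
        print(lat, lon, "plausible" if ok else "CHECK THIS RECORD")
```

In this toy data the point at lat 3, lon 3 lies inside the bounding box
but outside the outline, so a box-only check would accept it while the
polygon check flags it for a human to look at. A real system would
presumably draw its outlines from a shared, community-verified source
and use a proper GIS library rather than hand-rolled geometry.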

To ask "is it an automated procedure?" is not the same as asking, "Does
a human do it or does a computer do it?" Thinking of the spreadsheet
example, you cannot say, "Is the person making the budget or is the
computer making the budget?" The human is using the computer to make the
budget.

-- Bryan Heidorn

--
--------------------------------------------------------------------
  P. Bryan Heidorn    Graduate School of Library and Information Science
  pheidorn at uiuc.edu   University of Illinois at Urbana-Champaign MC-493
  (V)217/ 244-7792    Rm. 221, 501 East Daniel St., Champaign, IL  61820-6212
  (F)217/ 244-3302    http://alexia.lis.uiuc.edu/~heidorn
  Online Calendar: http://tinyurl.com/6fd5q
  Visit the Biobrowser Web site at http://www.biobrowser.org


John Irish wrote:

> Doug Yanega wrote:
>
> > I would have thought the idea
> > was to use an automated procedure only *after* a human has done the
> > georeferencing, in an attempt to catch any errors the *human* may
> > have made.
>
> This is what I thought, too.
>
> Scenario: human inputs locality data into database. Assume this includes
> both coordinates (from whatever source) and some indication of the
> administrative unit the locality is situated in (whether country,
> province, district or whatever). While the human can read a map and
> type, he/she regularly misreads or mistypes. The program checks whether
> the given coordinates are actually within the borders of that country
> (or other unit) and alerts the human if it is not. The mistake can be
> rectified immediately - in fact, grossly wrong data never even enters
> the database.
>
> This is not automated georeferencing, rather, automated (partial)
> verification of the results of human georeferencing. Partial because it
> isn't foolproof - wrong coordinates in the right country will not be
> flagged as errors. You can improve machine accuracy by using polygons
> instead of bounding boxes, as I suggested, and by using smaller polygons
> (e.g. districts instead of provinces).
>
> > In essence, I'm curious as to whether my impression - that it sounds
> > like people are talking as if the idea is to have the programs doing
> > the primary georeferencing, and humans doing the error-checking,
> > rather than the other way around - is because I'm not understanding
> > what people are saying, or because they actually *believe* that's the
> > proper approach?
>
> Nope, not me.
>
> John
> --
> Dr. John Irish
> Tel./Fax +264-61-202-2038; Cell/SMS: +264-81-269-6602
> P.O. Box 21148, Windhoek
> Namibia Biosystematics Web Portal:
> http://www.biodiversity.org.na/index.php
>
> "The universe is full of magical things patiently waiting for our wits
> to grow sharper - E. Phillpots"




More information about the Taxacom mailing list