[Taxacom] concept mapping (was Re: progress on globalnames.org)
Doug Yanega
dyanega at ucr.edu
Fri May 15 11:25:39 CDT 2009
Just a quick comment on the limits of concept mapping:
David Remsen wrote:
>I can imagine for
>example a tool like EoLs new refactoring of the uBio LinkIT (called
>MarkIT) that could make it really easy to find a name in a document or
>on a label, call a taxonomic catalog to find all treatments of that
>name (having dealiased the orthographic and nomenclatural bits) and
>provide a select list of relevant concepts that embed the taxon
>identifier into the usage. <snip>
>btw, I suspect there are many cases where a common name is more
>stable than a scientific name. I think scientific names are more
>precise because there are bodies governing their creation and use and
>they are ultimately tied to something real (types).
This all apparently assumes that the scientific names in question are
at generic rank or below; above that, the names are not so precise,
nor particularly stable. However, at that level, common names are
also very unstable, if only because membership in a group to
which a common name is applied changes over time, sometimes
drastically (I can think of dozens of insect families that were
either MUCH larger or MUCH smaller only 30 years ago - meaning both
the scientific and common name concepts have changed). Since we're
evidently talking about software that will recognize ANY rank of
scientific or common name, this isn't just a side-issue, correct?
Bear in mind, too, that many common names *sometimes* refer to a
single species, and *other* times the same name refers to a much
larger group - sometimes a group of things which aren't necessarily
related to one another (e.g. "daddy long-legs" or "cicada killer" or,
as extreme examples, "the wasp" or "the bee"). Some common names are
truly hopeless to concept-map, such as "bugs". Trying to get an
algorithm to do concept-mapping is inevitably going to run afoul of
MANY cases like these; cases where only a human being could possibly
know how to disambiguate a printed name, if only because an algorithm
has no way of establishing *context*. Is an algorithm that gives an
unambiguous result for only 50% (probably less) of all printed
organism names really - ultimately - going to be sufficient for what
we hope to accomplish? Speaking practically, who will sit down, read
all those papers, do the detective work, and disambiguate the
remaining 50% - and who will pay them to do this? Or are we limiting
the scope by excluding from consideration all documents that are not
classified as primary scientific literature?
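To make the ambiguity concrete, here is a minimal Python sketch of a name lookup that can only report "ambiguous" when a printed name has more than one candidate concept. The catalog, the Concept class, and the map_name function are all hypothetical and built by hand for illustration; they do not represent uBio, EoL, or any real taxonomic service.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Concept:
    taxon: str  # scientific name of the candidate concept
    rank: str   # rank of that concept


# Hypothetical, hand-built catalog (an assumption for illustration only).
CATALOG = {
    "daddy long-legs": [
        Concept("Opiliones", "order"),    # harvestmen
        Concept("Pholcidae", "family"),   # cellar spiders
        Concept("Tipulidae", "family"),   # crane flies
    ],
    "cicada killer": [
        Concept("Sphecius speciosus", "species"),
        Concept("Sphecius", "genus"),
    ],
    "apis mellifera": [Concept("Apis mellifera", "species")],
}


def map_name(printed_name):
    """Return (status, candidates) for a printed name, using no context."""
    candidates = CATALOG.get(printed_name.strip().lower(), [])
    if len(candidates) == 1:
        return "resolved", candidates
    if candidates:
        # Without context, the lookup cannot choose among the candidates.
        return "ambiguous", candidates
    return "unknown", []


for name in ("Daddy long-legs", "cicada killer", "Apis mellifera", "bugs"):
    status, found = map_name(name)
    print(name, "->", status, [c.taxon for c in found])
```

Every name flagged "ambiguous" here is one that a person would still have to resolve by reading the surrounding text.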
Peace,
--
Doug Yanega Dept. of Entomology Entomology Research Museum
Univ. of California, Riverside, CA 92521-0314 skype: dyanega
phone: (951) 827-4315 (standard disclaimer: opinions are mine, not UCR's)
http://cache.ucr.edu/~heraty/yanega.html
"There are some enterprises in which a careful disorderliness
is the true method" - Herman Melville, Moby Dick, Chap. 82