[Taxacom] iSpecies
Stephen Thorpe
stephen_thorpe at yahoo.co.nz
Tue Jan 26 22:05:26 CST 2016
Alex,
I have just now looked at iDigBio for the first time! I am a little bit alarmed at some things! They highlight the need for annotation! For example, you have a whole bunch of records for Austrastiba zealandica, collected from here in N.Z. The first problem is that there is no such thing as Austrastiba zealandica! In fact, there is no such genus as Austrastiba! It isn't quite clear what the taxon is supposed to be (there are lots of "zealandicas"!). Moreover, if one looks at a record for this nonexistent taxon (e.g. https://www.idigbio.org/portal/records/6e698c51-ca82-44d4-8301-ec60dfa66749), although it gives locality and repository information, I don't see anything to say who determined it to be that (nonexistent) species, and when! This is crucial information. The record means little or nothing without it.
Stephen
--------------------------------------------
On Wed, 27/1/16, Thompson, Alexander M <godfoder at acis.ufl.edu> wrote:
Subject: Re: [Taxacom] iSpecies
To: "taxacom" <taxacom at mailman.nhm.ku.edu>, "Stephen Thorpe" <stephen_thorpe at yahoo.co.nz>
Received: Wednesday, 27 January, 2016, 4:31 PM
Whoops, I guess I failed at reading back through the message history properly. :(
iDigBio tries to do that for biodiversity data:
http://api.idigbio.org/v2/view/records/23a41d15-6ab4-4b04-87a2-c92ab39858a2 (current version)
http://api.idigbio.org/v2/view/records/23a41d15-6ab4-4b04-87a2-c92ab39858a2?version=0 (first version in iDigBio)
To do this, we had to focus on a hyperspecific prescription for our primary data type though. iDigBio's core type is not "specimen" or "occurrence" but the "record". We fully expected to have both multiple records for the same specimen present in the database, and records that were difficult to map back to a physical specimen (records aggregated from Symbiota portals that are missing a collection-assigned GUID). Both of these problems are getting less severe over time, as we work to get authoritative datasets online directly from the source with GUIDs assigned, and to remove overlapping datasets from other aggregators, but the record concept is still very important to us. Even deleted records don't vanish from iDigBio, although they do become impractically hard to track down if you don't know the GUID of the record: they just vanish from the indexes, and iDigBio staff with access to the database can still find them.
It is for this reason that we think we have a pretty good foundation on which to build an annotation system: we can capture full provenance and state information for what the annotation references with just the record UUID and version number (or etag to get even more specific).
- Alex
Aside: This also means we're capable of generating a complete data snapshot at an arbitrary point in our history should the need arise at some point in the future.
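The record-UUID-plus-version reference described in the message above might be sketched roughly as follows. This is only an illustration under stated assumptions: the URL pattern is copied from the two links quoted in the message, while the `RecordRef` and `Annotation` classes, their fields, and the sample annotation text are hypothetical and are not part of any iDigBio API. No network request is made.

```python
from dataclasses import dataclass
from typing import Optional
from urllib.parse import urlencode

# Base path copied from the URLs quoted in the thread; everything else is illustrative.
VIEW_BASE = "http://api.idigbio.org/v2/view/records"


@dataclass(frozen=True)
class RecordRef:
    """Hypothetical pointer to one state of a record (UUID plus optional version)."""
    uuid: str
    version: Optional[int] = None  # None means "whatever the current version is"

    def url(self) -> str:
        base = f"{VIEW_BASE}/{self.uuid}"
        # Compare against None explicitly: version 0 is a valid (first) version.
        if self.version is None:
            return base
        return f"{base}?{urlencode({'version': self.version})}"


@dataclass(frozen=True)
class Annotation:
    """Hypothetical annotation: who said what about which record state."""
    target: RecordRef
    author: str
    body: str


note = Annotation(
    target=RecordRef("23a41d15-6ab4-4b04-87a2-c92ab39858a2", version=0),
    author="example annotator",
    body="Determiner and determination date are missing from this record.",
)
print(note.target.url())
```

Pinning the annotation to a specific version, rather than to the record in general, is what would let a later reader retrieve the state the annotator was actually looking at.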
________________________________________
From: Stephen Thorpe <stephen_thorpe at yahoo.co.nz>
Sent: Tuesday, January 26, 2016 10:12 PM
To: taxacom; Thompson, Alexander M
Subject: Re: [Taxacom] iSpecies
>Lyubo's potential annotation that sparked this discussion, "this is an Australian species which is introduced in N.Z., and the only records currently displayed on the map are mine from around Auckland", is inherently temporal in nature<
Actually, that was my potential annotation, not Lyubo's! Forgive me while I just annotate that error! :)
Possibly the answer is to look at just about the only thing that Wikimedia gets right: Each Wikimedia article has a complete edit history from which you can display any previous version and know exactly who (username and/or IP) edited what when. So, something along these lines could be done for biodiversity related websites, whereby each version of a page would be a historical document (possibly even assigned a DOI?). That way, a no longer relevant annotation would just be a part of a no longer current version of the page, but could be retrieved if desired.
Stephen
--------------------------------------------
On Wed, 27/1/16, Thompson, Alexander M <godfoder at acis.ufl.edu> wrote:
Subject: Re: [Taxacom] iSpecies
To: "taxacom" <taxacom at mailman.nhm.ku.edu>, "Stephen Thorpe" <stephen_thorpe at yahoo.co.nz>
Received: Wednesday, 27 January, 2016, 3:56 PM
That's not quite what I meant, although not entirely wrong either.
I think that iSpecies shouldn't bother with annotations on _the thing that it currently presents_ primarily because the concept of what it's referencing is somewhat inherently nebulous. What is the point of collecting an annotation that is based on the ephemeral concept of a collection of query results at a given time? My assumption is that an annotation is something more permanent than a blog comment, but less permanent than a physical object or authoritative database record. A "good" annotation denotes some kind of desired transition on, or at very least clarification in the state of, a more permanent object. I'll admit that that's a limited prescription of the concept, and might betray the aggregator centered nature of my thoughts.
Annotations on more ephemeral things, like the current collection of GBIF query results for a specific taxonKey, lose their value very rapidly. Lyubo's potential annotation that sparked this discussion, "this is an Australian species which is introduced in N.Z., and the only records currently displayed on the map are mine from around Auckland", is inherently temporal in nature. This could of course be remedied. You could save the generated map as an image, or generate a download file from GBIF to get a DOI. Then you could create the annotation on that more permanent object, but doing so would require a much more complex infrastructure than iSpecies currently implements.
iSpecies could also handle annotations not for the species page itself but for the underlying resources by passing the content through to the other services that do have well defined concepts. If Google Scholar gained the capability to support annotations on texts, then iSpecies could be a good place to generate them, especially if the Google Scholar annotations were queryable in some form that could recursively enhance iSpecies. An example would be capturing some of the content in the other current taxacom megathread as annotations on the papers in question, allowing queries for whatever name eventually ends up as synonym for the species to show results for the paper with the accepted name, and vice versa.
- Alex
________________________________________
From: Stephen Thorpe <stephen_thorpe at yahoo.co.nz>
Sent: Tuesday, January 26, 2016 8:11 PM
To: taxacom; Thompson, Alexander M
Subject: Re: [Taxacom] iSpecies
Alex,
You seem to be saying that iSpecies should not bother with annotations because other aggregators would have difficulty dealing with the annotations? Is that correct? Seems a bit naff!
Stephen
--------------------------------------------
On Wed, 27/1/16, Thompson, Alexander M <godfoder at acis.ufl.edu> wrote:
Subject: Re: [Taxacom] iSpecies
To: "taxacom" <taxacom at mailman.nhm.ku.edu>
Received: Wednesday, 27 January, 2016, 1:11 PM
I agree with Tony here. On two different fronts.
One, as a holder of data, iDigBio is very interested in collecting these types of annotations for the data that we serve. So I would hope that any efforts to build an annotation system into something like iSpecies would include provisions for the ability to pass those annotations down to all information providers capable of receiving them (as iDigBio plans to do).
Two, in terms of the general annotation ecosystem, I think it's very important that annotation systems are very clear about A) what exactly the annotation applies to, and B) who is ultimately responsible for the annotation. For A, without a nearly universally recognized and applied identifier system (such as DOIs for papers) this virtually requires that the service archive copies of the annotated resource. Even then it is possible for annotations on archived content to effectively vanish if the system is not very careful to manage churn. For B, I firmly believe this requires the nomination of an organization to be the default holder of annotation records. This organization needs to be responsible for ensuring that "action" is taken on all annotations, even if the only viable action is to simply display them alongside the annotated content when it is queried. Other organizations could step up to the plate and take more comprehensive actions on annotations (such as a collection modifying its authoritative database), but in order to build trust in the system every accepted annotation needs a guaranteed minimum level of service.
I think that iSpecies as currently envisioned fails on these fronts, so annotations would not really be an appropriate feature. That said, I think it's a neat little tool that has a lot of potential, and could definitely evolve to the point (either by building features internally, or by continuing to incorporate finer grained data from additional sources) where annotation features would be both useful and appropriate.
- Alex
iDigBio Infrastructure Lead
________________________________________
From: Taxacom <taxacom-bounces at mailman.nhm.ku.edu> on behalf of Tony Rees <tonyrees49 at gmail.com>
Sent: Tuesday, January 26, 2016 6:29 PM
To: David Campbell; taxacom
Subject: Re: [Taxacom] iSpecies
I don't think "annotation" and an on-the-fly aggregator such as iSpecies belong together. As Rod is pointing out, iSpecies is basically a demonstrator of the fact that you can take an input species name, throw it at a select number of (hopefully comprehensive) taxonomic resources, and do "something" with whatever comes back, on-the-fly. The system is just a piece of code (large or small) to do that job, and does not hold any content of its own (although arguably a list of taxonomic names and their synonyms might be helpful, for query expansion, also homonyms, for disaggregation...). So any annotations would need to reside in the external data sources that the "aggregator" queries.
Of course a step away from this model is to start to hold actual content locally for query, then annotations *could* be attached as desired within the iSpecies environment, but my feeling is that this is outside the scope of Rod's present "demonstrator" system.
There is a conceptual progression from a basic system as shown to something with a lot more behind it (databases, locally hosted content, annotations, lists and systems of taxa, etc. etc.) which is ultimately how you end up with the likes of EOL, however with substantially (read lots!) more investment in both IT infrastructure, editorial input, and community engagement.
Personally (no disrespect to Rod) if I wanted EOL content I would go to EOL, GBIF content I would go to GBIF, literature I would go to Google Scholar and BHL at this time, but Rod is trying to show that "some" of the required human mouse clicks can be automated at least (though that is hardly a new message, thanks in part to his original iSpecies of 2006 or so). I think the value will be to see what else he can do with the system to produce a product, or some interesting value-adding, that is currently *not* available elsewhere with a few mouse clicks.
Regards - Tony
Tony Rees, New South Wales, Australia
https://about.me/TonyRees
On 27 January 2016 at 09:25, David Campbell <pleuronaia at gmail.com> wrote:
> Such annotation, though requiring appropriate moderation, has particular advantages of being easy and convenient for the people competent to make corrections. They're likely to be working on their own projects and not have time or funding to tackle a thorough review of a dataset. But they are likely to search for information and, in the process, spot misinformation. A quick way to flag problems, provide supplemental info, etc., and the researcher is likely to contribute. If it takes 10 minutes of searching through multiple links just to find a possible way to submit a correction, then there will be a lot fewer edits submitted. This applies for many contexts - BHL could improve indexing if readers could flag unrecognized scientific names and false positives, for example.
>
> On Tue, Jan 26, 2016 at 3:26 PM, Stephen Thorpe <stephen_thorpe at yahoo.co.nz> wrote:
>
> > Rod,
> >
> > The only way that this sort of thing is ever going to get beyond the stage of "garbage in, garbage out" is by allowing free and unrestricted (but moderated in case of spam) annotation....
_______________________________________________
Taxacom Mailing List
Taxacom at mailman.nhm.ku.edu
http://mailman.nhm.ku.edu/cgi-bin/mailman/listinfo/taxacom
The Taxacom Archive back to 1992 may be searched at: http://taxacom.markmail.org
Celebrating 29 years of Taxacom in 2016.