[Taxacom] iSpecies
Thompson, Alexander M
godfoder at acis.ufl.edu
Tue Jan 26 21:31:05 CST 2016
Whoops, I guess I failed at reading back through the message history properly. :(
iDigBio tries to do that for biodiversity data:
http://api.idigbio.org/v2/view/records/23a41d15-6ab4-4b04-87a2-c92ab39858a2 (current version)
http://api.idigbio.org/v2/view/records/23a41d15-6ab4-4b04-87a2-c92ab39858a2?version=0 (first version in iDigBio)
To do this, though, we had to focus on a hyperspecific prescription for our primary data type. iDigBio's core type is not "specimen" or "occurrence" but the "record". We fully expected to have both multiple records for the same specimen present in the database, and records that were difficult to map back to a physical specimen (records aggregated from Symbiota portals that lack a collection-assigned GUID). Both of these problems are getting less severe over time, as we work to get authoritative datasets online directly from the source with GUIDs assigned, and to remove overlapping datasets from other aggregators, but the record concept is still very important to us. Even deleted records don't vanish from iDigBio; they just vanish from the indexes. They do become impractically hard to track down if you don't know the GUID of the record, although iDigBio staff with access to the database can do it.
It is for this reason that we think we have a pretty good foundation on which to build an annotation system: we can capture full provenance and state information for what the annotation references with just the record UUID and version number (or etag to get even more specific).
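As a rough illustration of pinning an annotation to a record UUID plus version, here is a minimal Python sketch. The `AnnotationTarget` class and its field names are hypothetical and are not part of any iDigBio API; only the URL shape is taken from the two example links above, and nothing here contacts the service.

```python
from dataclasses import dataclass
from typing import Optional

# Base path copied from the two record URLs quoted earlier in this message.
RECORD_VIEW_BASE = "http://api.idigbio.org/v2/view/records"


@dataclass(frozen=True)
class AnnotationTarget:
    """Hypothetical pointer from an annotation to one state of a record."""

    uuid: str
    version: Optional[int] = None  # None means "whatever is current"
    etag: Optional[str] = None     # optional; would pin the exact content

    def url(self) -> str:
        """Build a view URL in the same shape as the examples above."""
        url = f"{RECORD_VIEW_BASE}/{self.uuid}"
        if self.version is not None:
            url += f"?version={self.version}"
        return url


first = AnnotationTarget("23a41d15-6ab4-4b04-87a2-c92ab39858a2", version=0)
current = AnnotationTarget("23a41d15-6ab4-4b04-87a2-c92ab39858a2")
print(first.url())
print(current.url())
```

An annotation stored with `version=0` would keep pointing at the first version of the record even after the record is updated, which is the property that makes the reference stable.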
- Alex
Aside: This also means we're capable of generating a complete data snapshot at an arbitrary point in our history should the need arise at some point in the future.
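One way to picture such a snapshot, as a sketch only: if every record keeps a dated list of versions, the state at time T is the latest version of each record dated at or before T, skipping records that were deleted by then. The history layout, the `snapshot` function, and the record names below are assumptions made for illustration and do not describe iDigBio's actual storage.

```python
from datetime import datetime
from typing import Dict, List, Tuple

# Hypothetical history layout: uuid -> [(modified_at, version, deleted), ...],
# sorted oldest first. An assumption for illustration, not iDigBio's schema.
History = Dict[str, List[Tuple[datetime, int, bool]]]


def snapshot(history: History, at: datetime) -> Dict[str, int]:
    """Map each record live at time `at` to the version that was current then."""
    live: Dict[str, int] = {}
    for uuid, versions in history.items():
        current = None
        for modified_at, version, deleted in versions:
            if modified_at > at:
                break  # later versions did not exist yet
            current = (version, deleted)
        if current is not None and not current[1]:
            live[uuid] = current[0]
    return live


history: History = {
    "record-a": [(datetime(2015, 1, 1), 0, False), (datetime(2015, 6, 1), 1, False)],
    "record-b": [(datetime(2015, 3, 1), 0, False), (datetime(2015, 9, 1), 1, True)],
}
print(snapshot(history, datetime(2015, 4, 1)))
print(snapshot(history, datetime(2016, 1, 1)))
```

In this sketch a deleted record drops out of later snapshots but remains in the history, so earlier snapshots can still be rebuilt.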
________________________________________
From: Stephen Thorpe <stephen_thorpe at yahoo.co.nz>
Sent: Tuesday, January 26, 2016 10:12 PM
To: taxacom; Thompson, Alexander M
Subject: Re: [Taxacom] iSpecies
>Lyubo's potential annotation that sparked this discussion, "this is an Australian species which is introduced in N.Z., and the only records currently displayed on the map are mine from around Auckland", is inherently temporal in nature<
Actually, that was my potential annotation, not Lyubo's! Forgive me while I just annotate that error! :)
Possibly the answer is to look at just about the only thing that Wikimedia gets right: each Wikimedia article has a complete edit history from which you can display any previous version and know exactly who (username and/or IP) edited what, and when. Something along these lines could be done for biodiversity-related websites, whereby each version of a page would be a historical document (possibly even assigned a DOI?). That way, an annotation that is no longer relevant would simply be part of a no-longer-current version of the page, but could be retrieved if desired.
Stephen
--------------------------------------------
On Wed, 27/1/16, Thompson, Alexander M <godfoder at acis.ufl.edu> wrote:
Subject: Re: [Taxacom] iSpecies
To: "taxacom" <taxacom at mailman.nhm.ku.edu>, "Stephen Thorpe" <stephen_thorpe at yahoo.co.nz>
Received: Wednesday, 27 January, 2016, 3:56 PM
That's not quite what I meant, although not entirely wrong either.
I think that iSpecies shouldn't bother with annotations on _the thing that it currently presents_ primarily because the concept of what it's referencing is somewhat inherently nebulous. What is the point of collecting an annotation that is based on the ephemeral concept of a collection of query results at a given time? My assumption is that an annotation is something more permanent than a blog comment, but less permanent than a physical object or authoritative database record. A "good" annotation denotes some kind of desired transition on, or at the very least clarification in the state of, a more permanent object. I'll admit that that's a limited prescription of the concept, and might betray the aggregator-centered nature of my thoughts.
Annotations on more ephemeral things, like the current collection of GBIF query results for a specific taxonKey, lose their value very rapidly. Lyubo's potential annotation that sparked this discussion, "this is an Australian species which is introduced in N.Z., and the only records currently displayed on the map are mine from around Auckland", is inherently temporal in nature. This could of course be remedied. You could save the generated map as an image, or generate a download file from GBIF to get a DOI. Then you could create the annotation on that more permanent object, but doing so would require a much more complex infrastructure than iSpecies currently implements.
iSpecies could also handle annotations not for the species page itself but for the underlying resources, by passing the content through to the other services that do have well-defined concepts. If Google Scholar gained the capability to support annotations on texts, then iSpecies could be a good place to generate them, especially if the Google Scholar annotations were queryable in some form that could recursively enhance iSpecies. An example would be capturing some of the content in the other current taxacom megathread as annotations on the papers in question, allowing queries for whatever name eventually ends up as a synonym for the species to show results for the paper with the accepted name, and vice versa.
- Alex
________________________________________
From: Stephen Thorpe <stephen_thorpe at yahoo.co.nz>
Sent: Tuesday, January 26, 2016 8:11 PM
To: taxacom; Thompson, Alexander M
Subject: Re: [Taxacom] iSpecies
Alex,
You seem to be saying that iSpecies should not bother with annotations because other aggregators would have difficulty dealing with the annotations? Is that correct? Seems a bit naff!
Stephen
--------------------------------------------
On Wed, 27/1/16, Thompson, Alexander M <godfoder at acis.ufl.edu> wrote:
Subject: Re: [Taxacom] iSpecies
To: "taxacom" <taxacom at mailman.nhm.ku.edu>
Received: Wednesday, 27 January, 2016, 1:11 PM
I agree with Tony here, on two different fronts.
One, as a holder of data, iDigBio is very interested in collecting these types of annotations for the data that we serve. So I would hope that any efforts to build an annotation system into something like iSpecies would include provisions for the ability to pass those annotations down to all information providers capable of receiving them (as iDigBio plans to do).
Two, in terms of the general annotation ecosystem, I think it's very important that annotation systems are very clear about A) what exactly the annotation applies to, and B) who is ultimately responsible for the annotation. For A, without a nearly universally recognized and applied identifier system (such as DOIs for papers) this virtually requires that the service archive copies of the annotated resource. Even then it is possible for annotations on archived content to effectively vanish if the system is not very careful to manage churn. For B, I firmly believe this requires the nomination of an organization to be the default holder of annotation records. This organization needs to be responsible for ensuring that "action" is taken on all annotations, even if the only viable action is to simply display them alongside the annotated content when it is queried. Other organizations could step up to the plate and take more comprehensive actions on annotations (such as a collection modifying its authoritative database), but in order to build trust in the system every accepted annotation needs a guaranteed minimum level of service.
I think that iSpecies as currently envisioned fails on these fronts, so annotations would not really be an appropriate feature. That said, I think it's a neat little tool that has a lot of potential, and could definitely evolve to the point (either by building features internally, or by continuing to incorporate finer grained data from additional sources) where annotation features would be both useful and appropriate.
- Alex
iDigBio Infrastructure Lead
________________________________________
From: Taxacom <taxacom-bounces at mailman.nhm.ku.edu> on behalf of Tony Rees <tonyrees49 at gmail.com>
Sent: Tuesday, January 26, 2016 6:29 PM
To: David Campbell; taxacom
Subject: Re: [Taxacom] iSpecies
I don't think "annotation" and an on-the-fly aggregator such as iSpecies belong together. As Rod is pointing out, iSpecies is basically a demonstrator of the fact that you can take an input species name, throw it at a select number of (hopefully comprehensive) taxonomic resources, and do "something" with whatever comes back, on-the-fly. The system is just a piece of code (large or small) to do that job, and does not hold any content of its own (although arguably a list of taxonomic names and their synonyms might be helpful, for query expansion, also homonyms, for disaggregation...). So any annotations would need to reside in the external data sources that the "aggregator" queries.
Of course a step away from this model is to start to hold actual content locally for query, then annotations *could* be attached as desired within the iSpecies environment, but my feeling is that this is outside the scope of Rod's present "demonstrator" system. There is a conceptual progression from a basic system as shown to something with a lot more behind it (databases, locally hosted content, annotations, lists and systems of taxa, etc. etc.) which is ultimately how you end up with the likes of EOL, however with substantially (read lots!) more investment in both IT infrastructure, editorial input, and community engagement.
Personally (no disrespect to Rod) if I wanted EOL content I would go to EOL, GBIF content I would go to GBIF, literature I would go to Google Scholar and BHL at this time, but Rod is trying to show that "some" of the required human mouse clicks can be automated at least (though that is hardly a new message, thanks in part to his original iSpecies of 2006 or so). I think the value will be to see what else he can do with the system to produce a product, or some interesting value-adding, that is currently *not* available elsewhere with a few mouse clicks.
Regards - Tony
Tony Rees, New South Wales, Australia
https://about.me/TonyRees
On 27 January 2016 at 09:25, David Campbell <pleuronaia at gmail.com> wrote:
> Such annotation, though requiring appropriate moderation, has particular advantages of being easy and convenient for the people competent to make corrections. They're likely to be working on their own projects and not have time or funding to tackle a thorough review of a dataset. But they are likely to search for information and, in the process, spot misinformation. A quick way to flag problems, provide supplemental info, etc., and the researcher is likely to contribute. If it takes 10 minutes of searching through multiple links just to find a possible way to submit a correction, then there will be a lot fewer edits submitted. This applies for many contexts - BHL could improve indexing if readers could flag unrecognized scientific names and false positives, for example.
>
> On Tue, Jan 26, 2016 at 3:26 PM, Stephen Thorpe <stephen_thorpe at yahoo.co.nz> wrote:
>
> > Rod,
> >
> > The only way that this sort of thing is ever going to get beyond the stage of "garbage in, garbage out" is by allowing free and unrestricted (but moderated in case of spam) annotation....
_______________________________________________
Taxacom Mailing List
Taxacom at mailman.nhm.ku.edu
http://mailman.nhm.ku.edu/cgi-bin/mailman/listinfo/taxacom
The Taxacom Archive back to 1992 may be searched at: http://taxacom.markmail.org
Celebrating 29 years of Taxacom in 2016.