[Taxacom] iSpecies

Stephen Thorpe stephen_thorpe at yahoo.co.nz
Tue Jan 26 22:05:26 CST 2016


Alex,
I have just now looked at iDigBio for the first time! I am a little bit alarmed at some things! They highlight the need for annotation! For example, you have a whole bunch of records for Austrastiba zealandica, collected from here in N.Z. The first problem is that there is no such thing as Austrastiba zealandica! In fact, there is no such genus as Austrastiba! It isn't quite clear what the taxon is supposed to be (there are lots of "zealandicas"!). Moreover, if one looks at a record for this nonexistent taxon (e.g. https://www.idigbio.org/portal/records/6e698c51-ca82-44d4-8301-ec60dfa66749), although it gives locality and repository information, I don't see anything to say who determined it to be that (nonexistent) species, and when! This is crucial information. The record means little or nothing without it.
Stephen

--------------------------------------------
On Wed, 27/1/16, Thompson, Alexander M <godfoder at acis.ufl.edu> wrote:

 Subject: Re: [Taxacom] iSpecies
 To: "taxacom" <taxacom at mailman.nhm.ku.edu>, "Stephen Thorpe" <stephen_thorpe at yahoo.co.nz>
 Received: Wednesday, 27 January, 2016, 4:31 PM
 
 Whoops, I guess I failed
 at reading back through the message history properly. :(
 
 iDigBio tries to do that for
 biodiversity data:
 
 http://api.idigbio.org/v2/view/records/23a41d15-6ab4-4b04-87a2-c92ab39858a2
 (current version)
 http://api.idigbio.org/v2/view/records/23a41d15-6ab4-4b04-87a2-c92ab39858a2?version=0
 (first version in iDigBio)
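 
 As a minimal sketch of how the two URLs above relate, the following Python helper builds a view URL from a record UUID and an optional version number. The base path and the `version` query parameter are taken from the two examples above; the function name `record_view_url` is hypothetical, and nothing here actually contacts the service.

```python
from typing import Optional
from urllib.parse import urlencode

# Base path copied from the two example URLs in this message.
VIEW_RECORDS_BASE = "http://api.idigbio.org/v2/view/records"


def record_view_url(uuid: str, version: Optional[int] = None) -> str:
    """Build the view URL for a record (hypothetical helper).

    `version=None` yields the URL of the current version; an integer
    appends `?version=N`, as in the second example URL.
    """
    url = f"{VIEW_RECORDS_BASE}/{uuid}"
    if version is not None:
        url += "?" + urlencode({"version": version})
    return url


uuid = "23a41d15-6ab4-4b04-87a2-c92ab39858a2"
print(record_view_url(uuid))             # current version
print(record_view_url(uuid, version=0))  # first version in iDigBio
```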
 
 To do this, we had to focus on a hyperspecific
 prescription for our primary data type though. iDigBio's
 core type is not "specimen" or
 "occurrence" but the "record". We fully
 expected to have both multiple records for the same specimen
 present in the database, and records that were difficult to
 map back to a physical specimen (for example, records aggregated from
 Symbiota portals that lack a collection-assigned GUID). Both of
 these problems are getting less severe over time, as we work
 to get authoritative datasets online directly from the
 source with GUIDs assigned, and to remove overlapping
 datasets from other aggregators, but the record concept is
 still very important to us. Even deleted records don't
 vanish from iDigBio, although they do become impractically
 hard to track down if you don't know the GUID of the
 record: they drop out of the indexes, but iDigBio staff with
 access to the database can still retrieve them.
 
 It is for this reason that we think we have a
 pretty good foundation on which to build an annotation
 system: we can capture full provenance and state information
 for what the annotation references with just the record UUID
 and version number (or etag, to get even more specific).
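 
 To illustrate the idea, here is a minimal Python sketch of an annotation pinned to one exact record state. The class and method names (`AnnotationTarget`, `Annotation`, `applies_to`) are hypothetical and are not part of any iDigBio API; only the notion of identifying a target by record UUID, version number, and optionally etag comes from the paragraph above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AnnotationTarget:
    """The exact record state an annotation refers to (hypothetical type)."""
    record_uuid: str
    version: int
    etag: Optional[str] = None  # optional, more specific than the version number

    def applies_to(self, record_uuid: str, version: int,
                   etag: Optional[str] = None) -> bool:
        """True if the given record state is the one this target was pinned to."""
        if (record_uuid, version) != (self.record_uuid, self.version):
            return False
        # Only compare etags when the annotation actually recorded one.
        return self.etag is None or self.etag == etag


@dataclass(frozen=True)
class Annotation:
    target: AnnotationTarget
    author: str
    text: str


# Illustrative values only: the UUID is the example record above,
# the author and text are made up.
note = Annotation(
    target=AnnotationTarget("23a41d15-6ab4-4b04-87a2-c92ab39858a2", version=0),
    author="example-user",
    text="Determiner and determination date are missing.",
)
print(note.target.applies_to("23a41d15-6ab4-4b04-87a2-c92ab39858a2", 0))  # True
print(note.target.applies_to("23a41d15-6ab4-4b04-87a2-c92ab39858a2", 1))  # False
```

 Because the target names a specific version, a later edit to the record does not silently change what the annotation was about; a consumer can tell that the annotation refers to an older state and decide how to present it.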
 
 - Alex
 
 Aside: This also means we're capable of
 generating a complete data snapshot at an arbitrary point in
 our history should the need arise at some point in the
 future. 
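 
 A rough Python sketch of how such a snapshot could be rebuilt from per-record version histories follows. The data layout (a dict of timestamped versions, with `None` marking a deletion) and the function name `snapshot_at` are assumptions made for illustration; they do not describe iDigBio's actual storage.

```python
from bisect import bisect_right
from datetime import datetime
from typing import Dict, List, Optional, Tuple

# One entry per stored version: (time recorded, data), where data is None
# if that version marks a deletion. This layout is an assumption.
Version = Tuple[datetime, Optional[dict]]


def snapshot_at(history: Dict[str, List[Version]],
                when: datetime) -> Dict[str, dict]:
    """Return the latest non-deleted version of every record as of `when`.

    Assumes each record's version list is sorted by timestamp.
    """
    snapshot = {}
    for uuid, versions in history.items():
        times = [t for t, _ in versions]
        i = bisect_right(times, when)  # versions recorded at or before `when`
        if i == 0:
            continue  # the record did not exist yet
        data = versions[i - 1][1]
        if data is not None:  # skip records deleted by that point
            snapshot[uuid] = data
    return snapshot


# Made-up histories: rec-a is edited once, rec-b is later deleted.
history = {
    "rec-a": [(datetime(2015, 1, 1), {"rev": 0}),
              (datetime(2016, 1, 1), {"rev": 1})],
    "rec-b": [(datetime(2015, 6, 1), {"rev": 0}),
              (datetime(2015, 9, 1), None)],
}
print(snapshot_at(history, datetime(2015, 7, 1)))
print(snapshot_at(history, datetime(2016, 2, 1)))
```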
 ________________________________________
 From: Stephen Thorpe <stephen_thorpe at yahoo.co.nz>
 Sent: Tuesday, January 26, 2016 10:12 PM
 To: taxacom; Thompson, Alexander M
 Subject: Re: [Taxacom] iSpecies
 
 >Lyubo's potential
 annotation that sparked this discussion, "this is an
 Australian species which is introduced in N.Z., and the only
 records currently displayed on the map are mine from around
 Auckland", is inherently temporal in nature<
 
 Actually, that was my
 potential annotation, not Lyubo's! Forgive me while I
 just annotate that error! :)
 
 Possibly the answer is to look at just about
 the only thing that Wikimedia gets right: Each Wikimedia
 article has a complete edit history from which you can
 display any previous version and know exactly who (username
 and/or IP) edited what when. So, something along these lines
 could be done for biodiversity-related websites, whereby
 each version of a page would be a historical document
 (possibly even assigned a DOI?). That way, a no longer
 relevant annotation would just be a part of a no longer
 current version of the page, but could be retrieved if
 desired.
 
 Stephen
 
 
 --------------------------------------------
 On Wed, 27/1/16, Thompson, Alexander M <godfoder at acis.ufl.edu> wrote:
 
  Subject: Re: [Taxacom] iSpecies
  To: "taxacom" <taxacom at mailman.nhm.ku.edu>, "Stephen Thorpe" <stephen_thorpe at yahoo.co.nz>
  Received: Wednesday, 27 January, 2016, 3:56 PM
 
  That's not quite what I meant, although not entirely wrong either.
 
  I think that iSpecies shouldn't bother with annotations on _the
  thing that it currently presents_ primarily because the concept of
  what it's referencing is somewhat inherently nebulous. What is the
  point of collecting an annotation that is based on the ephemeral
  concept of a collection of query results at a given time? My
  assumption is that an annotation is something more permanent than a
  blog comment, but less permanent than a physical object or
  authoritative database record. A "good" annotation denotes some kind
  of desired transition on, or at the very least clarification in the
  state of, a more permanent object. I'll admit that that's a limited
  prescription of the concept, and might betray the aggregator-centered
  nature of my thoughts.
 
 
  Annotations on more ephemeral things, like the current collection of
  GBIF query results for a specific taxonKey, lose their value very
  rapidly. Lyubo's potential annotation that sparked this discussion,
  "this is an Australian species which is introduced in N.Z., and the
  only records currently displayed on the map are mine from around
  Auckland", is inherently temporal in nature. This could of course be
  remedied. You could save the generated map as an image, or generate a
  download file from GBIF to get a DOI. Then you could create the
  annotation on that more permanent object, but doing so would require
  a much more complex infrastructure than iSpecies currently
  implements.
 
  iSpecies could also handle annotations not for the species page
  itself but for the underlying resources by passing the content
  through to the other services that do have well-defined concepts. If
  Google Scholar gained the capability to support annotations on texts,
  then iSpecies could be a good place to generate them, especially if
  the Google Scholar annotations were queryable in some form that could
  recursively enhance iSpecies. An example would be capturing some of
  the content in the other current taxacom megathread as annotations on
  the papers in question, allowing queries for whatever name eventually
  ends up as a synonym for the species to show results for the paper
  with the accepted name, and vice versa.
 
  - Alex
 
 
 ________________________________________
 
  From: Stephen Thorpe <stephen_thorpe at yahoo.co.nz>
  Sent: Tuesday, January 26, 2016 8:11 PM
  To: taxacom; Thompson, Alexander M
  Subject: Re: [Taxacom] iSpecies
 
  Alex,
  You seem to be saying that iSpecies should not bother with
  annotations because other aggregators would have difficulty dealing
  with the annotations? Is that correct? Seems a bit naff!
  Stephen
 
 
 --------------------------------------------
  On Wed, 27/1/16, Thompson, Alexander M <godfoder at acis.ufl.edu> wrote:
 
   Subject: Re: [Taxacom] iSpecies
   To: "taxacom" <taxacom at mailman.nhm.ku.edu>
   Received: Wednesday, 27 January, 2016, 1:11 PM
 
  
   I agree with Tony here. On two different fronts.
 
   One, as a holder of data, iDigBio is very interested in collecting
   these types of annotations for the data that we serve. So I would
   hope that any efforts to build an annotation system into something
   like iSpecies would include provisions for the ability to pass those
   annotations down to all information providers capable of receiving
   them (as iDigBio plans to do).
 
   Two, in terms of the general annotation ecosystem, I think it's very
   important that annotation systems are very clear about A) what
   exactly the annotation applies to, and B) who is ultimately
   responsible for the annotation. For A, without a nearly universally
   recognized and applied identifier system (such as DOIs for papers)
   this virtually requires that the service archive copies of the
   annotated resource. Even then it is possible for annotations on
   archived content to effectively vanish if the system is not very
   careful to manage churn. For B, I firmly believe this requires the
   nomination of an organization to be the default holder of annotation
   records. This organization needs to be responsible for ensuring that
   "action" is taken on all annotations, even if the only viable action
   is to simply display them alongside the annotated content when it is
   queried. Other organizations could step up to the plate and take
   more comprehensive actions on annotations (such as a collection
   modifying its authoritative database), but in order to build trust
   in the system every accepted annotation needs a guaranteed minimum
   level of service.
 
   I think that iSpecies as currently envisioned fails on these fronts,
   so annotations would not really be an appropriate feature. That
   said, I think it's a neat little tool that has a lot of potential,
   and could definitely evolve to the point (either by building
   features internally, or by continuing to incorporate finer grained
   data from additional sources) where annotation features would be
   both useful and appropriate.
 
 
   - Alex
   iDigBio Infrastructure Lead
 
 
 
 ________________________________________
 
   From: Taxacom <taxacom-bounces at mailman.nhm.ku.edu> on behalf of Tony Rees <tonyrees49 at gmail.com>
   Sent: Tuesday, January 26, 2016 6:29 PM
   To: David Campbell; taxacom
   Subject: Re: [Taxacom] iSpecies
 
   I don't think "annotation" and an on-the-fly aggregator such as
   iSpecies belong together. As Rod is pointing out, iSpecies is
   basically a demonstrator of the fact that you can take an input
   species name, throw it at a select number of (hopefully
   comprehensive) taxonomic resources, and do "something" with whatever
   comes back, on-the-fly. The system is just a piece of code (large or
   small) to do that job, and does not hold any content of its own
   (although arguably a list of taxonomic names and their synonyms
   might be helpful, for query expansion, also homonyms, for
   disaggregation...). So any annotations would need to reside in the
   external data sources that the "aggregator" queries.
 
   Of course a step away from this model is to start to hold actual
   content locally for query, then annotations *could* be attached as
   desired within the iSpecies environment, but my feeling is that this
   is outside the scope of Rod's present "demonstrator" system. There
   is a conceptual progression from a basic system as shown to
   something with a lot more behind it (databases, locally hosted
   content, annotations, lists and systems of taxa, etc. etc.) which is
   ultimately how you end up with the likes of EOL, however with
   substantially (read lots!) more investment in both IT
   infrastructure, editorial input, and community engagement.
 
   Personally (no disrespect to Rod) if I wanted EOL content I would go
   to EOL, GBIF content I would go to GBIF, literature I would go to
   Google Scholar and BHL at this time, but Rod is trying to show that
   "some" of the required human mouse clicks can be automated at least
   (though that is hardly a new message, thanks in part to his original
   iSpecies of 2006 or so). I think the value will be to see what else
   he can do with the system to produce a product, or some interesting
   value-adding, that is currently *not* available elsewhere with a few
   mouse clicks.
 
   Regards - Tony
 
   Tony Rees, New South Wales, Australia
   https://about.me/TonyRees
 
   On 27 January 2016 at 09:25, David Campbell <pleuronaia at gmail.com> wrote:
 
  
   > Such annotation, though requiring appropriate moderation, has particular
   > advantages of being easy and convenient for the people competent to make
   > corrections.  They're likely to be working on their own projects and not
   > have time or funding to tackle a thorough review of a dataset.  But they
   > are likely to search for information and, in the process, spot
   > misinformation.  A quick way to flag problems, provide supplemental info,
   > etc., and the researcher is likely to contribute.  If it takes 10 minutes
   > of searching through multiple links just to find a possible way to submit a
   > correction, then there will be a lot fewer edits submitted.  This applies
   > for many contexts - BHL could improve indexing if readers could flag
   > unrecognized scientific names and false positives, for example.
   >
   > On Tue, Jan 26, 2016 at 3:26 PM, Stephen Thorpe <
   > stephen_thorpe at yahoo.co.nz>
   > wrote:
   >
   > > Rod,
   > >
   > > The only way that this sort of thing is ever going to get beyond the stage
   > > of "garbage in, garbage out" is by allowing free and unrestricted (but
   > > moderated in case of spam) annotation....
   > >
   >
 
 
 _______________________________________________
   Taxacom Mailing List
   Taxacom at mailman.nhm.ku.edu
   http://mailman.nhm.ku.edu/cgi-bin/mailman/listinfo/taxacom
   The Taxacom Archive back to 1992 may be searched at: http://taxacom.markmail.org
 
   Celebrating 29 years of Taxacom in 2016.
 
 


