[Taxacom] Why stability? - Revisited

Fri May 1 13:14:58 CDT 2015

Hi Alan,

The comment about "munging" (not even sure if that's a word; I'm a Kiwi, by the way) was specific to Avibase, which takes multiple checklists (each may have several versions, so there is a lot of self-similarity) and synthesises them.

I'm not denying that this is valuable, but it frustrates me that there is minimal connection to the underlying literature. What I see missing from many checklists, and aggregators as well, is the ability to drill down to the underlying science.

Rod

On Fri, May 1, 2015 at 10:55 AM -0700, "Weakley, Alan" <weakley at bio.unc.edu> wrote:

While I will have a more detailed and lengthy response later (when I have time), here's a quicky:

One has to love the brilliant ;-) pejorative Britishism "it's just munging together checklists".  ;-)

All taxonomic work should be based on the most thorough, careful, and expert science possible (monographs and their like).  Most taxonomic work is then translated to a broader set of scientific users via (in the vascular plant world) Floras and other, more practical "field guides" -- for at least the more conspicuous organism groups; admittedly it helps if you are a vertebrate animal, a charismatic invertebrate (like Lepidoptera, Odonata, Hymenoptera), or a vascular plant.  The set of users of a monograph is in the 10s (maybe the 100s for EVEN a bird or mammal monograph).  The set of users of a Flora or regional Field Guide to an animal group is in the 10,000s.  The set of users of a website is in the 1,000,000s.  It is only by dealing effectively with ambiguities between taxonomy and nomenclature that we go from 10s to 1,000,000s with accuracy and real meaning.  Don't you want the best information used (for ongoing scientific work, for conservation, for ecological studies, for citizen science, for ____)?

So, let me offer a provocative translation here:  "it's just munging together checklists" --> "it's just making the best, current, taxonomically accurate information accessible to the broad set of users who need it".

-----Original Message-----
From: Taxacom [mailto:taxacom-bounces at mailman.nhm.ku.edu] On Behalf Of Roderic Page
Sent: Friday, May 01, 2015 1:10 PM
To: TAXACOM
Cc: Nico Franz
Subject: Re: [Taxacom] Why stability? - Revisited

Hi Nico,

Just to play devil's advocate, as much as Avibase is an impressive achievement (I’m playing with some data from it right now), at the end of the day it’s basically munging together checklists. There’s no evidence base that we can access; we are essentially combining opinions on what species or subspecies go where. Some of these checklists are literally just lists of names, representing somebody’s - no doubt considered - opinion, whereas I’d really like to see why someone thinks two taxa are synonyms, or one species should be split into two, etc. What is the, you know, actual evidence?

I believe that, if an individual produces a monograph that has well defined reference boundaries - a domain of reference, so to speak (this perceived taxon, at this time, in that region, given this nomenclatural and taxonomic legacy, these sets of specimens, traits, inferred trees, etc.) - and that monograph gets aggregated into a larger biodiversity information environment, then in that environment the identity of the monographic content should remain "relevantly recognizable". The aggregator environment does in effect expand the monograph's original domain of reference in ways that the monograph's author cannot readily or reliably predict.

…

This will sound a bit dramatic, but many aggregator systems are currently structurally designed in a way that the graduate student, postdoc, or more senior scientist producing a monograph is inadvertently disenfranchised when their taxonomic language contribution migrates from the traditional to the integrative publication environment.

I find the notion that monographs are monolithic entities with boundaries to be respected to be a little last century ;) I would like traceability of evidence, but this doesn’t require a monograph as such. We could have single, citable assertions (say, equivalent to a single paper that shows what was thought to be a new species was actually simply the male of a known species), or we could have a set of assertions, each individually identifiable but all clustered as coming from the same monograph. In other words, nanopublications, which may be aggregated into larger sets if desired. I suspect this is the way a lot of data curation subjects, such as taxonomy, are going to be heading.
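To make the idea of individually identifiable assertions concrete, here is a minimal sketch in Python. It is only an illustration: the record layout, the field names, the relation labels, and the names and sources in the sample data are all hypothetical placeholders, not an existing nanopublication schema.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Assertion:
    """One citable taxonomic claim (all field names are hypothetical)."""
    assertion_id: str  # identifier for this single claim
    subject: str       # name or taxon the claim is about
    relation: str      # e.g. "synonym_of" or "split_from"
    obj: str           # the other name or taxon in the claim
    source: str        # publication the claim comes from


def group_by_source(assertions):
    """Cluster assertion identifiers by the publication they come from."""
    clusters = defaultdict(list)
    for a in assertions:
        clusters[a.source].append(a.assertion_id)
    return dict(clusters)


# Placeholder data: invented names and sources, not real taxa or papers.
assertions = [
    Assertion("a1", "Aus bus", "synonym_of", "Aus cus", "monograph-X"),
    Assertion("a2", "Aus dus", "split_from", "Aus cus", "monograph-X"),
    Assertion("a3", "Aus eus", "synonym_of", "Aus bus", "paper-Y"),
]
clusters = group_by_source(assertions)
```

Each assertion can be cited on its own through its identifier, while grouping by source recovers the set that came from one monograph.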

As always there seems to be a tension between doing things the way we always have, albeit using new technology, or using new technology to rethink the way we do things. I don’t mean it as pejoratively as that sounds - new isn’t always necessarily better, but I think we are missing opportunities to rethink the way we do things.

Regards

Rod

---------------------------------------------------------
Roderic Page
Professor of Taxonomy
Institute of Biodiversity, Animal Health and Comparative Medicine College of Medical, Veterinary and Life Sciences Graham Kerr Building University of Glasgow Glasgow G12 8QQ, UK

Email:  Roderic.Page at glasgow.ac.uk
Tel:  +44 141 330 4778
Skype:  rdmpage
Facebook:  http://www.facebook.com/rdmpage
LinkedIn:  http://uk.linkedin.com/in/rdmpage
Twitter:  http://twitter.com/rdmpage
Blog:  http://iphylo.blogspot.com
ORCID:  http://orcid.org/0000-0002-7101-9767
Citations:  http://scholar.google.co.uk/citations?hl=en&user=4Z5WABAAAAAJ
ResearchGate https://www.researchgate.net/profile/Roderic_Page

On 1 May 2015, at 16:22, Nico Franz <nico.franz at asu.edu> wrote:

Thanks, Rod (and Tony).

   Also for steering things back a bit.

   I believe that, if an individual produces a monograph that has well defined reference boundaries - a domain of reference, so to speak (this perceived taxon, at this time, in that region, given this nomenclatural and taxonomic legacy, these sets of specimens, traits, inferred trees, etc.) - and that monograph gets aggregated into a larger biodiversity information environment, then in that environment the identity of the monographic content should remain "relevantly recognizable". The aggregator environment does in effect expand the monograph's original domain of reference in ways that the monograph's author cannot readily or reliably predict.

   To me this puts the onus on the aggregator environment to provide technical design solutions that are capable of supporting the communication and social recognition models that human taxonomy making and revising relies on.

   Where do taxonomic concepts fit in here? We have, at this point, some individual efforts (two absolute stand-outs to me are Lepage's Avibase [http://zookeys.pensoft.net/articles.php?id=3906] and Weakley's Flora [http://www.herbarium.unc.edu/flora.htm]) that demonstrate at considerable scales (thousands of currently recognized species concepts, > one century taxonomy legacy depth, tens of thousands to millions of articulations) that taxonomic concept individuation and integration based on semantics that complement nomenclatural relationships is feasible. Avibase in particular implements a database to sustain these reference services.

   I think a fair and contemporary assessment is, as we move to greater, more integrative scales, there will be issues that we have not fully grasped yet, and other issues that we can already identify and which will be hard. For instance, I understand that Avibase uses taxonomic names at the family level and above, while shifting to taxonomic concept resolution at lower levels. But we also do have a small but growing body of theory and practice that shows feasibility and value, to my mind. Worthy of praise perhaps, and further exploration.

   The following is in my view a persistent challenge to the aggregators. When we initially build these larger biodiversity data repositories with successively more encompassing taxonomies whose intellectual authorship origins are diverse, and then curate the taxonomies in the new environments as we go along, we are in some sense generating new systematic theories intended to reflect reference standards for a wide range of contributors and users.

http://link.springer.com/article/10.1007%2Fs13752-012-0049-z

   But who owns the new theories, or identifiable parts of them? Who can express their assessments of their validity, or perceived need for correction or expansion? This will sound a bit dramatic, but many aggregator systems are currently structurally designed in a way that the graduate student, postdoc, or more senior scientist producing a monograph is inadvertently disenfranchised when their taxonomic language contribution migrates from the traditional to the integrative publication environment.

   So, yes, we do not have it all figured out. Maybe it won't work in the end for very many important applications. We are also not alone in this.

http://link.springer.com/chapter/10.1007%2F978-1-84628-901-9_8

Cheers, Nico

On Fri, May 1, 2015 at 3:31 AM, Roderic Page <Roderic.Page at glasgow.ac.uk> wrote:
Hi Nico,

To return to your original post and question, a couple of quick comments.

As Stephen Thorpe alluded to, one aspect of instability is IMHO a function of the burden taxonomic names carry. We would like:

1. human readable, globally unique names, that

2. also tell us something about relationships (e.g. the genus name matters), and

3. carry some link to provenance (e.g., taxonomic authority, author for new combinations, etc.)

There’s pretty much no way to satisfy these requirements without tradeoffs of one sort or another. For example, for reasons that I’ve now forgotten I thought it would be fun to try and track down the original species descriptions associated with a recent paper on the declining rate of descriptions of new bird species ( http://dx.doi.org/10.1093/sysbio/syu069, see also http://eol.org/collections/116394 ). Cue much heartache as many of these names have changed, and often discovering the original name (and publication) is a world of hurt as people shuffle species between genera and up and down between species and subspecies rank (e.g., http://bionames.org/names/cluster/642623 ).

We have a naming system that is hugely unstable because goals 1 and 2 are incompatible (at least, they are in the absence of any system to track name changes; botanists do this quite well, zoologists don’t).
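As a rough illustration of what "a system to track name changes" could mean at its simplest, the Python sketch below follows a chain of recorded changes back to the earliest known name. The mapping, the function name, and the taxon names are invented placeholders; a real system would also need authorities, dates, and publications.

```python
def original_name(name, renamed_from):
    """Follow recorded name changes back to the earliest known name.

    `renamed_from` maps a later name to the name it replaced.
    The `seen` set guards against accidental cycles in the records.
    """
    seen = set()
    while name in renamed_from and name not in seen:
        seen.add(name)
        name = renamed_from[name]
    return name


# Placeholder records: invented names, not real taxa.
renamed_from = {
    "Bus cus": "Aus cus",      # species moved to another genus
    "Bus dus cus": "Bus cus",  # later lowered to subspecies rank
}

earliest = original_name("Bus dus cus", renamed_from)
```

With such a record in place, the genus name can change (goal 2) while the link to the original description stays recoverable.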

Regarding your bigger point about your “extreme” system, I think this is kind of where we are heading, especially when you think of things like DNA barcoding. However, I suspect that what people will focus on is not the long history of shuffling specimens between names and taxa, but what the latest snapshot is "right now". Databases that make this explicit (GBIF - taxa as sets of occurrences, NCBI and BOLD - taxa as sets of sequences) will be useful and underpin actual research. Databases that make this implicit (i.e., most taxonomic databases) will be a lot less useful.
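A minimal sketch of the "taxa as sets" view, in Python: if each taxon is an explicit set of occurrence (or sequence) identifiers, the relationship between two circumscriptions falls out of ordinary set comparison. The function name, the relationship labels, and the identifiers are assumptions for illustration, not how GBIF, NCBI, or BOLD actually expose their data.

```python
def compare_taxa(a, b):
    """Describe how two taxa, given as sets of record IDs, relate."""
    if a == b:
        return "congruent"
    if a < b:          # proper subset
        return "included in"
    if a > b:          # proper superset
        return "includes"
    if a & b:          # some shared records, neither contains the other
        return "overlaps"
    return "disjoint"


# Placeholder occurrence identifiers, invented for this example.
taxon_2005 = {"occ1", "occ2", "occ3"}
taxon_2015 = {"occ1", "occ2"}
taxon_other = {"occ3", "occ4"}

relation_new_to_old = compare_taxa(taxon_2015, taxon_2005)
relation_old_to_other = compare_taxa(taxon_2005, taxon_other)
```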

I love the taxonomic legacy as much as anyone, indeed I spend most of my time trying to expose it as much as possible (hence http://biostor.org and http://bionames.org ), but I suspect a lot of discussion about the relationship between concepts will be of perhaps limited relevance except in some (possibly spectacular) edge cases.

Regards

Rod

_______________________________________________
Taxacom Mailing List
Taxacom at mailman.nhm.ku.edu
http://mailman.nhm.ku.edu/cgi-bin/mailman/listinfo/taxacom
The Taxacom Archive back to 1992 may be searched at: http://taxacom.markmail.org

Celebrating 28 years of Taxacom in 2015.