[Taxacom] Global biodiversity databases
Stephen Thorpe
stephen_thorpe at yahoo.co.nz
Thu Aug 9 16:04:51 CDT 2012
Actually, no, I don't think it would be helpful in the present context. Of course the fine print will spell out a much more modest goal than the "sales pitch", but it is the latter that I was commenting on. It's kind of like McDonald's advertising, where the burgers always look much more impressive than what you actually get. If this project isn't going to produce a meaningful phylogeny of all named species, then it shouldn't claim that it will, not even in tweets ...
But, looking at the link you posted:
Build and make publicly available the first complete draft tree of life, capturing the depth of knowledge about biodiversity on Earth
I still have a problem with the use of the term "complete" in this context ...
Cheers, Stephen
From: Dean Pentcheff <pentcheff at gmail.com>
To: "taxacom at mailman.nhm.ku.edu" <taxacom at mailman.nhm.ku.edu>
Cc: Karen Cranston <karen.cranston at gmail.com>; Stephen Thorpe <stephen_thorpe at yahoo.co.nz>
Sent: Friday, 10 August 2012 8:12 AM
Subject: Re: [Taxacom] Global biodiversity databases
It's possible that it would be helpful to actually read the proposal
of the Open Tree of Life project prior to speculating on its goals and
proposed techniques. That is available at:
http://opentree.wikispaces.com/
(Note that I have no connection with that project.)
-Dean
--
Dean Pentcheff
pentcheff at gmail.com
dpentche at nhm.org
On Wed, Aug 8, 2012 at 1:57 PM, Stephen Thorpe
<stephen_thorpe at yahoo.co.nz> wrote:
> Hi Karen,
> I guess my problem is with the sales pitch, i.e. [quote]comprehensive first-draft tree of all named species[unquote]
> This is ridiculously unrealistic (or else rather misleading)!
> Firstly, there doesn't even exist a fully comprehensive *listing* of all named species yet! CoL might imply in their sales pitch that they are close, but not true!
> Secondly, even if there did exist such a thing, the proportion of names for which phylogenies of any kind are available is very small, so the only way I can see OToL being able to do what it promises is to simply end up with an enormous unresolved polychotomy (at various levels), with a little bit of actual phylogenetic data buried somewhere inside! If, for example, you use CoL for weevil names, you will end up with the absurdity of presenting unresolved phylogenetic relationships between synonyms and the same species under different genera!
> Please explain ...
> Cheers,
> Stephen
>
> From: Karen Cranston <karen.cranston at gmail.com>
> To: Stephen Thorpe <stephen_thorpe at yahoo.co.nz>
> Cc: "Tony.Rees at csiro.au" <Tony.Rees at csiro.au>; "taxacom at mailman.nhm.ku.edu" <taxacom at mailman.nhm.ku.edu>
> Sent: Thursday, 9 August 2012 1:28 AM
> Subject: Re: [Taxacom] Global biodiversity databases
>
> Speaking with my Open Tree of Life hat on... We plan to have a first
> draft of the tree released in summer 2013, along with the ability for
> users to annotate nodes, upload new trees (to enable continuous
> updating), as well as an API for programmatic access. Where we can, we
> will use inferred phylogenies to construct the OpenTree. We will need
> to rely on taxonomies to fill in the gaps where we don't have
> phylogenetic coverage, and also for resolution of names in input
> trees. These are two key places where having centralized taxonomic
> resources would be a huge benefit.
>
> Cheers,
> Karen
>
> On Tue, Aug 7, 2012 at 10:37 PM, Stephen Thorpe
> <stephen_thorpe at yahoo.co.nz> wrote:
>> just noticed this on Twitter:
>>
>>>Open Tree of Life
>> @opentreeoflife
>> This NSF-funded project will produce the first online, comprehensive first-draft tree of all named species, accessible to both scientists and the public.
>>
>> *all named species*!! I note that they don't say when!
>>
>>
>> ________________________________
>> From: "Tony.Rees at csiro.au" <Tony.Rees at csiro.au>
>> To: stephen_thorpe at yahoo.co.nz; taxacom at mailman.nhm.ku.edu
>> Sent: Tuesday, 7 August 2012 2:07 PM
>> Subject: RE: [Taxacom] Global biodiversity databases
>>
>> Hi Stephen,
>>
>> Those who know me might appreciate that I have some interest in this area, e.g. see a couple of recent presentations:
>>
>> http://www.slideshare.net/tony1212/rees-an-all-genera-index
>> http://www.slideshare.net/tony1212/rees-towards-a-hierarchical-classification-of-all-life
>>
>>
>> Without knowing the subtext to your question(s), here are the answers I would give if pressed...
>>
>>
>>> Question 1: Do you expect a comprehensive and reliable GBD to exist in
>>> the foreseeable future (or do you think that one or more already
>>> exist)? If so, do you think it is likely to come from an existing
>>> initiative, and if so, which one(s)?
>>
>> I think you have to split this across short term vs. medium/longer term. The short-term answer is that currently you have to mix and match across the best curated resources for specific groups: examples being Eschmeyer's Catalog of Fishes for fishes (extant species and genera), Index/Species Fungorum for fungi, Systema Dipterorum for Diptera, etc.; notable cross-group compilations being Catalogue of Life (which is really a collation/fusion of 100+ "expert curated" systems), WoRMS (similar, for some 20 contributing components) for marine species, and so on. For higher plants there is The Plant List, for algae AlgaeBase, for prokaryotes the List of Prokaryotic names with Standing in Nomenclature (LPSN) plus CyanoDB, and for viruses the ICTV database. I would term these (with the exception of the composite CoL dataset and The Plant List) "primary aggregators", which ideally are the realm of experts in their respective fields (my 2 cents anyway).
>>
>> Medium to longer term there is the hope/wish/desire to move to an environment where as many as possible of these resources agree to collaborate in a common infrastructure, currently termed the Global Names Architecture or simply GN for Global Names. A recent meeting in Hawaii aimed to address some of the challenges to doing this, see http://www.globalnames.org/taxonomy/term/169/0 .
>>
>> Meanwhile, as we wait for GN to deliver the "holy grail", there are secondary aggregators, of which my project, ITIS, Wikispecies, Wikipedia and more might be cited as examples, taking material from the primary aggregators and original sources to build something more complete than any single source. Speaking from experience, I do this without necessarily having taxonomic expertise in any particular area, but hopefully with some ability to make calls on which source to use, or how to weight sources, in the case of conflicting information. In some cases these secondary aggregators may also have a slightly different remit than the primary ones, e.g. fleshing out with images, descriptive information or distributions absent from the purely nomenclatorial or bare-bones species lists. Whether these will also move into the GN space as discrete entities maintained separately for ever, or will coalesce into a few larger units, remains to be seen.
>>
>>> Question 2: Which would you prefer, (A) data verified by "experts"; or
>>> (B) data verifiable by the user (via referencing)?
>>
>> Answer would be both (see also examples given below). If an expert has made a call then that saves me (the user) doing the same! At the same time the more evidence which is included on which the call is based the better, so one can assess the currency and quality/credibility of that information, and as needed consider whether to utilise it unchanged or not (for example new taxonomic information may have been published since that call was made e.g. taxonomic placement, synonymy, name change etc.).
>>
>>> Question 3: What kinds of data do you want to be able to access from a
>>> GBD?
>>
>> A previous Taxacom post from Rod Page suggested the following:
>>
>> <snip>
>>
>> Very simple questions are being asked:
>>
>> 1. Is this a name?
>> 2. Is this the correct way to write it?
>> 3. Is this name currently in use?
>> 4. What other names are related to this name (e.g., synonyms, lexical variants)?
>> 5. Where was this name published? Can I see that publication?
>>
>> </snip>
>>
>> I would extend this a bit further:
>>
>> 6. What is the current (and also past) taxonomic placement of this name (+ according to...)
>> 7. What are its parent/children in the selected current taxonomic hierarchy
>> 8. What other names are lexically related to this name (homonyms, near-homonyms/candidate "did you mean")
>> 9. What do we know of the type specimen i.e. when/where collected, where deposited, geologic age, associated habitat etc.
>> 10. What do we know of the taxon to which this name applies - ecological info, distribution in space and time, common names, descriptive characters, significant literature treatments
>>
>> For a "straw man" here is an example species-level name treatment from Eschmeyer's online Catalog of Fishes, for the name Bythites hollisi (now a syn. of Thermichthys hollisi):
>>
>> <snip>
>> hollisi, Bythites Cohen [D. M.], Rosenblatt [R. H.] & Moser [H. G.] 1990:270, Figs. 1-8 [Deep-Sea Research v. 37 (no. 2); ref. 14223] Hydrothermal vent (Mussel Bed) on Galápagos Rift Zone, 0°47.894'N, 86°9.210'W, depth 2500 meters. Holotype (unique): SIO 88-97. Valid as Bythites hollisi Cohen, Rosenblatt & Moser 1990 -- (Geistdoerfer 1999:9 [ref. 23832], Nielsen & Cohen in Nielsen et al. 1999:98 [ref. 24448], Machida & Hashimoto 2002:1 [ref. 25949], Chernova & Geistdorfer 2003:153 [ref. 26887]). Valid as Gerhardia hollisi (Cohen, Rosenblatt & Moser 1990) -- (Nielsen & Cohen 2002:50 [ref. 26528]). Valid as Thermichthys hollisi (Cohen, Rosenblatt & Moser 1990) -- (Nielsen & Cohen 2005:395 [ref. 28470]). Current status: Valid as Thermichthys hollisi (Cohen, Rosenblatt & Moser 1990). Bythitidae: Bythitinae. Distribution: Southeastern Pacific. Habitat: marine.
>> </snip>
>>
>> (also note that all the statements are referenced to a references table which can be searched independently).
>>
>> You can assess for yourself how many of my suggestions above are covered here. Some that are not may be covered by the equivalent entry in FishBase, see
>>
>> http://www.fishbase.org/summary/Thermichthys-hollisi.html
>>
>> (Actually this page is pretty bare compared with many in FishBase, but you will get the idea).
>>
>> For fossil taxa I think PaleoDB has pretty much the right approach, as an example see this page for the genus Tyrannosaurus:
>>
>> http://paleodb.org/cgi-bin/bridge.pl?a=basicTaxonInfo&taxon_no=38613 (there is a lot more information also available via "more details" as well)
>>
>>
>>> Question 4: Which existing initiative currently comes closest to what
>>> you would ideally like to see?
>>
>> See some examples above for particular groups (many more out there). Across all groups, either build your own (as I do) or use Google Scholar and Nomenclator Zoologicus (for animals) as a surrogate for the literature at this time, backed up by other internet/print resources as available (a personal library is still invaluable, especially for the more substantial texts). Wikipedia is surprisingly useful for recent updates on treatments of some groups and for the more "charismatic" taxa in general (the value of crowdsourcing, I guess), but beware of inaccuracies and inconsistencies between treatments on different pages; it is also very incomplete, as minor taxa are presumably not considered sufficiently "notable". (Why Wikipedia as opposed to Wikispecies? I typically want more than the "bare bones" taxonomic placement, and Wikispecies only supplies the latter.)
>>
>> That's my take - maybe not quite what you are asking for, but maybe something useful there.
>>
>> Regards - Tony
>>
>> Tony Rees
>> Manager, Divisional Data Centre,
>> CSIRO Marine and Atmospheric Research,
>> GPO Box 1538,
>> Hobart, Tasmania 7001, Australia
>> Ph: 0362 325318 (Int: +61 362 325318)
>> Fax: 0362 325000 (Int: +61 362 325000)
>> e-mail: Tony.Rees at csiro.au
>> Manager, OBIS Australia regional node, http://www.obis.org.au/
>> Biodiversity informatics research activities: http://www.cmar.csiro.au/datacentre/biodiversity.htm
>> Personal info: http://www.fishbase.org/collaborators/collaboratorsummary.cfm?id=1566
>> LinkedIn profile: http://www.linkedin.com/pub/tony-rees/18/770/36
>>
>>> -----Original Message-----
>>> From: taxacom-bounces at mailman.nhm.ku.edu [mailto:taxacom-
>>> bounces at mailman.nhm.ku.edu] On Behalf Of Stephen Thorpe
>>> Sent: Tuesday, 7 August 2012 8:32 AM
>>> To: TAXACOM
>>> Subject: [Taxacom] Global biodiversity databases
>>>
>>> Dear Taxacomers,
>>> I have created a short questionnaire (below) for which I would
>>> appreciate greatly any replies. It concerns global biodiversity
>>> databases (GBDs) ("databases" in the broadest possible sense).
>>> Cheers, Stephen
>>>
>>> Question 1: Do you expect a comprehensive and reliable GBD to exist in
>>> the foreseeable future (or do you think that one or more already
>>> exist)? If so, do you think it is likely to come from an existing
>>> initiative, and if so, which one(s)?
>>> Question 2: Which would you prefer, (A) data verified by "experts"; or
>>> (B) data verifiable by the user (via referencing)?
>>> Question 3: What kinds of data do you want to be able to access from a
>>> GBD?
>>> Question 4: Which existing initiative currently comes closest to what
>>> you would ideally like to see?
>>> _______________________________________________
>>>
>>> Taxacom Mailing List
>>> Taxacom at mailman.nhm.ku.edu
>>> http://mailman.nhm.ku.edu/mailman/listinfo/taxacom
>>>
>>> The Taxacom archive going back to 1992 may be searched with either of
>>> these methods:
>>>
>>> (1) by visiting http://taxacom.markmail.org/
>>>
>>> (2) a Google search specified as:
>>> site:mailman.nhm.ku.edu/pipermail/taxacom your search terms here
>
>
>
> --
> ~~~~~~~~~~~~~~~~~~~~~~~
> karen.cranston at gmail.com
> ~~~~~~~~~~~~~~~~~~~~~~~