[Taxacom] Dark taxa: GenBank in a post-taxonomic world

Curtis Clark lists at curtisclark.org
Tue Apr 12 20:16:19 CDT 2011


On 2011-04-12 06:53, Roderic Page wrote:
> This post may be of interest to TAXACOM readers. "Dark taxa: GenBank in a post-taxonomic world"
>
> http://iphylo.blogspot.com/2011/04/dark-taxa-genbank-in-post-taxonomic.html
>

Apologies if a commenter mentioned this and I missed it: I think what 
you need is a null model.

Imagine a finite set of books, and a web site for cataloging them based 
on selected passages. Early in the history of the site, contributors 
will be working with intact books that they can identify to title. If 
/Dracula/ has already been investigated and the sequence between the 
primers of "in example" and "spade of the sexton" has already been 
characterized, I'm unlikely to contribute the same sequence from a 
different physical book. But there are still lots of books, and I can 
sequence another.

After a time, the number of easily available intact books that have not 
been sequenced starts to diminish. But there are book fragments, which 
can also be sequenced. Let's say I find a fragment that I can only 
characterize as "British English turn of the 20th C 67534567" and 
develop sequences. A search might suggest that some of them are very 
similar to the corresponding sequences in /Dracula/, but if I assume 
that my fragment is part of that book, either I fail to submit the 
sequence, and take the chance that a novel book (pun serendipitous) 
would go uncharacterized, or else submit a new sequence for an existing 
book, which might imply variation that doesn't exist.

Much better to submit it under its fragment identifier, and let others 
with greater knowledge in the future sort it out. Over time, more of the 
submitted sequences will be from book fragments that can't be easily 
identified.

It seems to me that any finite set with exemplars in various states of 
identifiability will produce the same sort of curves as the ones you've 
characterized.
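
A rough way to try this null model is a small simulation. What follows 
is only a sketch under assumptions of my own, not anything taken from 
GenBank: the function name simulate and the parameters n_books, 
p_intact, n_draws and window are all hypothetical. Each draw picks a 
random book; with probability p_intact the specimen is intact and is 
submitted under its title only if that title has not been sequenced 
yet, otherwise it is a fragment and is always submitted under a 
fragment identifier.

```python
import random


def simulate(n_books=1000, p_intact=0.5, n_draws=20000, window=2000, seed=0):
    """Return, per window of draws, the fraction of submissions that are
    unnamed fragments ("dark") rather than newly sequenced titles.

    All parameters are illustrative assumptions, not measured values.
    """
    rng = random.Random(seed)
    sequenced_titles = set()  # titles that already have a named sequence
    fractions = []
    named = dark = 0
    for draw in range(1, n_draws + 1):
        book = rng.randrange(n_books)
        if rng.random() < p_intact:
            # Intact book: identifiable to title, so a contributor skips it
            # when that title has already been sequenced.
            if book not in sequenced_titles:
                sequenced_titles.add(book)
                named += 1
        else:
            # Fragment: cannot be tied to a title, so it is submitted
            # under its own fragment identifier.
            dark += 1
        if draw % window == 0:
            total = named + dark
            fractions.append(dark / total if total else 0.0)
            named = dark = 0
    return fractions


if __name__ == "__main__":
    for i, frac in enumerate(simulate(), start=1):
        print(f"window {i}: dark fraction = {frac:.2f}")
```

Under these assumptions the dark fraction rises from one window to the 
next simply because the pool of unsequenced titles runs down, with no 
change in how contributors behave.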

-- 
Curtis Clark
Cal Poly Pomona




