[Taxacom] Random taxonomy
JF Mate
aphodiinaemate at gmail.com
Fri Nov 29 14:46:48 CST 2013
Are we talking about the hypothetical 50 boxes and 50 labels example
or the real life example? Because they are completely different. One
has sampling with replacement and the other doesn't.
Jason
On 29 November 2013 21:31, Peter Rauch <peterar at berkeley.edu> wrote:
> Making no assumptions about the likely relative abundance of the 50 species
> in nature, and about the relative likelihood of those species being
> collected, then ...
>
> The probability that each specimen would be correctly determined by
> randomly assigning one of the fifty names to each specimen --i.e., pick
> one specimen from the 43 and randomly assign it a name from among the fifty
> names-- is 1 in 50.
>
> The probability of correctly naming the second specimen is exactly the
> same: 1 in 50.
>
> Etc. through all 43 specimens.
>
> (Note that by ignoring any assumptions, as stated above, the 1-in-50
> probability holds little credibility.)
>
>
> For the robot (i.e., the random process of assigning the name to the
> specimen) to "do the job as well as the human", the robot would need only
> identify one specimen correctly --i.e., a correctly named specimen only
> once among the 43 specimens.
>
> The probability of the robot doing that is actually very high (esp.
> relative to the notion that it is infinitely small, as some have suggested).
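
Under the equal-probability assumption quoted above, "very high" can be put into numbers. This is a minimal Python sketch, not anyone's published method: the function names are illustrative, and it assumes the 43 guesses are independent, each correct with probability 1/50.

```python
# Uniform model from the quoted message: each of n specimens receives one
# of k names at random, independently, so each guess is right with p = 1/k.
# Function names here are illustrative, not taken from the thread.
from math import comb


def p_at_least_one(n: int, k: int) -> float:
    """Probability that at least one of n independent guesses is correct."""
    return 1.0 - (1.0 - 1.0 / k) ** n


def p_exactly(m: int, n: int, k: int) -> float:
    """Binomial probability of exactly m correct guesses out of n."""
    p = 1.0 / k
    return comb(n, m) * p**m * (1.0 - p) ** (n - m)


n_specimens, n_names = 43, 50
print("at least one correct:", p_at_least_one(n_specimens, n_names))
print("exactly one correct: ", p_exactly(1, n_specimens, n_names))
print("expected number correct:", n_specimens / n_names)
```

With these inputs the chance of at least one correct name comes out at roughly 0.58, and of exactly one at roughly 0.37, so one hit in 43 is about what chance alone would deliver under this model.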
>
> However, because the question relates to a real situation --about actual
> blowflies collected in a particular country-- the assumption that each of
> the fifty species known to occur in that country is equally likely to be
> collected is probably a very weak assumption. More likely, some of the
> fifty species are very likely to be collected repeatedly, and other species
> will be rarely collected (this also assumes that blowfly collectors are not
> out collecting blowflies with a biased focus on obtaining particular
> species, or collecting in specific "habitats", etc).
>
> So, assuming that the frequency distribution of species represented in the
> collection of 43 specimens resembles the one found in nature, the game of
> simple random assignment of species names to a specimen is a worst-case
> model; the model could be improved --to be more realistic-- if the species
> names were pulled randomly from a bucket in which each name occurs with the
> same frequency as its species is encountered in nature.
>
> To look at it another way, this person could have named every one of the 43
> specimens with the name of the most commonly occurring species in the country.
> Unless the collection of 43 specimens was built in a very biased manner, it
> is highly likely that ONE specimen would be correctly identified by the
> person.
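
The last two quoted paragraphs (the frequency-weighted bucket of names, and naming everything after the commonest species) can be compared with the uniform model in a few lines of Python. The abundance curve below is purely hypothetical, a Zipf-like 1/rank weighting chosen only to stand in for "a few common species, many rare ones"; it is not data from the thread, and the sketch assumes specimens are collected independently in proportion to those abundances.

```python
# HYPOTHETICAL abundances: a Zipf-like curve (weight 1/rank), not real data.
K, N = 50, 43  # species names available, specimens identified

weights = [1.0 / rank for rank in range(1, K + 1)]
total = sum(weights)
freq = [w / total for w in weights]  # assumed relative abundances (sum to 1)


def at_least_one(p: float, n: int) -> float:
    """Chance of one or more hits in n independent guesses, each right with p."""
    return 1.0 - (1.0 - p) ** n


# Per-specimen chance of a correct name under three strategies.
p_uniform = 1.0 / K                    # every name equally likely
p_weighted = sum(f * f for f in freq)  # names drawn with the species' own frequencies
p_commonest = max(freq)                # every specimen given the commonest name

for label, p in [("uniform", p_uniform),
                 ("weighted bucket", p_weighted),
                 ("always commonest", p_commonest)]:
    print(f"{label}: per specimen {p:.3f}, "
          f"at least one of {N}: {at_least_one(p, N):.3f}")
```

Whatever abundances are plugged in, the weighted bucket can never do worse per specimen than the uniform draw, and always writing the commonest name can never do worse than the weighted bucket; how large the gaps are depends entirely on the assumed curve.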
>
> All in all, the answer to the problem is going to be quite suspect because
> these various biases are likely to come into play (making the worst-case
> model a very poor representation of reality).
>
> Peter
>
> On Fri, Nov 29, 2013 at 7:55 AM, Knut Rognes <knut at rognes.no> wrote:
>
>> Thanks to all for replying outside and within the list.
>>
>> My raising the question of random taxonomy was inspired by a real case
>> study. 43 specimens of blowflies were identified by a certain person. In the
>> person's country there are about 50 species of blowflies. All his
>> identifications were erroneous, except for one. My thought was then: Would a
>> robot have done better, given a label dispenser?
>>
>> Some replies I have got suggest that the robot might have done a job as
>> good as the human.
>>
>> Knut
>>
>>
>>
>>
>> On 29 November 2013 11:24, Knut Rognes <knut at rognes.no> wrote:
>> > Dear Taxacomers,
>> >
>> >
>> >
>> > I have a statistical problem.
>> >
>> >
>> >
>> > Consider 50 black boxes, within each is a specimen of fly. Each fly
>> > has been identified by someone, its name written on the inside of the
>> > box, but this is invisible to you. You cannot peek inside. Each fly
>> > belongs to one of 50 possible species.
>> >
>> >
>> >
>> > You have at your disposal the 50 possible species names for these
>> > flies, each name printed on an adhesive label, the supply of printed
>> > labels for each name is limitless.
>> >
>> >
>> >
>> > Here is the game: you affix a random label on the outside of a random
>> > box.
>> >
>> >
>> >
>> > Now the problem: What is the likelihood that you put a correct label
>> > on the box, i.e. that the name on the label matches the identity of the
>> > fly within?
>> >
>> >
>> >
>> > Knut Rognes
>> >
>> > Oslo, Norway
>>
>>
> _______________________________________________
> Taxacom Mailing List
> Taxacom at mailman.nhm.ku.edu
> http://mailman.nhm.ku.edu/mailman/listinfo/taxacom
>
> The Taxacom Archive back to 1992 may be searched with either of these methods:
>
> (1) by visiting http://taxacom.markmail.org
>
> (2) a Google search specified as: site:mailman.nhm.ku.edu/pipermail/taxacom your search terms here
>
> Celebrating 26 years of Taxacom in 2013.