[Taxacom] Random taxonomy
JF Mate
aphodiinaemate at gmail.com
Fri Nov 29 15:13:44 CST 2013
My mistake. I didn't read your initial post carefully.
Jason
On 29 November 2013 22:05, Knut Rognes <knut at rognes.no> wrote:
> Both cases were meant to be the same. The supply of labels is infinite for
> each species name. (Which is sampling with replacement).
>
> Knut R
>
> -----Original Message-----
> From: taxacom-bounces at mailman.nhm.ku.edu
> [mailto:taxacom-bounces at mailman.nhm.ku.edu] On Behalf Of JF Mate
> Sent: 29 November 2013 21:47
> To: Taxacom
> Subject: Re: [Taxacom] Random taxonomy
>
> Are we talking about the hypothetical 50 boxes and 50 labels example or the
> real-life example? Because they are completely different. One has sampling
> with replacement and the other doesn't.
>
> Jason
>
> On 29 November 2013 21:31, Peter Rauch <peterar at berkeley.edu> wrote:
>> Making no assumptions about the likely relative abundance of the 50
>> species in nature, and about the relative likelihood of those species
>> being collected, then ...
>>
>> The probability that each specimen would be correctly determined by
>> randomly assigning one of the fifty names to each specimen --i.e.,
>> pick one specimen from the 43 and randomly assign it a name from among
>> the fifty names-- is 1 in 50.
>>
>> The probability of correctly naming the second specimen is exactly the
>> same: 1 in 50.
>>
>> Etc. through all 43 specimens.
>>
>> [Note that by ignoring any assumptions, as stated above, the 1-in-50
>> probability holds little credibility.]
>>
>>
>> For the robot (i.e., the random process of assigning the name to the
>> specimen) to "do the job as well as the human", the robot would need
>> only to identify one specimen correctly --i.e., to name a specimen
>> correctly just once among the 43 specimens.
>>
>> The probability of the robot doing that is actually very high (esp.
>> relative to the notion that it is infinitely small, as some have
>> suggested).
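
A minimal sketch of the arithmetic behind this claim, assuming (as in the worst-case model above) that each of the 43 names is drawn independently and uniformly from the 50 species names. Under that assumption the number of correct names is binomial with n = 43 and p = 1/50. The function names are illustrative only.

```python
from math import comb

N_SPECIMENS = 43   # specimens in the case study
N_SPECIES = 50     # approximate number of blowfly species in the country


def p_exactly(k, n=N_SPECIMENS, p=1 / N_SPECIES):
    """Probability of exactly k correct names among n random assignments."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def p_at_least_one(n=N_SPECIMENS, p=1 / N_SPECIES):
    """Probability that at least one of n random assignments is correct."""
    return 1 - (1 - p) ** n


if __name__ == "__main__":
    print(f"P(at least one correct) = {p_at_least_one():.3f}")
    print(f"P(exactly one correct)  = {p_exactly(1):.3f}")
    print(f"Expected number correct = {N_SPECIMENS / N_SPECIES:.2f}")
```

Under these assumptions the robot gets at least one name right a little more often than not (about 0.58), and exactly one right with a probability of about 0.37.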
>>
>> However, because the question relates to a real situation --about
>> actual blowflies collected in a particular country-- the assumption
>> that each of the fifty species known to occur in that country is
>> equally likely to be collected is probably a very weak assumption.
>> More likely, some of the fifty species are very likely to be collected
>> repeatedly, and other species will be rarely collected (this also
>> assumes that blowfly collectors are not out collecting blowflies with
>> a biased focus on obtaining particular species, or collecting in
>> specific "habitats", etc).
>>
>> So, assuming that the frequency distribution of species represented
>> in the collection of 43 specimens is more like the one found in
>> nature, the game of simple random assignment of species names to a
>> specimen is a worst-case model; the model could be improved --to be
>> more realistic-- if the species names were being pulled randomly from
>> a bucket in which each name occurs with the same frequency as its
>> species is encountered in nature.
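
One way to make the "bucket" model concrete, as a sketch under stated assumptions: if both the specimen and the name are drawn independently from the same frequency distribution, the chance that a single name matches is the sum of the squared frequencies. The Zipf-like abundances below are invented purely for illustration and are not blowfly data; the function name is likewise hypothetical.

```python
def match_probability(freqs):
    """Chance that a name drawn from `freqs` matches a specimen drawn from `freqs`."""
    total = sum(freqs)
    probs = [f / total for f in freqs]   # normalise weights to probabilities
    return sum(p * p for p in probs)


n_species = 50
uniform = [1.0] * n_species
# Hypothetical skewed abundances (weight 1/rank); NOT real data.
skewed = [1.0 / rank for rank in range(1, n_species + 1)]

if __name__ == "__main__":
    print("uniform bucket:", match_probability(uniform))
    print("skewed bucket: ", match_probability(skewed))
```

The sum of squares is smallest, at 1/50, when all species are equally common, so under these assumptions any skew in abundance raises the per-specimen hit rate above that of the uniform model.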
>>
>> To look at it another way, this person could have named every one of
>> the 43 specimens with the name of the most commonly occurring species
>> in the country.
>> Unless the collection of 43 specimens was built in a very biased
>> manner, it is highly likely that ONE specimen would be correctly
>> identified by the person.
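
A sketch of this "always name the commonest species" strategy, assuming the 43 specimens are an unbiased sample and that the commonest species makes up a fraction p_max of what is collected. The value of p_max in the example is a made-up placeholder, not an estimate for any real fauna, and the function names are illustrative.

```python
def p_at_least_one_hit(p_max, n=43):
    """Chance that at least one of n specimens belongs to the commonest species."""
    return 1 - (1 - p_max) ** n


def expected_hits(p_max, n=43):
    """Expected number of specimens correctly named by the constant label."""
    return n * p_max


if __name__ == "__main__":
    p_max = 0.2  # hypothetical share of the commonest species
    print("P(at least one correct):", p_at_least_one_hit(p_max))
    print("Expected number correct:", expected_hits(p_max))
```

Since the commonest of 50 species always has a share of at least 1/50, this strategy never does worse on average than uniform random labelling under these assumptions.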
>>
>> All in all, the answer to the problem is going to be quite suspect
>> because of these various biases being likely to come into play
>> (making the worst-case model a very poor representation of reality).
>>
>> Peter
>>
>> On Fri, Nov 29, 2013 at 7:55 AM, Knut Rognes <knut at rognes.no> wrote:
>>
>>> Thanks to all for replying outside and within the list.
>>>
>>> My raising the question of random taxonomy was inspired by a real
>>> case study. 43 specimens of blowflies were identified by a certain
>>> person. In the person's country there are about 50 species of
>>> blowflies. All his identifications were erroneous, except for one. My
>>> thought was then: Would a robot have done better, given a label
>>> dispenser?
>>>
>>> Some of the replies I have received suggest that the robot might
>>> have done as good a job as the human.
>>>
>>> Knut
>>>
>>>
>>>
>>>
>>> On 29 November 2013 11:24, Knut Rognes <knut at rognes.no> wrote:
>>> > Dear Taxacomers,
>>> >
>>> >
>>> >
>>> > I have a statistical problem.
>>> >
>>> >
>>> >
>>> > Consider 50 black boxes, each containing a specimen of a fly. Each
>>> > fly has been identified by someone, its name written on the inside
>>> > of the box, but this is invisible to you. You cannot peek inside.
>>> > Each fly belongs to one of 50 possible species.
>>> >
>>> >
>>> >
>>> > You have at your disposal the 50 possible species names for these
>>> > flies, each name printed on an adhesive label; the supply of
>>> > printed labels for each name is limitless.
>>> >
>>> >
>>> >
>>> > Here is the game: you affix a random label on the outside of a
>>> > random box.
>>> >
>>> >
>>> >
>>> > Now the problem: What is the likelihood that you put a correct
>>> > label on the box, i.e. that the name on the label matches the
>>> > identity of the fly within?
>>> >
>>> >
>>> >
>>> > Knut Rognes
>>> >
>>> > Oslo, Norway
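
The game as posed can be checked by simulation. This is a minimal sketch, assuming the label is chosen uniformly from the 50 names and independently of the box; the hidden identities are filled in arbitrarily, since under that assumption they should not affect the result. All names and parameters here are illustrative.

```python
import random


def simulate_match_rate(n_species=50, n_boxes=50, trials=200_000, seed=0):
    """Estimate how often a random label matches the fly in a random box."""
    rng = random.Random(seed)
    # Hidden identities inside the boxes: arbitrary, drawn once (an assumption).
    contents = [rng.randrange(n_species) for _ in range(n_boxes)]
    hits = 0
    for _ in range(trials):
        box = rng.randrange(n_boxes)        # pick a random box
        label = rng.randrange(n_species)    # pick a random label (unlimited supply)
        if contents[box] == label:
            hits += 1
    return hits / trials


if __name__ == "__main__":
    print("estimated match rate:", simulate_match_rate())
    print("1 in 50             :", 1 / 50)
```

The estimate should land close to 1/50 = 0.02 whatever the boxes contain, because the label is drawn without any knowledge of the contents.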
>>>
>>>
>> _______________________________________________
>> Taxacom Mailing List
>> Taxacom at mailman.nhm.ku.edu
>> http://mailman.nhm.ku.edu/mailman/listinfo/taxacom
>>
>> The Taxacom Archive back to 1992 may be searched with either of these
>> methods:
>>
>> (1) by visiting http://taxacom.markmail.org
>>
>> (2) a Google search specified as:
>> site:mailman.nhm.ku.edu/pipermail/taxacom your search terms here
>>
>> Celebrating 26 years of Taxacom in 2013.