[Taxacom] data quality vs. data security: a survey

Stephen Thorpe s.thorpe at auckland.ac.nz
Sat Feb 13 16:12:44 CST 2010


Hi Richard,

>You and I just seem to have different ideas about how best to achieve that goal

On that I can agree with you 100%, but it doesn't at all imply any sort of subjectivity of opinion. At least one of us is wrong in our ideas about this! Maybe it is me? Maybe I am wrong that the maximum benefit to the world from any contribution you make to taxonomy is for you to take a few minutes to add the basic info and links (and images, if available) in a sensible integrative manner to Wikispecies, and not give the parasites any more blood to suck! On the other hand, maybe I'm not wrong ...

A related issue which I find "very odd" is giving unique identifiers to taxon names, when taxon names are already unique identifiers for taxa (and when that does go astray due to homonymy, it is easily fixed by a replacement name). A machine can read a string of meaningful text as easily as a string of meaningless numbers/symbols, so why the heck is so much effort going into creating strings of numbers/symbols as surrogates for taxon names??? We should just standardise the namestring! Just another pointless bureaucratic money-go-round ... Same for references - I applaud Zootaxa for not wanting to enter into the DOI money-go-round, which would have resulted in less new taxonomy being published due to more time/money being wasted on pointless bureaucracy. Though "pointless" is a value judgement relative to one's values (different values may include (1) furtherance of human knowledge, or (2) economic prosperity, in varying proportions). Do you and I have different values, Richard?

Stephen

________________________________________
From: taxacom-bounces at mailman.nhm.ku.edu [taxacom-bounces at mailman.nhm.ku.edu] On Behalf Of Richard Pyle [deepreef at bishopmuseum.org]
Sent: Sunday, 14 February 2010 10:50 a.m.
To: 'TAXACOM'
Subject: Re: [Taxacom] data quality vs. data security: a survey

Hi Stephen,

I guess we just have different perspectives of where the "serious"
duplication-of-effort problems are in our community.

Speaking as a hard-working, dedicated taxonomist myself (when I can find the
time...), I want to make sure that any contribution I make to taxonomy is
maximally available to all current and future interested parties.  You and I
just seem to have different ideas about how best to achieve that goal.

Aloha,
Rich

> -----Original Message-----
> From: Stephen Thorpe [mailto:s.thorpe at auckland.ac.nz]
> Sent: Saturday, February 13, 2010 11:45 AM
> To: Richard Pyle; 'TAXACOM'
> Subject: RE: [Taxacom] data quality vs. data security: a survey
>
> Hi Rich,
>
> No, I don't buy it!
>
> >Every time information about a species, a taxonomic publication
> >citation, etc., etc. is typed by humans on a keyboard (whether it be
> >typed into a manuscript, a database, a wikispecies page, or wherever),
> >that's duplication of effort. Individually, it seems trivial -- but in
> >aggregate it is most certainly *not* trivial
>
> First off, if someone types a citation into a wikispecies
> page, it may in some sense be a duplication of effort if
> someone else has already typed it into something else, or an
> "acronym" or ten have already "harvested" it, but since it
> was typed into wikispecies free of charge, it isn't a SERIOUS
> duplication of effort (on the part of the wikispecies
> contributor). What is a SERIOUS duplication of effort is when
> science funding goes individually to several different
> aggregators to each put the citation in their own particular
> database, and even worse when all they are in fact doing is
> "harvesting" the information from an existing taxon specific
> database. The aggregators are merely parasites ...
>
> >While there is certainly some overlap among them, the duplication is by
> >no means "massive".  To say so reveals a poor understanding about what
> >these different initiatives actually do
>
> I may not know what they do (behind the scenes), but I know
> what they give the end user, in terms of content, and it just
> isn't very much at all, at least for GBIF, EOL, COL, and the
> like. All they do is "harvest" names and create stubs. I
> don't want a nice looking map of the world on a species page
> if there are no points plotted on it, or if there are so few
> points plotted compared to the actual distribution. How
> "massive" is "massive", in terms of overlap?
>
> >You seem to be confusing "Aggregation" with "Integration". Google is
> >an aggregator (an indexer, really -- like GBIF)
>
> OK, so why do we need GBIF, when we already have Google? I am
> NOT, obviously, saying that Google is sufficient for all our
> needs - far from it! I am saying that an expensive entity
> like GBIF is not much better than Google.
>
> This seems to be what is going on: dedicated taxonomists
> (like Bob, for example) work darn hard for relatively little
> reward, creating new taxonomic knowledge. Then, if you are
> lucky, that knowledge gets integrated into either a taxon
> specific database, and/or (if I have anything to do with it)
> Wikispecies. So far, so good. It is what happens next that is
> the problem! Increasing numbers of "parasites" then make far
> more money and have a far easier life than Bob by
> "harvesting" the names from the taxon specific databases, and
> creating skeleton pages on some site that promises so much,
> but never seems to end up delivering much in terms of actual
> content! If you could get actual useful content out of these
> sites, then fine, but all too often you just find a map
> devoid of points, and a page devoid of content!
>
> Cheers,
>
> Stephen
>
> ________________________________________
> From: taxacom-bounces at mailman.nhm.ku.edu
> [taxacom-bounces at mailman.nhm.ku.edu] On Behalf Of Richard
> Pyle [deepreef at bishopmuseum.org]
> Sent: Sunday, 14 February 2010 7:47 a.m.
> To: 'TAXACOM'
> Subject: Re: [Taxacom] data quality vs. data security: a survey
>
> Hi Stephen,
>
> > OMG! Did you really just say that! How is a massive duplication of
> > effort increasingly allowing a massive reduction of
> > redundant/duplicate effort????????
>
> It appears you didn't understand my post.  As you say,
> "communication is a very difficult thing, particularly on
> topics as complex as this", so I'll try again.  You seem to
> characterize all the various large-scale data aggregators
> (GBIF, EOL, COL, ALA, etc.) as "massive duplication of effort".
> While there is certainly some overlap among them, the
> duplication is by no means "massive".  To say so reveals a
> poor understanding about what these different initiatives actually do.
>
> Every time information about a species, a taxonomic
> publication citation, etc., etc. is typed by humans on a
> keyboard (whether it be typed into a manuscript, a database,
> a wikispecies page, or wherever), that's duplication of
> effort. Individually, it seems trivial -- but in aggregate it
> is most certainly *not* trivial.
>
> > INTEGRATION is one thing, but MULTIPLE INTEGRATION INITIATIVES leading
> > to numerous clone or near clone integrated databases is completely
> > self-defeating!
>
> You seem to be confusing "Aggregation" with "Integration".
> Google is an aggregator (an indexer, really -- like GBIF).
> The DNS system is an architecture for integration.  The
> equivalent of DNS for biodiversity information is what I mean
> by integration.
>
> Aloha,
> Rich
>
>
>
> _______________________________________________
>
> Taxacom Mailing List
> Taxacom at mailman.nhm.ku.edu
> http://mailman.nhm.ku.edu/mailman/listinfo/taxacom
>
> The Taxacom archive going back to 1992 may be searched with
> either of these methods:
>
> (1) http://taxacom.markmail.org
>
> Or (2) a Google search specified as:
> site:mailman.nhm.ku.edu/pipermail/taxacom  your search terms here


