[Taxacom] Wikispecies is not a database: part 3 (after thinking about it!)

Mike Sadka M.Sadka at nhm.ac.uk
Thu Aug 13 05:19:47 CDT 2009


Just one final point:

>> We should prioritise data quality over everything else.

That is exactly why you need properly structured databases (in the
correct sense) - they store and protect data.  If you disagree, look at
any book on relational databases.

>> impressively structured and presented websites ("databases" in the
broad sense)

This is an incorrect use of the term "database" - not a broad one.
Websites, however well structured or presented, are not databases and
cannot replace them.  A website is just how the data are presented and
tells you absolutely nothing about how safely or effectively they are
stored.  This is basic IT.

OK - two final points.  But more than enough from me now I am sure!





-----Original Message-----
From: taxacom-bounces at mailman.nhm.ku.edu
[mailto:taxacom-bounces at mailman.nhm.ku.edu] On Behalf Of Stephen Thorpe
Sent: 11 August 2009 02:40
To: taxacom at mailman.nhm.ku.edu
Subject: Re: [Taxacom] Wikispecies is not a database: part 3 (after
thinking about it!)

Hi Mike

I don't know if it is just me, but I find it quite difficult in a
forum like this to get the details of my argument right first time,
but then the responses kind of prod me in the right direction, so I
guess it works out well in the end. Anyway,

[you wrote] This attitude worries me a lot because it seems not to ask  
where that taxonomic information is going or whether best use is made  
of it, and I feel it sells the data short.
  Where does "just ... typing in taxonomic information" get you?   
Without an underlying standard, typing just makes more pages of  
taxonomic information.  They may be very useful but they might as well  
be on paper except that they are easier to update and distribute.
  ICT has the potential to search, sort, aggregate and integrate data  
from a range of sources - thereby generating new information, and  
giving those "experienced and knowledgeable" people more and novel  
opportunities to make discoveries - not to replace fieldwork (or  
closetwork), but to maximise the information derived from the data it  
generates.   If data are entered willy-nilly into numerous different  
systems without care for their fate, their usefulness is limited and  
maybe shortlived.
  Obviously this isn't an argument against wikispecies, or for  
numerous different OLs.  It's an argument for both using common  
standards.

OK, you have made a bit of a straw man here! I am certainly not  
proposing that all we need to do is just type taxonomic information  
willy nilly on to the web! In fact, it is this very thing which has  
caused many of the current problems, because no two web sources seem  
to agree very often, and so the "willinilliness" has resulted in utter  
chaos! On that I think we can agree, hopefully!

My point is this: we need to prioritise things somewhat. We should  
prioritise data quality over everything else. There is no point  
developing flash databases if you don't know where you are going to  
get hold of good data. Some of the examples I have given in previous  
emails show relatively impressively structured and presented websites  
("databases" in the broad sense) giving poor quality outputs.

Wikispecies already exists as a REASONABLY adequate infrastructure  
upon which to create a solid pool of taxonomic information, which can  
THEN be used as a solid DATA PROVIDER for any number of other database  
initiatives. I just think that more effort at this early stage should  
go into making that source of information comprehensive (and  
verifiable, by way of full referencing).

The attitude that worries me a lot is "build your database first, and  
worry about where the data is going to come from second (if the  
funding doesn't run out first!)". Another related line that I have  
heard goes something like "well, I know there are going to be issues  
about who to believe when data providers come into conflict, but the  
funding for the first year is just to get the infrastructure up and  
running, so we will just have to sort that problem out somehow later  
on down the track" ...

The reality is that many of the people currently in charge of online  
"databases" obsess too much about presentation, and what it COULD do  
(if it had the data!). AFD recently went to all the trouble of  
changing its user interface from something that was fine to something  
rather less than fine, and still they haven't managed to get even that  
1999 Apteropanorpa name into their system!

Cheers,

Stephen




Quoting Mike Sadka <M.Sadka at nhm.ac.uk>:

>
> Hi Stephen
>
> And bravo Evgeniy !
>
> [You wrote...]
> Everybody won't just adopt them [standards] - we are even constantly  
> having to defend ourselves against factions who want rid of  
> traditional biological nomenclature altogether!
>
> But technological standards are not for "everybody" - they are for  
> machines. Taxonomists don't need even to know that they exist, but  
> machines will not be able to serve taxonomy to their full potential  
> without them.
>
> [and...]
> ... instead of just sitting down at a computer online and typing in  
> taxonomic information,...
>
> This attitude worries me a lot because it seems not to ask where  
> that taxonomic information is going or whether best use is made of  
> it, and I feel it sells the data short.
>
> Where does "just ... typing in taxonomic information" get you?   
> Without an underlying standard, typing just makes more pages of  
> taxonomic information.  They may be very useful but they might as  
> well be on paper except that they are easier to update and distribute.
>
> ICT has the potential to search, sort, aggregate and integrate data  
> from a range of sources - thereby generating new information, and  
> giving those "experienced and knowledgeable" people more and novel  
> opportunities to make discoveries - not to replace fieldwork (or  
> closetwork), but to maximise the information derived from the data  
> it generates.   If data are entered willy-nilly into numerous  
> different systems without care for their fate, their usefulness is  
> limited and maybe shortlived.
>
> Obviously this isn't an argument against wikispecies, or for  
> numerous different OLs.  It's an argument for both using common  
> standards.
>
>
> [And...]
> ...so we don't need working taxonomists to help build our databases...
>
> I agree - there's been quite enough of that already!   You need IT  
> people to build your databases.  ;-)
>
> Cheerio, Mike
>
>
>
> _______________________________________________
>
> Taxacom Mailing List
> Taxacom at mailman.nhm.ku.edu
> http://mailman.nhm.ku.edu/mailman/listinfo/taxacom
>
> The Taxacom archive going back to 1992 may be searched with either  
> of these methods:
>
> (1) http://taxacom.markmail.org
>
> Or (2) a Google search specified as:   
> site:mailman.nhm.ku.edu/pipermail/taxacom  your search terms here
>







