[Taxacom] Wikispecies is not a database: part 3 (after thinking about it!)

Mike Sadka M.Sadka at nhm.ac.uk
Sat Aug 8 09:46:38 CDT 2009


Hi Taxacomers.

 

I've never posted to Taxacom before - I am a techie and usually just observe the learned debate <humour>and sometimes wonder how some of you manage to get anything else done!</humour>

 

But I can't resist some comments on this thread.

 

>Wikispecies ... cannot do some (important?) things that databases can do

 

I'm surprised that anyone would question the importance of being able to query a back end data store.  What is data for if not to answer questions?  

 

 

> (1) Is Wikispecies a database?
> I now think so again! I don't see any good reason to adopt Rod Page's
> overly narrow concept of a database, but instead see more sense in
> Tony Rees' broader concept (as per his Wikipedia article), into which
> he was (at least initially) willing to include Wikispecies.

 

The term "database" has already been defined by the appropriate discipline.

 

From Wikipedia: "... an integrated collection of logically related records or files which consolidates records previously stored in separate files into a common pool of data records that provides data for many applications.  ..."  [my emphasis]

 

Rod Page's concept isn't narrow - it is correct.  And I disagree with Rod only in that I think it does matter what you call it.  

 

I believe it is vital to distinguish between databases and the applications that allow users to interact with them.  Databases are simply storage for data.  In almost all cases, users access data via the intermediary of an application.  A wiki is just one type of application a database might support.

 

Wikispecies is an instance of the MediaWiki application, which uses a database to store the data it presents to the user.  In principle that database is queryable just like any other - but the MediaWiki application interface does not expose that functionality to users.
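
To make the distinction concrete, here is a minimal sketch in Python using the standard sqlite3 module. Everything in it is an assumption for illustration: the "page" table, its columns and both function names are hypothetical, and this is not the real MediaWiki schema. It only shows one data store with two interfaces: one that serves single pages, and one that asks a question across pages.

```python
import sqlite3

# Hypothetical, much-simplified stand-in for a wiki's back-end store.
# This is NOT the real MediaWiki schema; table and column names are invented.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE page (title TEXT PRIMARY KEY, family TEXT, body TEXT)")
conn.executemany(
    "INSERT INTO page VALUES (?, ?, ?)",
    [
        ("Panthera leo", "Felidae", "Lion ..."),
        ("Panthera tigris", "Felidae", "Tiger ..."),
        ("Canis lupus", "Canidae", "Wolf ..."),
    ],
)


def view_page(title):
    """All the toy 'wiki' application exposes: fetch one page by its title."""
    row = conn.execute("SELECT body FROM page WHERE title = ?", (title,)).fetchone()
    return row[0] if row else None


def titles_in_family(family):
    """A cross-page query the store can answer but the toy 'wiki' does not expose."""
    rows = conn.execute(
        "SELECT title FROM page WHERE family = ? ORDER BY title", (family,)
    )
    return [r[0] for r in rows]


print(view_page("Canis lupus"))
print(titles_in_family("Felidae"))
```

The data is the same in both cases; only the application layer decides which questions a user is allowed to ask of it.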

 

So Wiki vs Database is a false and very misleading debate.  (As Richard Pyle said, this thread has arguably been about the differences between closed- or open-access databases - regardless of what kind of application is used to populate them.)  What is needed is a data model for taxonomic information that can support all sorts of applications, including wikis (as others including Jim Croft have said).  

 

All this may sound unnecessarily pedantic, but IT is no different from systematics in that respect!  If one doesn't use the terminology correctly one runs the risk of talking from a dubious orifice... 

 

For what it's worth, I think many of the sensible things suggested in this and related discussions (for example, better flow of data between grass-roots databases and large aggregators) are not achievable until there are robust standards for storing and manipulating taxonomic data.

 

I would also suggest that this is less my opinion and more a statement of technical reality.  All IT applications that can readily exchange data need common data standards in order to do so.  
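
As a rough illustration of why exchange depends on an agreed format, here is a minimal Python sketch. The record format is entirely hypothetical (the field names scientific_name, rank and authority are my assumptions, not any existing taxonomic standard), as are the function names. One application exports a record, and another can import it only because both agree on the same fields.

```python
import json
from dataclasses import dataclass, asdict

# Hypothetical agreed record format; these field names are illustrative only
# and do not correspond to any real taxonomic data standard.
REQUIRED_FIELDS = ("scientific_name", "rank", "authority")


@dataclass
class TaxonRecord:
    scientific_name: str
    rank: str
    authority: str


def export_record(record):
    """Sending application: serialise a record in the agreed format."""
    return json.dumps(asdict(record))


def import_record(payload):
    """Receiving application: accept only payloads that carry the agreed fields."""
    data = json.loads(payload)
    missing = [f for f in REQUIRED_FIELDS if f not in data]
    if missing:
        raise ValueError(f"record lacks agreed fields: {missing}")
    return TaxonRecord(**{f: data[f] for f in REQUIRED_FIELDS})


wolf = TaxonRecord("Canis lupus", "species", "Linnaeus, 1758")
print(import_record(export_record(wolf)) == wolf)
```

A payload from an application that uses its own field names (say "name" instead of "scientific_name") is rejected by import_record, which is the situation grass-roots databases and aggregators are in without a shared standard.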

 

Development of standards is arduous, but once standards and protocols are in place, applications can proliferate - just look at what HTTP and IP, together with HTML and other web technology standards, have done for the web in just a few years.

 

I would totally sympathise with anyone who groans at my mention of "standards" - but I don't see any getting away from that in the end.  So rather than numerous competing high-level money-sapping aggregation projects, it would be better (if harder to fund) to put resources into developing such standards.  

 

 

>I just do think you [Rod] are nitpicking just a wee bit on Wikispecies' weaknesses, rather than giving due credit to its strengths

 

Maybe - but conversely I suspect you do not appreciate how significant the weaknesses are.

 

The strengths are good, I agree, but those weaknesses are critical, and mean that Wikispecies fails to exploit the full potential of the digital medium.  Without the ability to search across pages, Wikispecies is more like a paper book than a proper digital publication (as someone else was driving at).

 

But that doesn't mean ditch it - to me it means extend the interface to include query (and other) capabilities, or use other tools to do that on the same back-end data source.  That said, effective querying does depend on an effective underlying data model - which brings us back to standards again - sorry!

 

 

>(6) the 3 most important things about any kind of taxonomic database
> are data quality, data quality, and (you guessed it) data quality!

 

Absolutely!  And not just taxonomic databases - data quality is always important, and that is exactly what databases are for and good at.  If you want to store a lot of data you need a database.  You can (and probably should) build a wiki on top - but a wiki won't store or protect your data.
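
One way a database protects data quality is through declared constraints that it enforces whatever application sits on top. The sketch below uses Python's standard sqlite3 module; the "taxon" table, its columns, the list of allowed ranks and the try_insert helper are all hypothetical, chosen only to illustrate the idea.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical table: the constraints, not the application, guard the data.
conn.execute(
    """
    CREATE TABLE taxon (
        name       TEXT NOT NULL UNIQUE,
        taxon_rank TEXT NOT NULL
                   CHECK (taxon_rank IN ('family', 'genus', 'species'))
    )
    """
)


def try_insert(name, taxon_rank):
    """Return True if the database accepts the row, False if a constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO taxon (name, taxon_rank) VALUES (?, ?)",
                (name, taxon_rank),
            )
        return True
    except sqlite3.IntegrityError:
        return False


print(try_insert("Canis", "genus"))   # a valid row
print(try_insert("Canis", "genus"))   # duplicate name
print(try_insert("Felis", "genius"))  # misspelt rank
```

A wiki built on such a store inherits these checks for free; free-text wiki pages on their own have nothing comparable to stop the duplicate or the misspelt rank.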

 

 

Cheerio, Mike

 

 




