[Taxacom] the hurdle for all biodiv informatics initiatives
Dave Roberts
workpackage6 at googlemail.com
Mon Feb 22 03:59:03 CST 2010
On 22 Feb 2010, at 08:58, Vladimir Blagoderov wrote:
> Dear Tony,
>
> On Mon, Feb 22, 2010 at 07:09, Stephen Thorpe <s.thorpe at auckland.ac.nz>
> wrote:
>
>> - How do Scratchpads approach the “enter once, use many times” paradigm,
>> i.e., if information is entered or updated in one Scratchpad, can this be
>> propagated automatically to others? (E.g. cf. the WoRMS approach: a sponge
>> expert modifies the World Porifera Database, and other DBs which utilise
>> the same shared master taxon list, e.g. ERMS, RAMS, are instantly updated.)
>>
>
> Data exchange is the problem of all bioinformatics initiatives: it is
> either copy/paste, which multiplies errors across the Web, or mash-up.
> Scratchpads development follows accepted standards as much as possible;
> for example, specimen and locality records are Darwin Core
> 1.2.1-compatible. So the data can be re-used. Perhaps you are suggesting
> global synchronization of all taxonomic databases? Unfortunately, we have
> to be realists.
Part of the 'sweet spot' is just how you allow communities to develop
their own view of a classification system. Scratchpads were conceived
to allow a community to modify and extend their taxonomy as they saw
fit. It is quite reasonable that more than one Scratchpad will cover
a particular taxon (although I know of no example yet). Bottom line:
each Scratchpad has complete control over its taxonomy. No one
outside that community can propagate a change to it without the
community's agreement.
>> - Is it possible to do a single query across multiple Scratchpads, for a
>> taxon or taxon name of interest?
>>
>
> Not at the moment; however, most Scratchpads are taxon-oriented, and it
> is possible to have multiple classifications within one Scratchpad. You
> can also display information from external sources, for example other
> Scratchpads, on taxon pages and give it proper credit. I suppose that the
> community maintaining a Scratchpad would be interested in collecting all
> available information in one place.
Of course you can Google. Most Scratchpads live as sub-domains of
myspecies.info, which you can use to restrict the search. I can't
imagine why you'd want to do that though.
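For anyone who does want to try it, the restriction can be expressed with Google's `site:` operator. The following is only a minimal sketch: the helper name `scratchpad_search_url` and the example search term are made up for illustration, and it will only find Scratchpads that really are hosted under myspecies.info.

```python
from urllib.parse import urlencode

def scratchpad_search_url(term, domain="myspecies.info"):
    """Build a Google search URL restricted to one domain and its sub-domains.

    Hypothetical helper for illustration; not part of Scratchpads.
    """
    query = f"{term} site:{domain}"
    return "https://www.google.com/search?" + urlencode({"q": query})

# Example: search all myspecies.info sub-domains for a taxon name.
print(scratchpad_search_url("Porifera"))
```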
These cross-Scratchpad integration tools are an obvious benefit and
are an active area of development for us. They simply haven't got to
the top of the priority pile for our limited development resource. Yet.
>> - Are there scalability issues, i.e. will a Scratchpad break if you try to
>> load e.g. 1 million taxon names, or 10 million, into it (as per current
>> uBio content, etc.)?
>>
>
> It was tested, and it does work.
It does depend on the purpose of loading the names. We have loaded
'life', from CoL, as a hierarchy into the taxonomy module (so that it
can be navigated). That is the test to which Vlad refers. It was
about 2M names. You can't pull uBio into it because uBio's names are
not organised into a single hierarchy (they're in multiple hierarchies
and generally stored with only their immediate parent or parents).
Indeed, building a taxonomic hierarchy from uBio is a challenge.
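To illustrate why parent-only records are awkward, here is a minimal sketch. It assumes a made-up `records` mapping from each name to its list of immediate parents; the data, the function names, and the layout are illustrative only and do not reflect the actual uBio or CoL schemas. Names with exactly one parent slot into a tree, while names with several parents have to be set aside for an editorial decision.

```python
from collections import defaultdict

# Hypothetical parent-pointer records: name -> list of immediate parents.
# Illustrative only; not real uBio or CoL data.
records = {
    "Life": [],
    "Animalia": ["Life"],
    "Porifera": ["Animalia"],
    "Demospongiae": ["Porifera"],
    "Spongilla": ["Demospongiae", "Haplosclerida"],  # two competing parents
    "Haplosclerida": ["Demospongiae"],
}

def build_tree(records):
    """Return (children, ambiguous).

    children maps each parent to the names that have it as their only parent;
    ambiguous maps each name with several parents to those parents.
    """
    children = defaultdict(list)
    ambiguous = {}
    for name, parents in records.items():
        if len(parents) == 1:
            children[parents[0]].append(name)
        elif len(parents) > 1:
            ambiguous[name] = parents
    return children, ambiguous

def lineage(name, records):
    """Follow single-parent pointers upwards; stop at a root or an ambiguity."""
    path = [name]
    seen = {name}
    while len(records.get(path[-1], [])) == 1:
        parent = records[path[-1]][0]
        if parent in seen:  # guard against cycles in the data
            break
        path.append(parent)
        seen.add(parent)
    return path

children, ambiguous = build_tree(records)
print(lineage("Demospongiae", records))
print(ambiguous)
```

In this toy data "Spongilla" cannot be placed automatically, which is the kind of case that would need resolving before a navigable hierarchy could be loaded.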
>> - Is it possible to obtain a database or tabular dump of up-to-the minute
>> taxonomic information, preferably from all Scratchpads, for local
>> processing or upload to other systems?
>
> Drupal is based on MySQL. In theory you can export a particular node, a
> selection, or the entire database. How to provide this functionality is a
> different question; we are working on it.
This depends on what you mean by "up-to-the minute taxonomic
information". If you mean taxonomic relationships as represented in
the hierarchy, no, not yet. This is on our to-do list.
Otherwise, basically yes, provided you have the permissions to do so.
We, as Scratchpad developers, do not own or have rights to the data.
Individual sites make data available as they see fit.
This is another of those 'sweet spot' compromises. Scratchpad
communities own and manage their data. They are not contributing to a
large data-gathering enterprise, such as EoL or CoL. They are
contributing to a vast knowledge base that is better compared to the
literature. It is just more mobile. This is the compromise. People
still own their data and that ownership is one of the reasons that
they engage with Scratchpads to make their data available.
Cheers, Dave
--
Dr D.McL. Roberts, Tel: +44 (0)20 7942 5086
European Distributed Institute of Taxonomy Project,
Coordinator WorkPackage 6 (Unifying Revisionary Taxonomy),
Dept. Zoology,
The Natural History Museum,
Cromwell Road,
London SW7 5BD
Great Britain
Email: dmr at nomencurator.org
Web page: http://scratchpads.eu
Web page: http://www.editwebrevisions.info/
--
"You can't just ask customers what they want and then try and give it
to them. By the time you get it built, they'll want something
new." [Steve Jobs, quoted in The Guardian, Technology Section, 25 June
09].