Taxacom: a class of errors in Worms (and similar databases)
Erikjan Rijkers
er at xs4all.nl
Sun Feb 23 23:50:14 CST 2025
On 2/24/25 at 00:06, Geoff Read wrote:
> perhaps not a good example of the general situation in a database
Perhaps - but it's not exactly an accidental error:
The GBIF backbone file (admittedly from 202308) has 2,614,593 accepted
species names (in 268,274 genera); more than 1% of them are erroneously
unparenthesized.
From GBIF name records that have status='ACCEPTED':
[...]
Zygopleura Koken, 1892 | Zygopleura plebia Herrick, 1887
Zygosoma Labbé, 1899 | Zygosoma gibbosum Greeff, 1880
Zygota Förster, 1856 | Zygota congener Zetterstedt, 1840
Zynodes Whalley, 1970 | Zynodes strigerella Hampson, 1903
Zyzzyva T.L.Casey, 1922 | Zyzzyva squamosa C.H.Boheman, 1844
(28836 rows)
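For illustration, a minimal Python sketch of the kind of check that would flag rows like these. It is not the query actually used. It assumes that each name string ends in "Author, Year" and that a species whose authorship year is earlier than the year of its current genus cannot have been described in that genus, so its authorship should be in parentheses. The function names are hypothetical.

```python
import re

# Four-digit years from 1500 to 2099 (assumed range for authorship years).
YEAR = re.compile(r"\b(1[5-9]\d\d|20\d\d)\b")

def last_year(name):
    """Return the last year found in a name string, or None."""
    years = YEAR.findall(name)
    return int(years[-1]) if years else None

def is_parenthesized(species_name):
    """True if the authorship ends in ')', as in '(Zetterstedt, 1840)'."""
    return species_name.rstrip().endswith(")")

def is_suspect(genus_name, species_name):
    """Flag a species older than its genus whose authorship lacks parentheses."""
    genus_year = last_year(genus_name)
    species_year = last_year(species_name)
    if genus_year is None or species_year is None:
        return False  # cannot decide without both years
    return species_year < genus_year and not is_parenthesized(species_name)

# Two of the rows listed above, as (genus, species) pairs.
rows = [
    ("Zygota Förster, 1856", "Zygota congener Zetterstedt, 1840"),
    ("Zyzzyva T.L.Casey, 1922", "Zyzzyva squamosa C.H.Boheman, 1844"),
]
flagged = [species for genus, species in rows if is_suspect(genus, species)]

# Share of flagged names among all accepted species names, in percent.
share = 100 * 28836 / 2614593
```

Both sample rows are flagged, while the same species written as "Zygota congener (Zetterstedt, 1840)" would pass. With the counts above, the share comes to about 1.1%.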
Of the 'accepted' names from checklistbank.org (from 202502), 0.5% are
erroneously unparenthesized (~13,000 names).
>
> In WoRMS I think a general search for instances of lack of parenthesis where there is a younger genus name requires the WoRMS database managers to do the search. Editors & users don't have the complex search capability to find the mismatches.
>
> So, as Mark Costello suggested, an approach to the WoRMS data team to investigate for other instances would be a great idea.
Surely the biologists/curators (who I would expect might be on TAXACOM)
should instruct their own technical people.
Erikjan