[Taxacom] [EXT] Basis of Record phrases

Mary Barkworth Mary.Barkworth at usu.edu
Fri Jul 23 14:04:49 CDT 2021


John, thank you for the background as well as the detailed explanation. I shall revise what I have written. I still think they are rather uninformative terms, but it helps to understand how they came into being. HumanObservation would be a report, with no supporting specimens or images. I agree that my suggestion could easily lead to too many categories, not to mention endless argument.

Mary


From: John Wieczorek <tuco at berkeley.edu>
Sent: Friday, July 23, 2021 12:52 PM
To: Mary Barkworth <Mary.Barkworth at usu.edu>
Cc: (Taxacom at mailman.nhm.ku.edu) <taxacom at mailman.nhm.ku.edu>
Subject: Re: [EXT] [Taxacom] Basis of Record phrases

Hi Mary,

When it came into existence (in preparation for the first version of Darwin Core as a TDWG standard), the basisOfRecord term was meant to allow records to be identified as being primarily "about" a particular Darwin Core class of information (a Taxon, a Location, an Occurrence, etc.). The primary goals at that time were to share records about Taxa and records about Occurrences. Among the Occurrence stakeholders, there were the voucher people and the observation people, and both wanted to know which were "their" kind of records. Thus the subsets of Occurrences came into play (PreservedSpecimen, LivingSpecimen, FossilSpecimen; HumanObservation, MachineObservation). Periodically people wanted to go even further with basisOfRecord, to distinguish between ever more refined perspectives as categories of what a record is primarily about. There is a cost for that. Divide it up more and it becomes more challenging to make those primary original distinctions, which goes against one of the principles of Darwin Core - stability.

HerbariumSpecimen never came into existence as a recommended basisOfRecord value, though it was discussed, and as of January of this year it was used for only about 110k Occurrence records shared via GBIF, out of 127M records about specimens of one kind or another. All but two of those records are from a single institution.

Darwin Core deals with the various distinctions you'd like to see in other ways than the basisOfRecord term. For media-based records, for example, Darwin Core adopts the Dublin Core vocabulary StillImage, MovingImage, and Sound. These don't go in basisOfRecord (though they did at one time), because they are formal vocabulary for a Dublin Core term "dc:type", also adopted by Darwin Core. Whenever you have one of these media values for "dc:type", the Darwin Core basisOfRecord needs to be "MachineObservation" because the evidence was recorded with a machine. The correct value of dc:type for all of the specimen basisOfRecord values is "PhysicalObject". To round out the topic, publications should be encoded with dc:type = "Text" and basisOfRecord = "MaterialCitation". That last one was just ratified a week ago as a new term (https://github.com/tdwg/dwc/issues/329) in Darwin Core and will appear on shelves in the next release of the standard. :-)
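The pairings described above can be sketched as a small consistency check. This is only an illustration: the function name and the idea of checking a record this way are assumptions, not part of Darwin Core or any GBIF tool, and combinations the message does not mention are deliberately left unjudged.

```python
# Illustrative sketch only. The value pairings come from the message above;
# the function name and its signature are hypothetical.
MEDIA_TYPES = {"StillImage", "MovingImage", "Sound"}
SPECIMEN_BASES = {"PreservedSpecimen", "LivingSpecimen", "FossilSpecimen"}


def is_consistent(dc_type, basis_of_record):
    """Return True if dc:type and basisOfRecord agree with the pairings above."""
    if dc_type in MEDIA_TYPES:
        # Media evidence was recorded with a machine.
        return basis_of_record == "MachineObservation"
    if basis_of_record in SPECIMEN_BASES:
        return dc_type == "PhysicalObject"
    if basis_of_record == "MaterialCitation":
        return dc_type == "Text"
    # Combinations not discussed in the message are not judged here.
    return True


print(is_consistent("StillImage", "MachineObservation"))
print(is_consistent("Sound", "HumanObservation"))
```

The first call should report a consistent pair and the second an inconsistent one, since a Sound record is expected to carry basisOfRecord = "MachineObservation".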

It sounds like you might have some use cases that appear intractable the way things are right now. One of those appears to be finding records "about" herbarium specimens. If so, what does the following not cover for that case? kingdom="Plantae" and basisOfRecord = "PreservedSpecimen"
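As a rough sketch of that two-term filter, assuming records are held as simple dictionaries keyed by Darwin Core term names (the sample records and the function name are made up for illustration):

```python
# Illustrative sketch only. The filter condition is the one given above;
# the record layout, sample data, and function name are hypothetical.
def plant_preserved_specimens(records):
    """Keep records with kingdom = "Plantae" and basisOfRecord = "PreservedSpecimen"."""
    return [
        r for r in records
        if r.get("kingdom") == "Plantae"
        and r.get("basisOfRecord") == "PreservedSpecimen"
    ]


sample = [
    {"id": "a", "kingdom": "Plantae", "basisOfRecord": "PreservedSpecimen"},
    {"id": "b", "kingdom": "Plantae", "basisOfRecord": "HumanObservation"},
    {"id": "c", "kingdom": "Animalia", "basisOfRecord": "PreservedSpecimen"},
]

for record in plant_preserved_specimens(sample):
    print(record["id"])
```

Only the record with id "a" satisfies both conditions in this made-up sample.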

To answer the questions about the review of examples for the terms in Darwin Core, that can be done at any time by submitting a term change request in the Darwin Core GitHub repository. An example change doesn't necessarily even need to go through public review, if it simply adds clarity and doesn't affect the semantics of the term.

Finally, I have included the top 100 values published through GBIF for records that GBIF is able to interpret unambiguously as the recommended value "PreservedSpecimen". You can see that, though there is diversity in how they are expressed exactly, the concepts are remarkably harmonious and in agreement with the recommendations in the usage comments and examples for the basisOfRecord term.
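To give a feel for how varied spellings could be folded into one recommended value, here is a minimal sketch. It is not GBIF's actual interpretation logic, which is not described in this message; the lookup set covers only a handful of the published spellings listed in this message, and the function name is hypothetical.

```python
import re

# Illustrative sketch only, not GBIF's interpretation pipeline.
# A few published spellings from the list in this message, with spaces
# and underscores removed and letters upper-cased.
KNOWN_SPELLINGS = {
    "PRESERVEDSPECIMEN",
    "SPECIMEN",
    "HERBARIUMSPECIMEN",
    "HERBARIUMSHEET",
    "MUSEUMSPECIMEN",
    "VOUCHER",
    "PRESERVED",
}
# Matches counted forms such as "1 SPECIMEN" or "12 SPECIMENS" after squeezing.
COUNTED_FORM = re.compile(r"^\d+SPECIMENS?$")


def interpret_basis(value):
    """Return "PreservedSpecimen" for a recognised spelling, otherwise None."""
    key = re.sub(r"[\s_]+", "", value).upper()
    if key in KNOWN_SPELLINGS or COUNTED_FORM.match(key):
        return "PreservedSpecimen"
    return None


print(interpret_basis("Preserved Specimen"))
print(interpret_basis("12 SPECIMENS"))
print(interpret_basis("HumanObservation"))
```

A fixed lookup keeps the mapping easy to audit; anything outside the set is returned as None rather than guessed.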

I hope this helps,

John

Published basisOfRecord  No. of Occurrences
PRESERVEDSPECIMEN        170205764
SPECIMEN                 5407188
S                        1507611
PRESERVED SPECIMEN       1494656
ACCESSION                976457
SPECIMEN(SP)             966557
VOUCHER                  535598
HERBARIUM SPECIMEN       111531
PRESERVED                109332
MUSEUM SPECIMEN          95125
ESPECIMEN                48387
SP                       32374
ESPÈCE                   13162
HERBARIUM SHEET          8312
PRESERVED_SPECIMEN       6114
PRESERVADO               5317
1 SPECIMEN               4821
PRESERVEDRECORD          3945
PRESERVED IN ALCOHOL     3298
2 SPECIMENS              1168
3 SPECIMENS              610
SHEET                    370
4 SPECIMENS              367
PAPERED SPECIMEN         329
5 SPECIMENS              326
LABEL 2                  262
LABEL 1                  260
ADULT, PAPERED           230
VOUCHER REARED           218
6 SPECIMENS              210
DEAD SEED SAMPLE         199
SPÉCIMEN                 174
ADULT, MOUNTED           166
LABEL 3                  160
7 SPECIMENS              152
8 SPECIMENS              139
10 SPECIMENS             133
9 SPECIMENS              119
LABEL                    110
11 SPECIMENS             70
MOUNTED SPECIMEN         68
12 SPECIMENS             66
LABEL 4                  56
HERBARIUMSPECIMEN        56
14 SPECIMENS             54
13 SPECIMENS             52
15 SPECIMENS             50
PS                       37
18 SPECIMENS             36
17 SPECIMENS             36
16 SPECIMENS             35
19 SPECIMENS             30
20 SPECIMENS             29
OTHERSPECIMEN            29
22 SPECIMENS             25
21 SPECIMENS             23
25 SPECIMENS             22
30 SPECIMENS             21
23 SPECIMENS             19
26 SPECIMENS             17
LABEL 5                  14
DRY SAMPLE               14
27 SPECIMENS             13
40 SPECIMENS             13
24 SPECIMENS             13
36 SPECIMENS             11
29 SPECIMENS             11
35 SPECIMENS             9
28 SPECIMENS             8
34 SPECIMENS             8
32 SPECIMENS             8
50 SPECIMENS             8
45 SPECIMENS             7
33 SPECIMENS             7
31 SPECIMENS             7
59 SPECIMENS             7
44 SPECIMENS             6
42 SPECIMENS             6
43 SPECIMENS             6
PRESERVEDSPECIMEN        6
54 SPECIMENS             6
37 SPECIMENS             5
52 SPECIMENS             5
46 SPECIMENS             5
67 SPECIMENS             5
100 SPECIMENS            4
48 SPECIMENS             4
61 SPECIMENS             4
69 SPECIMENS             3
56 SPECIMENS             3
39 SPECIMENS             3
74 SPECIMENS             3
81 SPECIMENS             3
41 SPECIMENS             3
EMERGED SPECIMEN         3
38 SPECIMENS             3
65 SPECIMENS             2
83 SPECIMENS             2
DRIED SPECIMEN           2
85 SPECIMENS             2

On Fri, Jul 23, 2021 at 8:14 AM Mary Barkworth <Mary.Barkworth at usu.edu> wrote:
The email I just posted was intended for John Wieczorek, but I sent the wrong draft. Since it has gone to Taxacom, let me ask 1) what values are being used in the basisOfRecord field and 2) are the values based on a different set of standards such as those in Audubon Core or Dublin Core?

Mary

-----Original Message-----
From: Taxacom <taxacom-bounces at mailman.nhm.ku.edu> On Behalf Of Mary Barkworth via Taxacom
Sent: Friday, July 23, 2021 5:03 AM
To: (Taxacom at mailman.nhm.ku.edu) <taxacom at mailman.nhm.ku.edu>
Subject: [EXT] [Taxacom] Basis of Record phrases

I am curious as to the terms being used for this field in collections. There are a few examples given in the TDWG reference guide. Long ago, I was told that herbarium specimens should be listed as "preserved specimens." In the interests of standardization, I did so, but I really wish a more useful list were given. Examples I would like to see are "still image", "video", "audio", "publication", "herbariumspecimen", "liquidpreserved." Are the examples ever reviewed?

I am going to ask GBIF about values in submitted records, but I am curious about review of examples in the quick reference guide.

Mary
_______________________________________________
Taxacom Mailing List

Send Taxacom mailing list submissions to: taxacom at mailman.nhm.ku.edu
For list information; to subscribe or unsubscribe, visit: http://mailman.nhm.ku.edu/cgi-bin/mailman/listinfo/taxacom
You can reach the person managing the list at: taxacom-owner at mailman.nhm.ku.edu
The Taxacom email archive back to 1992 can be searched at: http://taxacom.markmail.org

Nurturing nuance while assailing ambiguity for about 34 years, 1987-2021.