[Taxacom] A question for GBIF regarding data harvests from iNaturalist (Alec McClay)

Mark Egger m.egger at comcast.net
Fri Dec 24 14:10:35 CST 2021


As a “curator” for a few genera of flowering plants on iNaturalist, I would urge great caution in assessing the reliability of “data harvests” from the site, even (perhaps especially) for “Research Grade” (RG) observations. In the genera I monitor, I very frequently come across RG identifications that are clearly incorrect. This happens because one inexperienced user can “confirm” a machine-generated suggestion, and that faulty identification can then be confirmed by an equally inexperienced observer, often a friend of the original poster. That’s all that is needed to make an observation RG! I’ve seen cases where an obviously incorrect identification has been confirmed by as many as 4 or 5 users. Of course, herbarium specimens are subject to incorrect identifications as well and still end up in public databases, but the sheer number of observations coming in daily to iNaturalist makes it particularly prone to these sorts of errors, especially when a certain percentage of the posters don’t seem able to distinguish a red fall leaf from a red-colored inflorescence. While curators find and correct many such cases, many more undoubtedly escape timely detection.

That being said, iNaturalist is a wonderful tool for collecting observations from understudied regions, and real discoveries can be made there. For instance, I’ve seen posts of what are clearly undescribed or little-known species, especially from Mexico. But the fact remains that bulk data imports from iNat should be analyzed carefully before they are used, especially for distributional studies. It is also vital to understand that the RG label does not at all mean that the identification was made by experts in the given group. Sometimes that is the case, but in many others it definitely is not.
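
As a rough illustration of what such a careful analysis might involve, here is a minimal Python sketch that sets aside RG records not confirmed by anyone on a user-supplied list of trusted identifiers. The field names (quality_grade, agreeing_identifiers), the sample records, and the trusted_experts list are all hypothetical and do not reflect the actual iNaturalist or GBIF export format.

```python
# Hypothetical sketch: flag Research Grade (RG) records that no trusted
# expert has confirmed. Field names and the expert list are illustrative
# assumptions, not the real iNaturalist or GBIF schema.

def needs_review(record, trusted_experts):
    """Return True if an RG record has no agreeing identifier from the trusted list."""
    if record["quality_grade"] != "research":
        return False  # only RG records are in scope for this check
    agreeing = set(record["agreeing_identifiers"])
    return agreeing.isdisjoint(trusted_experts)


def split_for_review(records, trusted_experts):
    """Partition records into (unflagged, flagged) lists, preserving input order."""
    unflagged, flagged = [], []
    for record in records:
        target = flagged if needs_review(record, trusted_experts) else unflagged
        target.append(record)
    return unflagged, flagged


# Made-up sample data for illustration only.
records = [
    {"id": 1, "quality_grade": "research", "agreeing_identifiers": ["newuser1", "newuser2"]},
    {"id": 2, "quality_grade": "research", "agreeing_identifiers": ["newuser3", "expert_a"]},
]
unflagged, flagged = split_for_review(records, trusted_experts={"expert_a"})
print([r["id"] for r in flagged])
```

A check like this only narrows down which records deserve a second look; it cannot say whether a flagged identification is actually wrong.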

Mark

> On Dec 24, 2021, at 10:00 AM, taxacom-request at mailman.nhm.ku.edu wrote:
> 
> Date: Fri, 24 Dec 2021 12:17:47 -0500
> From: Alec McClay <alec.mcclay at shaw.ca>
> To: taxacom at mailman.nhm.ku.edu
> Subject: Re: [Taxacom] A question for GBIF regarding data harvests from iNaturalist
> Message-ID: <eccd6f76-0c80-8ea0-8ce2-6a09189a75b3 at shaw.ca>
> Content-Type: text/plain; charset=UTF-8; format=flowed
> 
> According to this discussion on the iNaturalist forum 
> https://forum.inaturalist.org/t/observations-of-cultivated-plants-on-gbif/5296/4 
> (from someone who works for GBIF) "For the record, the good iNat folks 
> control what goes in the dataset that we ingest. If they 
> add/update/delete a record, we do the same when ingesting".
> 
> Merry Christmas and Happy New Year to all.
> 
> Alec.
> 
> On 2021-12-23 1:00 p.m., taxacom-request at mailman.nhm.ku.edu wrote:
>> Date: Wed, 22 Dec 2021 20:35:02 +0000 (UTC)
>> From: Stephen Thorpe <stephen_thorpe at yahoo.co.nz>
>> To: "jmiller at gbif.org" <jmiller at gbif.org>, Taxacom <taxacom at mailman.nhm.ku.edu>
>> Subject: [Taxacom] A question for GBIF regarding data harvests from iNaturalist
>> Message-ID: <1223057220.1448512.1640205302753 at mail.yahoo.com>
>> Content-Type: text/plain; charset=UTF-8
>> 
>> Hi Joe,
>> As you know, GBIF periodically harvests Research Grade (RG) observations from iNaturalist. What isn't quite clear, and what I think would be well worth clarifying if you could, is what happens to observations that drop back out of Research Grade. Do they drop out of GBIF at the next harvest? This matters because there are two types of cases, with very different consequences for each: (1) observations of well-known species; and (2) observations reliant on expert IDs.
>> For type (1) observations, it can reasonably be assumed that dropping back out of RG will rarely happen, and if it does happen for inadequate reasons, the community ID will be restored fairly quickly, since it involves a well-known species that many iNat users are familiar with.
>> For type (2) observations, however, IDs may be based on just a couple of experts. An RG observation of this kind can be dropped out of RG by any iNat user who chooses to disagree for whatever reason, be it scientific, personal, or otherwise. The lack of further experts means that RG status is unlikely to be restored easily!
>> So, my question is: for type (2) observations that were RG long enough to have been harvested by GBIF, if they subsequently drop out of RG on iNat, do they drop out of GBIF at the next data harvest? If so, then data already in GBIF, harvested from iNat, is vulnerable to the whims of single users on iNat, which, to my mind at least, is a concern!
>> Cheers, Stephen
> 
> -- 
> Alec McClay


