[Taxacom] A question for GBIF regarding data harvests from iNaturalist (Alec McClay)
Stephen Thorpe
stephen_thorpe at yahoo.co.nz
Sun Dec 26 16:21:27 CST 2021
Oops! Replied to wrong email!
On Monday, 27 December 2021, 11:19:13 am NZDT, Stephen Thorpe via Taxacom <taxacom at mailman.nhm.ku.edu> wrote:
It is ok to be wrong about facts, as long as you are prepared to correct your mistakes. It is not ok to have the wrong attitude, one of undermining other people's work, as Hegg has done.
On Sunday, 26 December 2021, 12:18:17 pm NZDT, John Grehan <calabar.john at gmail.com> wrote:
Sometimes there is just [no] help for a situation, especially one involving an independent organization with 'community' input. Perhaps you should start your own website. I did that for my group so I would have total control. But my former institution ('science' museum) would not keep it going, and I could not afford to maintain the website after retirement. But there are other cost-effective or cost-free options for the web-savvy individual.
Cheers, John
On Sat, Dec 25, 2021 at 4:24 PM Stephen Thorpe <stephen_thorpe at yahoo.co.nz> wrote:
John,
Presumably it is all about finding an appropriate trade-off between data quality and amount of data. If you only admit the absolutely most reliable (however that can be defined!) data, then you are only going to have a tiny bit of data to work with. Is it better to have 1% of available data, but with 99% reliability, or is it better to have 50% of the available data at 80% reliability? The actual numbers are not relevant here, I am just trying to make a general point. In a way, this point was Danilo Hegg's big mistake: he rolled back "my identifications" (MPI's identifications) of Balta bicolor, because they are not 100% reliable/certain. He was imposing an unrealistically high standard of certainty. Our official government biosecurity authority (MPI) is not infallible, but has a high enough level of reliability for iNat and GBIF purposes. It isn't quite that simple though, as Hegg seemed convinced ("100% certain") that the roaches belong to genus Ellipsidion, so I'm not quite sure why his judgement is seen by him as certain while the judgement of MPI and myself isn't reliable enough for him!
Cheers, Stephen
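To make the trade-off in the message above concrete, here is a minimal sketch of the arithmetic. The pool of 10,000 available records is a made-up assumption, and the percentages are the illustrative ones from the message, not measurements of iNat or GBIF data.

```python
def expected_counts(total_available: int, kept_pct: int, reliable_pct: int) -> tuple[int, int]:
    """Return (expected correct, expected incorrect) records in the kept subset.

    Percentages are whole numbers so the arithmetic stays exact.
    """
    kept = total_available * kept_pct // 100
    correct = kept * reliable_pct // 100
    return correct, kept - correct


TOTAL = 10_000  # hypothetical number of available records

strict = expected_counts(TOTAL, kept_pct=1, reliable_pct=99)
lenient = expected_counts(TOTAL, kept_pct=50, reliable_pct=80)

print("strict filter :", strict)   # (99, 1)
print("lenient filter:", lenient)  # (4000, 1000)
```

Under these assumed numbers the lenient filter yields about forty times as many correct records, at the price of a thousand incorrect ones. Which is preferable depends on the use the data is put to.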
On Sunday, 26 December 2021, 09:33:54 am NZDT, John Grehan <calabar.john at gmail.com> wrote:
All this points to what everyone recognizes - that broad databases are only as good as the data, and some more than others. But in the end one has to make their informed judgment about the data. I have used GBIF at times to get a quick impression of a distribution, but try not to rely on it. So far, my best experiences are with websites on particular groups (e.g. reptiles) where an effort is made to provide distribution maps based on taxonomic works. But nothing is foolproof of course.
John Grehan
On Sat, Dec 25, 2021 at 2:52 PM Kerry Ford via Taxacom <taxacom at mailman.nhm.ku.edu> wrote:
Harvesting all would be scary, just thinking of the grasses and sedges here and my NZ experience of iNat.
Kerry
________________________________
From: Taxacom <taxacom-bounces at mailman.nhm.ku.edu> on behalf of Stephen Thorpe via Taxacom <taxacom at mailman.nhm.ku.edu>
Sent: Saturday, December 25, 2021 11:37:50 AM
To: taxacom at mailman.nhm.ku.edu <taxacom at mailman.nhm.ku.edu>; Richard Pyle <deepreef at bishopmuseum.org>
Cc: INaturalist Support <help at inaturalist.org>
Subject: Re: [Taxacom] A question for GBIF regarding data harvests from iNaturalist (Alec McClay)
Hi Rich,
The issues you raise do have some relevance, but it is a bit more complicated. I could write volumes about all this, but, if I could only make one short point in this connection, it would be this: I have been repeatedly told over the years by iNat California (specifically Tony Iwane and/or Don Loarie) that everyone in the world must use a single name and classification for every taxon (even if all N.Z. botanists, for example, disagree with overseas botanists about NZ taxa!). While this is not directly relevant to multiple identifications, it isn't hard to see how the former might influence the latter! GBIF would actually be better off harvesting all iNat observations, not just "RG" ones. That would certainly mitigate the damage in cases like Balta bicolor.
Cheers, Stephen
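The difference between harvesting only "RG" observations and harvesting everything can be sketched as follows. This is an illustration only: it assumes the harvest is a full mirror of whatever the source selects (one reading of the "add/update/delete" remark quoted further down the thread), and the observation IDs, the `quality_grade` field and its values are hypothetical stand-ins, not a description of the actual GBIF or iNat pipeline.

```python
def harvest(source: dict[str, dict], research_grade_only: bool = True) -> dict[str, dict]:
    """Build a fresh mirror of the source; records not selected simply disappear."""
    return {
        obs_id: dict(obs)  # copy, so earlier harvests are not changed by later edits
        for obs_id, obs in source.items()
        if obs["quality_grade"] == "research" or not research_grade_only
    }


# Hypothetical source records.
inat = {
    "obs-1": {"taxon": "Balta bicolor", "quality_grade": "research"},
    "obs-2": {"taxon": "Balta bicolor", "quality_grade": "research"},
}

first = harvest(inat)

# A single disagreeing identification drops obs-2 out of Research Grade.
inat["obs-2"]["quality_grade"] = "needs_id"

second = harvest(inat)                                  # obs-2 is gone
everything = harvest(inat, research_grade_only=False)   # obs-2 kept, grade visible

print(sorted(first), sorted(second), sorted(everything))
```

In the harvest-everything variant the record stays discoverable and the quality grade travels with it, so downstream users can filter for themselves.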
On Saturday, 25 December 2021, 11:19:30 am NZDT, Richard Pyle via Taxacom <taxacom at mailman.nhm.ku.edu> wrote:
It sounds like a big part of this kerfuffle (would that be the right word to describe it?) involves a current limitation of GBIF (and iNat?) in that only *one* identification per Occurrence instance can be presented/stored/represented. GBIF harvests content through the DarwinCore Archive (DwCA) template, which is essentially a flattened version of the DarwinCore standard. The good news is that the full standard accommodates an unlimited number of taxonomic identifications associated with each Occurrence instance, and the better news is that the good folks at GBIF and TDWG (specifically, the guy who basically invented DwCA and the guy who is widely regarded as the ultimate guru of DarwinCore) are developing a new protocol for how GBIF ingests content, which will be much more effective at leveraging the non-flat capabilities of DarwinCore (watch this space early next year). So, in the not-too-distant future, it should be possible to accommodate multiple alternative identifications for the same Occurrence instance.
In that paradigm, each Occurrence of the relevant roach in NZ could have one identification from Stephen asserting it to be Balta bicolor, and another from Danilo asserting it to be [whatever he reckons it is]. And it would be discoverable through searches of either name.
"But...", you might ask, "...how does the 'correct' identification get flagged?" To which I would answer: "There is no such thing."
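A minimal sketch of that paradigm, for illustration: one Occurrence carries any number of identifications, none is flagged as "correct", and a search on either name finds it. The class and field names are hypothetical and only loosely echo Darwin Core terms; this is not the DwCA format or the new GBIF protocol. The second name is taken from what is reported elsewhere in this thread.

```python
from dataclasses import dataclass, field


@dataclass
class Identification:
    scientific_name: str
    identified_by: str


@dataclass
class Occurrence:
    occurrence_id: str
    # Any number of alternative identifications; no "correct" flag.
    identifications: list[Identification] = field(default_factory=list)


def find_by_name(occurrences: list[Occurrence], name: str) -> list[str]:
    """Return IDs of occurrences with at least one identification matching the name."""
    return [
        occ.occurrence_id
        for occ in occurrences
        if any(ident.scientific_name == name for ident in occ.identifications)
    ]


roach = Occurrence(
    "occ-001",  # hypothetical identifier
    [
        Identification("Balta bicolor", "Stephen"),
        Identification("Ellipsidion", "Danilo"),
    ],
)

print(find_by_name([roach], "Balta bicolor"))  # ['occ-001']
print(find_by_name([roach], "Ellipsidion"))    # ['occ-001']
```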
Aloha,
Rich
Richard L. Pyle, PhD
Senior Curator of Ichthyology | Director of XCoRE
Bernice Pauahi Bishop Museum
1525 Bernice Street, Honolulu, HI 96817-2704
Office: (808) 848-4115; Fax: (808) 847-8252
eMail: deepreef at bishopmuseum.org
BishopMuseum.org
Our Mission: Bishop Museum inspires our community and visitors through the exploration and celebration of the extraordinary history, culture, and environment of Hawaiʻi and the Pacific.
> -----Original Message-----
> From: Taxacom <taxacom-bounces at mailman.nhm.ku.edu> On Behalf Of
> Stephen Thorpe via Taxacom
> Sent: Friday, December 24, 2021 10:27 AM
> To: taxacom at mailman.nhm.ku.edu; Mark Egger <m.egger at comcast.net>
> Subject: Re: [Taxacom] A question for GBIF regarding data harvests from
> iNaturalist (Alec McClay)
>
> Hi Mark,
> You are correct, but please be aware that in cases of misidentified RG
> observations, they will drop back out of GBIF if the misidentification is
> subsequently corrected on iNat. This can be seen as an improvement in data
> quality over time. What I am talking about, however, is the reverse. GBIF did
> have (maybe still does have, for the moment) a solid portfolio of RG Balta
> bicolor observations, sourced from iNat, and due largely to my work. Danilo
> Hegg, however, unilaterally decided that both I and our official government
> biosecurity authority (MPI) were wrong, and so rolled back all the IDs, without
> any prior discussion. He was not even aware of the relevant facts when he did
> this! He interpreted those facts, when I informed him of them, in a highly
> biased way, having already made up his mind on the matter. Nobody else on
> iNat has the relevant knowledge of the taxa concerned, to make a meaningful
> judgement. It really comes down to whether or not I was justified in following
> the ID by MPI, and I suggest this is a perfectly reasonable thing to do, in the
> absence of any convincing evidence to the contrary (and merely looking
> superficially like a roach in the genus Ellipsidion is not convincing evidence!) So,
> here we have a problem in which good, solid data, already on GBIF, can be
> removed by the actions of a single stubborn and misguided iNat user. This does
> not lead to an improvement in data quality over time, quite the
> reverse!
> Cheers, Stephen
> On Saturday, 25 December 2021, 09:11:10 am NZDT, Mark Egger via
> Taxacom <taxacom at mailman.nhm.ku.edu> wrote:
>
> As a “curator” for a few genera of flowering plants of iNaturalist, I would urge
> great caution in assessing the reliability of “data harvests” from the site,
> even/especially for “Research Grade” observations. In the genera I monitor, I
> very frequently come across RG identifications that are clearly incorrect. This
> stems from the fact that one inexperienced user can “confirm” a machine-
> generated suggestion, and then this faulty identification can be confirmed by
> an equally inexperienced observer, often a friend of the original poster. That’s
> all that is needed to make an observation RG! I’ve seen cases where an
> obviously incorrect identification has been confirmed by as many as 4 or 5
> users. Of course, herbarium specimens are subject to incorrect identifications
> as well and still end up in public databases, but the sheer number of
> observations coming in daily to iNaturalist makes it particularly subject to
> these sorts of errors, especially when a certain percentage of the posters don’t
> seem able to distinguish a red fall leaf from a red-colored inflorescence. While
> curators find and correct many such cases, many more undoubtedly escape
> timely detection.
>
> That being said, iNaturalist is a wonderful tool for collecting observations from
> regions that are understudied, and real discoveries can be made there. For
> instance, I’ve seen posts of what are clearly undescribed or little-known
> species, especially from Mexico. But the fact remains that bulk data imports
> from iNat should be analyzed carefully prior to using them, especially for
> distributional studies. And it is also vital to understand that the label of RG
> does not at all mean that the identification has been made by experts in the
> given group. Sometimes this is the case, but in many others it is definitely not.
>
> Mark
>
> > On Dec 24, 2021, at 10:00 AM, taxacom-request at mailman.nhm.ku.edu
> wrote:
> >
> > Daily News from the Taxacom Mailing List
> >
> > When responding to a message, please do not copy the entire digest into
> your reply.
> > ____________________________________
> >
> >
> > Today's Topics:
> >
> > 1. Re: A question for GBIF regarding data harvests from
> > iNaturalist (Alec McClay)
> >
> >
> > ----------------------------------------------------------------------
> >
> > Message: 1
> > Date: Fri, 24 Dec 2021 12:17:47 -0500
> > From: Alec McClay <alec.mcclay at shaw.ca>
> > To: taxacom at mailman.nhm.ku.edu
> > Subject: Re: [Taxacom] A question for GBIF regarding data harvests
> > from iNaturalist
> > Message-ID: <eccd6f76-0c80-8ea0-8ce2-6a09189a75b3 at shaw.ca>
> > Content-Type: text/plain; charset=UTF-8; format=flowed
> >
> > According to this discussion on the iNaturalist forum
> > https://forum.inaturalist.org/t/observations-of-cultivated-plants-on-gbif/5296/4
> > (from someone who works for GBIF) "For the record, the good
> > iNat folks control what goes in the dataset that we ingest. If they
> > add/update/delete a record, we do the same when ingesting".
> >
> > Merry Christmas and Happy New Year to all.
> >
> > Alec.
> >
> > On 2021-12-23 1:00 p.m., taxacom-request at mailman.nhm.ku.edu wrote:
> >> Date: Wed, 22 Dec 2021 20:35:02 +0000 (UTC)
> >> From: Stephen Thorpe <stephen_thorpe at yahoo.co.nz>
> >> To: "jmiller at gbif.org" <jmiller at gbif.org>, Taxacom
> >> <taxacom at mailman.nhm.ku.edu>
> >> Subject: [Taxacom] A question for GBIF regarding data harvests from
> >> iNaturalist
> >> Message-ID: <1223057220.1448512.1640205302753 at mail.yahoo.com>
> >> Content-Type: text/plain; charset=UTF-8
> >>
> >> Hi Joe,
> >> As you know, GBIF periodically harvests Research Grade observations from
> iNaturalist. What isn't quite clear, but which I think would be well worth
> clarifying, if you could, please, is what happens to observations which drop
> back out of Research Grade? Do they drop out of GBIF at the next harvest? This
> is important for the reason that there are two types of cases, and the
> consequences are very different for observations of each type: (1) observations
> of well-known species; and (2) observations reliant on expert IDs.
> >> For type (1) observations, it can be reasonably assumed that dropping back
> out of RG will rarely happen, and if it does happen for inadequate reasons,
> then the community ID will be restored fairly quickly, since it involves a well-
> known species that many iNat users are familiar with.
> >> For type (2) observations, however, IDs may be based on just a couple of
> experts. An RG observation of this kind can be dropped out of RG by any iNat
> user, who chooses to disagree for whatever reason, be it scientific or personal
> or whatever. The lack of further experts means that RG is likely not to be able
> to be restored very easily!
> >> So, my question is, for type (2) observations that were RG long enough to
> have been harvested by GBIF, if they subsequently drop out of RG on iNat, do
> they drop out of GBIF at the next data harvest? If so, then data already in GBIF,
> harvested from iNat, is vulnerable to the whims of single users on iNat, which,
> to my mind at least, is a concern!
> >> Cheers, Stephen
> >
> > --
> > Alec McClay
> > 12 Roseglen Private
> > Ottawa, ON K1H 1B6
> > Canada
> > 613-739-8499 (home)
> > 343-988-4077 (mobile)
> >
> >
> >
> > ------------------------------
> >
> > Subject: Digest Footer
> >
> > Taxacom Mailing List
> >
> > Send Taxacom mailing list submissions to taxacom at mailman.nhm.ku.edu
> > For list information; to subscribe or unsubscribe, visit:
> > http://mailman.nhm.ku.edu/cgi-bin/mailman/listinfo/taxacom
> > You can reach the person managing the list at:
> > taxacom-owner at mailman.nhm.ku.edu The Taxacom email archive back to
> > 1992 can be searched at: http://taxacom.markmail.org/
> >
> > Nurturing nuance while assailing ambiguity for about 34 years, 1987-2021.
> >
> >
> > ------------------------------
> >
> > End of Taxacom Digest, Vol 188, Issue 18
> > ****************************************
>
_______________________________________________
Taxacom Mailing List
Send Taxacom mailing list submissions to: taxacom at mailman.nhm.ku.edu
For list information; to subscribe or unsubscribe, visit: http://mailman.nhm.ku.edu/cgi-bin/mailman/listinfo/taxacom
You can reach the person managing the list at: taxacom-owner at mailman.nhm.ku.edu
The Taxacom email archive back to 1992 can be searched at: http://taxacom.markmail.org
Nurturing nuance while assailing ambiguity for about 34 years, 1987-2021.