[Taxacom] Propagation of bad sameAs statements
joel sachs
jsachs at csee.umbc.edu
Wed Sep 8 09:42:45 CDT 2010
I'd like to catalog sources of biodiversity information and misinformation
on the semantic web, and am trying to determine the genesis of some
unfortunate owl:sameAs statements.
According to sameas.org:
<http://dbpedia.org/resource/Invasive_species>
<owl:sameAs>
<http://dbpedia.org/resource/Invasive_plant>
<http://dbpedia.org/resource/Invasive_animal>
<http://dbpedia.org/resource/Invasive_organism>
<http://rdf.freebase.com/ns/guid.9202a8c04000641f800000000007de24>
(many other concepts)
Checking out the dbpedia resources that are the objects of the sameAs
assertions, we see that each redirects to
http://dbpedia.org/resource/Invasive_species. But apart from
dbpedia:Invasive_species, which includes a sameAs link to
freebase:Invasive_species, no dbpedia page, afaict, makes the sameAs
assertions listed above.
However, http://rdf.freebase.com/rdf/guid.9202a8c04000641f800000000007de24
does assert:
<http://rdf.freebase.com/ns/guid.9202a8c04000641f800000000007de24>
<owl:sameAs>
<http://dbpedia.org/resource/Invasive_species>
<http://dbpedia.org/resource/Invasive_plant>
<http://dbpedia.org/resource/Invasive_organism>
<http://dbpedia.org/resource/Invasive_animal>
etc.
The direction of propagation is not explicit. One possibility is that
sameas.org is inferring that "A sameAs B" based on "A redirects to B", and
that these assertions are making their way into freebase. Another is that
a freebase contributor is making the sameAs inferences, and that they are
being picked up by sameas.org. (Similar cycles of sameAs can be found for
"habitat", "introduced_species", and many other concepts.)
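To make the first possibility concrete, here is a minimal sketch of how a "sameAs from redirect" inference could be detected. It is not the method of sameas.org or freebase, which is unknown. The redirect table holds only the three redirects mentioned above, and the names resolve and redirect_explains are hypothetical.

```python
DBP = "http://dbpedia.org/resource/"
FREEBASE_GUID = "http://rdf.freebase.com/ns/guid.9202a8c04000641f800000000007de24"

# Redirects reported in this message: each source URI redirects to
# dbpedia:Invasive_species. (Hand-entered; nothing is fetched from the web.)
REDIRECTS = {
    DBP + "Invasive_plant": DBP + "Invasive_species",
    DBP + "Invasive_animal": DBP + "Invasive_species",
    DBP + "Invasive_organism": DBP + "Invasive_species",
}


def resolve(uri, redirects):
    """Follow redirects until reaching a URI with no further redirect."""
    seen = set()
    while uri in redirects and uri not in seen:  # 'seen' guards against loops
        seen.add(uri)
        uri = redirects[uri]
    return uri


def redirect_explains(subject, obj, redirects):
    """True if subject and obj end at the same URI once redirects are followed."""
    return resolve(subject, redirects) == resolve(obj, redirects)


if __name__ == "__main__":
    species = DBP + "Invasive_species"
    for obj in (DBP + "Invasive_plant", FREEBASE_GUID):
        print(obj, redirect_explains(species, obj, REDIRECTS))
```

A sameAs pair for which redirect_explains is true is consistent with having been derived from redirects, though this alone does not show which site derived it.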
So, a request for the sameas.org folks: Would it be possible to include a
provenance column for all sameAs assertions you keep track of? In cases
where the sameAs assertion isn't actually asserted on the web, you could
indicate the provenance as "inferred" in the provenance column. Also, have
you published the heuristics you use (if any) to infer sameAs relations?
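As an illustration of what such a provenance column might look like, here is a minimal sketch. The record layout and the names SameAsRecord and label_provenance are hypothetical and are not sameas.org's actual schema. The only asserting document used is the freebase one quoted above.

```python
from dataclasses import dataclass

DBP = "http://dbpedia.org/resource/"
FREEBASE_GUID = "http://rdf.freebase.com/ns/guid.9202a8c04000641f800000000007de24"
FREEBASE_DOC = "http://rdf.freebase.com/rdf/guid.9202a8c04000641f800000000007de24"


@dataclass(frozen=True)
class SameAsRecord:
    subject: str
    obj: str
    provenance: str  # URL of a document asserting the link, or "inferred"


def label_provenance(pairs, asserted_in):
    """Attach provenance to each (subject, obj) pair.

    asserted_in maps a pair to the URL of a document known to state it;
    pairs missing from that mapping are labelled "inferred".
    """
    return [SameAsRecord(s, o, asserted_in.get((s, o), "inferred")) for s, o in pairs]


if __name__ == "__main__":
    # The freebase document asserts this link explicitly.
    asserted = {(FREEBASE_GUID, DBP + "Invasive_species"): FREEBASE_DOC}
    pairs = [
        (FREEBASE_GUID, DBP + "Invasive_species"),
        # No page found so far asserts this one, so it is labelled "inferred".
        (DBP + "Invasive_species", DBP + "Invasive_plant"),
    ]
    for record in label_provenance(pairs, asserted):
        print(record.subject, "sameAs", record.obj, "|", record.provenance)
```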
And questions for freebase contributors: Are any of you running a script
that either a) loads in assertions from sameas.org, or b) deduces sameAs
relations from dbpedia redirection behaviour?
Thanks!
Joel.