Extend the range of schema:sameAs #1065
Could you give a complete example? Currently schema.org's sameAs points to (possibly many) document / page URLs, for use in disambiguation. Unlike owl:sameAs, it doesn't mean "numerically identical", i.e. one-and-the-same-thing-as. It sounds like you want to use owl:sameAs, i.e. you have two entity descriptions in your graph and you want to say that there is just one underlying real-world entity. In both RDFa 1.1 (because of the RDFa initial context) and JSON-LD (because schema.org's context file does it), the 'owl:' prefix is pre-declared. So one option would be to use owl:sameAs directly. Another would be to remodel the situation using named relations like 'basedOn', 'translationOf'. I think that's roughly what the BibExtend group settled on for FRBR-like distinctions (works vs expressions, manifestations, items; /cc @RichardWallis).
If we encouraged the use of schema:sameAs to point to entities (people, documents etc.) directly and with OWL-like semantics, it could cause problems.
For example,
- a Person with name "Ennio Morricone"
- schema:sameAs https://twitter.com/menniomorricone
- schema:sameAs https://musicbrainz.org/artist/a16e47f5-aa54-47fe-87e4-bb8af91a9fdd
- schema:sameAs http://www.enniomorricone.it/
We wouldn't want to conclude that https://twitter.com/menniomorricone and https://musicbrainz.org/artist/a16e47f5-aa54-47fe-87e4-bb8af91a9fdd are two URLs for the same entity. It's rather that these two distinct pages have the same primary topic, the Person himself. For now this is reasonably clear due to the restricted definition of our sameAs (although there are some subtleties around URL, Text, and Microdata vs RDF I'm glossing over for now). I'm not sure I can see a way to broaden schema:sameAs for inter-entity links without making the above example problematic, because presumably we'd want it to then be a transitive relation. Or am I missing an option here?
Some related notes, http://schema.org/docs/datamodel.html#mainEntityBackground
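To make the transitivity worry concrete, here is a minimal Python sketch of what an OWL-like identity reading would do to the Morricone example above. The `_:morricone` node identifier and the `merge_identical` helper are made up for illustration; nothing like them is defined by schema.org.

```python
from collections import defaultdict

# Hypothetical identifier for the Person node; the URLs are the ones from the example.
PERSON = "_:morricone"
SAME_AS = [
    (PERSON, "https://twitter.com/menniomorricone"),
    (PERSON, "https://musicbrainz.org/artist/a16e47f5-aa54-47fe-87e4-bb8af91a9fdd"),
    (PERSON, "http://www.enniomorricone.it/"),
]


def merge_identical(pairs):
    """Group nodes under an identity reading: symmetric and transitive closure."""
    parent = {}

    def find(node):
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving
            node = parent[node]
        return node

    for left, right in pairs:
        parent[find(left)] = find(right)  # union the two groups

    groups = defaultdict(set)
    for node in list(parent):
        groups[find(node)].add(node)
    return list(groups.values())


clusters = merge_identical(SAME_AS)
```

Under that reading `clusters` holds a single group containing the Person node and all three pages, so the Twitter profile and the MusicBrainz page end up declared identical to each other, which is the conclusion we want to avoid.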
Would you have a look at how Google uses mainEntityOfPage for Accelerated Mobile Pages (AMP)? Our reading is that mainEntityOfPage is the URL (WebPage) being marked up. In contrast, we use mainEntityOfPage to point to a Wikipedia page to qualify the URL (WebPage) that is being marked up. An AMP page (should be/must be) different from the canonical WebPage, so we already have to differentiate two JSON-LD structures. But, if we understand correctly, the AMP mainEntityOfPage cannot point to a qualifier, like Wikipedia, but must point to itself. Is this your understanding too?
> We wouldn't want to conclude that https://twitter.com/menniomorricone and https://musicbrainz.org/artist/a16e47f5-aa54-47fe-87e4-bb8af91a9fdd are two URLs for the same entity.
When I use sameAs in examples at work, people do read it as its plain-English equivalent "same as". Granted, its definition contradicts that intuition, so I wonder why sameAs was chosen as a property name if the intention was not to say that two things are the same?
For example, the person named Morricone is not the same as https://musicbrainz.org/artist/a16e47f5-aa54-47fe-87e4-bb8af91a9fdd. You said it yourself: it's the primary topic of, or the profile page of, etc.
> It sounds like you want to use owl:sameAs
Yes owl:sameAs could work. But it's not like I didn't consider it.
On the one hand, I work with people who don't fully understand how RDF works. For them, schema.org is Structured Data. So if something is coming from another ontology, the typical reaction is "mehhhh", just like with http://purl.org/goodrelations/v1#Monday (hence #921).
On the other hand, as somebody with some RDF knowledge, I am actually avoiding using OWL concepts because there is a lot of political/technical baggage behind them (why we did that was discussed at length when working in the LDP WG). We do not need equality inference à la OWL and we don't even use an RDF store per se. We just use RDF+Schema.org to communicate data to 3rd parties (not just Google).
> Could you give a complete example?
{
  "@context": "http://schema.org",
  "@id": "urn:apple:1234",
  "@type": "WebSite",
  "url": "http://www.apple.com/airport-express/",
  "name": "airportexpress",
  "inLanguage": "en-US",
  "sameAs": [
    { "@type": "WebSite", "url": "http://www.apple.com/ca/airport-express/", "inLanguage": "en-CA" },
    { "@type": "WebSite", "url": "http://www.apple.com/airport-express/", "inLanguage": "en-US" },
    { "@type": "WebSite", "url": "http://www.apple.com/befr/airport-express/", "inLanguage": "fr-BE" },
    { "@type": "WebSite", "url": "http://www.apple.com/benl/airport-express/", "inLanguage": "nl-BE" },
    { "@type": "WebSite", "url": "http://www.apple.com/ca/fr/airport-express/", "inLanguage": "fr-CA" },
    { "@type": "WebSite", "url": "http://www.apple.com/fr/airport-express", "inLanguage": "fr-FR" },
    { "@type": "WebSite", "url": "http://www.apple.com/jp/airmac-express/", "inLanguage": "ja-JP" }
  ]
}

We have a similar story for products.
> Or am I missing an option here?
I see at least the following options: (not saying they are equal nor exclusive)
- the use-case is rejected
- schema:sameAs's definition is extended
- a new property is created to express that 2 entities are the same
- we recommend to use owl:sameAs for that use-case, e.g. could be added in the description of schema:sameAs
- we define a new property just to mimic hreflang
- we define a new property to say that 2 products are the same in different languages
All options are open, this is a standardization group after all :-)
Just a side note: the original use case I had could be partially resolved by what is discussed here. Another part, which would need a separate issue, is how to identify a term, i.e. a linguistic concept. See the example in the mail that started the thread:
https://lists.w3.org/Archives/Public/public-schemaorg/2016Mar/0047.html
Extending schema:sameAs would help to relate terms, but not with the identification of the terms themselves. Again, this is just a side note; I may open a separate issue on this later.
The example from @betehess is spot-on.
What the vocabulary currently lacks is a mechanism for declaring that there is an equivalent URL for a Thing, rather than that a certain URI unambiguously identifies a Thing (the current implementation of sameAs).
Of the options provided by @betehess I favor any of the following:
- a new property is created to express that 2 entities are the same
- we define a new property just to mimic hreflang
- we define a new property to say that 2 products are the same in different languages
These are all probably preferable to extending the range of sameAs - both for the reasons @danbri cites, and because sameAs has increasingly been used by data consumers as a catch-all for declaring URIs associated with a Thing, whether the declaration is chiefly employed to provide a disambiguating URI or not (thinking here of Google using sameAs to identify social media accounts of a Person or Organization).
And, to reiterate, the go-to and very common use case here is declaring that there is an equivalent URL in another language. Though (and I'm glad @betehess raised Product) something like translationOf might turn out to be too limiting, as another common use case is declaring that a Thing (usually, but not only, a Product) has an equivalent URL for the same Thing, but for another region, even if in the same language.
Thanks for the added detail and full example @betehess :) People have several simultaneous incompatible notions of "same as", and we barely notice that we do, thanks to the miracle of the human brain. Computers on the other hand are pretty dumb, so we need to be a bit more picky. The clearest distinction (in a system based around uniquely identifiable entities) is between the "same entity" sense and everything else.
If the WebSite with url http://www.apple.com/airport-express/ is the same thing as the WebSite with url http://www.apple.com/ca/airport-express/ and it is also the same thing as the WebSite with url http://www.apple.com/jp/airmac-express/ then on an owl-like "same entity" reading of "sameAs" you could end up concluding that there is only one single entity which has all these URLs. And further that this entity has several inLanguage properties whose values include "en-US", "en-CA", "ja-JP". The same-entity sense of "same as" (roughly owl:sameAs) risks mushing all this together and making the data unintelligible.
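As a rough illustration of that "mushing", here is a small Python sketch that merges three of the WebSite descriptions from the earlier example as if they were one entity. The `smush` helper is hypothetical and only stands in for what an identity-based consumer might do; it is not how any particular RDF tool behaves.

```python
from collections import defaultdict

# Three of the WebSite descriptions from the example earlier in the thread.
SITES = [
    {"url": "http://www.apple.com/airport-express/", "inLanguage": "en-US"},
    {"url": "http://www.apple.com/ca/airport-express/", "inLanguage": "en-CA"},
    {"url": "http://www.apple.com/jp/airmac-express/", "inLanguage": "ja-JP"},
]


def smush(descriptions):
    """Merge descriptions declared to be one entity into a single record."""
    merged = defaultdict(set)
    for description in descriptions:
        for prop, value in description.items():
            merged[prop].add(value)
    # Every property now carries the union of the values from all descriptions.
    return {prop: sorted(values) for prop, values in merged.items()}


single_entity = smush(SITES)
```

After the merge, `single_entity` has three `url` values and three `inLanguage` values, and there is no longer any way to tell which language belongs to which URL.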
How about adapting http://schema.org/isVariantOf or http://schema.org/workExample or potentially http://schema.org/isBasedOn to this use case? Or bring http://bib.schema.org/translationOfWork into the core? This would fall under your "a new property is created to express that 2 entities are the same" or "define a new property to say that 2 products are the same in different languages" roughly. Whether we repurpose/refine or tweak an existing relation, or add a new one, I think the basic direction is that this is better done with a specific named relation rather than via general notions of 'sameness'...
For reasons stated by @Aaranged at #1065 (comment), I think http://bib.schema.org/translationOfWork may have some issues.
Ok, let's consider an example beyond natural language.
- http://www.amazon.co.uk/Fundamentals-Brain-Network-Analysis-Fornito/dp/0124079083 is a page about a thing that is both a Book and Product.
- http://www.amazon.com/Fundamentals-Brain-Network-Analysis-Fornito/dp/0124079083 is another page about the same thing.
The former page is oriented towards Amazon customers in the UK, the latter towards US customers.
Both the book and the page(s) about the book are in English (for me). I don't want to make this about the particularities of Amazon's site but the basic idea is that we have two page URLs talking about the same real world thing. But we don't want to say that the pages are themselves a single entity (i.e. they are not owl:sameAs).
> How about adapting http://schema.org/isVariantOf or http://schema.org/workExample or potentially http://schema.org/isBasedOn to this use case? Or bring http://bib.schema.org/translationOfWork into the core?
The closest to the intention would be schema:isVariantOf I guess. But I personally do not like a definition suggesting that there is a "base" entity (a WebPage in my example), because we want to avoid people thinking that the US version of the page is the base version from which all the others are derived. Believe me, it is a very important use-case here :-)
Also I would like the chosen/defined property to be at least reflexive, so that an entity can apply the property to itself. See more on that below.
> The same-entity sense of "same as" (roughly owl:sameAs) risks mushing all this together and making the data unintelligible.
Maybe "same as" is just too close to some notion of equality and I agree that we may not want that here.
Note that in the hreflang terminology, the English word being used is "equivalent", as in "You can indicate to Google that the Spanish URL is the Spanish-language equivalent of the English page in one of three ways". I believe that equivalence is the notion that we need here, as it's saying that the relation is reflexive, transitive, and symmetric, without saying that the entities are equal (equality is just a special case of equivalence).
I believe that schema:equivalent, defined as a reflexive+transitive+symmetric property, would work very well here. Thoughts?
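Here is a minimal Python sketch of that idea, assuming a consumer reads the proposed (not yet existing) `equivalent` links as reflexive, symmetric, and transitive while keeping every page as its own entity. The `PAGES` and `EQUIVALENT` data and the `equivalents_of` helper are illustrative assumptions only.

```python
# Each page stays a distinct entity with its own language.
PAGES = {
    "http://www.apple.com/airport-express/": "en-US",
    "http://www.apple.com/ca/airport-express/": "en-CA",
    "http://www.apple.com/jp/airmac-express/": "ja-JP",
}

# Hypothetical 'equivalent' statements; only two links are asserted explicitly.
EQUIVALENT = [
    ("http://www.apple.com/airport-express/", "http://www.apple.com/ca/airport-express/"),
    ("http://www.apple.com/ca/airport-express/", "http://www.apple.com/jp/airmac-express/"),
]


def equivalents_of(start, links):
    """Return every page equivalent to start under the three properties."""
    neighbours = {}
    for left, right in links:
        neighbours.setdefault(left, set()).add(right)
        neighbours.setdefault(right, set()).add(left)  # symmetric
    seen = {start}  # reflexive
    stack = [start]
    while stack:  # transitive: follow links until nothing new is found
        for nxt in neighbours.get(stack.pop(), ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


japanese = "http://www.apple.com/jp/airmac-express/"
alternates = {url: PAGES[url] for url in equivalents_of(japanese, EQUIVALENT)}
```

Starting from the Japanese page, `alternates` lists all three pages, including the page itself, and each one still maps to exactly one language, so nothing gets merged.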
@Dataliberate if it wasn't clear, I am just fine with re-using schema:sameAs :-)
As discussed in https://lists.w3.org/Archives/Public/public-schemaorg/2016Mar/thread.html#msg47, we should extend the range of schema:sameAs to schema:Thing and update its definition accordingly.
One use-case is being able to say that two things (entities) are the same but in different languages (just like hreflang).
/cc @Aaranged @fsasaki