Questions about SHACL
Q1: Can you elaborate on the difference between Constraints and Rules in SHACL?
Constraints are declarative statements that are typically used to produce validation results (constraint violations), e.g. the fact that a person has more than two parents. Constraints can also be used for other purposes such as driving user interfaces (e.g. to restrict an input area to only two values). Rules are used to derive (or “infer”) new information from data that is given as input to the SHACL rules engine. In RDF terms, rules can add new triples to the data. For example, a rule may compute the grandparents of a given person by following the parent relationship twice, or calculate the area of a rectangle.
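To make the distinction concrete, here is a minimal Python sketch rather than real SHACL. Triples are modelled as plain tuples, and the property names ex:parent and ex:grandParent as well as the sample people are hypothetical. The first function behaves like a constraint (it only reports problems), the second like a rule (it produces new triples).

```python
# Triples are modelled as plain (subject, predicate, object) tuples.
# All names (ex:parent, ex:grandParent, the sample people) are hypothetical.
data = {
    ("ex:Alice", "ex:parent", "ex:Bob"),
    ("ex:Alice", "ex:parent", "ex:Carol"),
    ("ex:Alice", "ex:parent", "ex:Dave"),  # third parent: should be reported
    ("ex:Bob", "ex:parent", "ex:Erin"),
}


def check_max_parents(triples, max_count=2):
    """Constraint: report every subject with more than max_count parents."""
    parents = {}
    for s, p, o in triples:
        if p == "ex:parent":
            parents.setdefault(s, set()).add(o)
    return sorted(s for s, values in parents.items() if len(values) > max_count)


def infer_grandparents(triples):
    """Rule: derive ex:grandParent triples by following ex:parent twice."""
    parent_of = [(s, o) for s, p, o in triples if p == "ex:parent"]
    return {
        (child, "ex:grandParent", grand)
        for child, mid in parent_of
        for mid2, grand in parent_of
        if mid == mid2
    }


violations = check_max_parents(data)   # validation results; data is unchanged
inferred = infer_grandparents(data)    # new triples that could be added
```

The constraint leaves the data untouched and only returns the offending nodes, while the rule returns triples that were not in the input.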
Q2: What commercially available products are available that approach the functionality of SHACL?
As of September 2017, we believe TopBraid Suite to be the only commercial product to offer SHACL support. There are several open source implementations. We do not know the extent to which they may offer commercial support. A few vendors said they would shortly offer implementations.
Keep in mind that SHACL is still very new – it only became a recommendation at the end of July, 2017. It was designed to make it easy for vendors that already support SPARQL to support SHACL. With this we expect that, as long as users ask for it, a good number of vendors will offer it. We would encourage you to request SHACL support from your preferred vendor.
For information on the support known to be available, see the implementations list on the W3C website.
For information on SHACL support in TopBraid, see:
- Using TopBraid as a data validation server.
- Working with SHACL Shapes in TopBraid EDG and TopBraid EVN.
- Working with SHACL Shapes in TopBraid Composer.
Q3: Could this be used for data modeling for different data platforms including relational databases, graph databases, Hadoop…?
This question is quite complex and would be best handled in a dialog/discussion, e.g. on the SHACL mailing lists or community group. The data model behind SHACL is more flexible than either relational database schemas or most graph databases. It should however be possible to define subsets of SHACL for specific target scenarios. Also, relational databases can be mapped into RDF graphs, see R2RML or D2RQ, which means that SHACL constraints can in principle be executed directly on databases.
SHACL can be used as a schema language for JSON, in particular via JSON-LD. In TopBraid EDG we are using SHACL to generate Avro schemas from relational databases.
Q4: Can you tell us more about SHACL properties specifically for supporting user interfaces?
In addition to the constraint properties, such as sh:datatype and sh:maxCount, you can specify default values for a property, order of properties on a form, sub-groups on a form layout and a few more things that are useful for the UI. See more details here.
Q5: What are the limits on extensibility? Are there shapes or patterns that cannot be specified?
Q6: With regard to strings, is there support for language tags?
You can use rdf:langString as value of sh:datatype to test if value nodes have a language tag. Additionally:
- sh:languageIn lets you specify the allowed list of language tags for a given value node.
- sh:uniqueLang can be set to true to specify that no pair of value nodes may use the same language tag.
And you can use SPARQL constraints to do more with language tags. See examples here.
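As a rough illustration of what sh:languageIn and sh:uniqueLang check, here is a small Python sketch. It is not a SHACL engine: literals are modelled as hypothetical (text, language tag) pairs, the matching follows the basic language-range style of SPARQL's langMatches, and comparing tags case-insensitively is a simplifying assumption.

```python
from collections import Counter


def lang_matches(tag, lang_range):
    """Basic language-range matching in the style of SPARQL's langMatches."""
    if not tag:
        return False  # plain literals without a tag never match
    tag, lang_range = tag.lower(), lang_range.lower()
    if lang_range == "*":
        return True
    return tag == lang_range or tag.startswith(lang_range + "-")


def language_in_violations(values, allowed):
    """Like sh:languageIn: values whose tag matches none of the allowed ranges."""
    return [v for v in values if not any(lang_matches(v[1], r) for r in allowed)]


def unique_lang_violations(values):
    """Like sh:uniqueLang true: language tags used by more than one value."""
    counts = Counter(lang.lower() for _, lang in values if lang)
    return sorted(tag for tag, n in counts.items() if n > 1)


# Hypothetical label values of a single focus node.
labels = [
    ("Dog", "en"),
    ("Hound", "en-GB"),
    ("Hund", "de"),
    ("dog", None),      # no language tag
    ("Canine", "en"),   # second value tagged "en"
]

not_allowed = language_in_violations(labels, ["en", "fr"])
duplicates = unique_lang_violations(labels)
```

With this sample data, the untagged literal and the German label fail the language list, and "en" is reported as a duplicate tag, while "en-GB" counts as a separate tag.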
Q7: Should a SHACL implementation be expected to be aware of all the rdfs:subClassOf and rdf:type relations defined in the vocabularies used by the data graph? (This would be helpful to avoid having to explicitly append all of those relationships defined in the vocabularies to the data graph being tested or making the shape graph excessively bloated)
Yes, these statements should be included in the data graphs. However, you do not need to append them, you can refer to the graphs that contain these statements using owl:imports.
Q8: Are you aware of any tools available that can convert an ontology into a corresponding SHACL shape graph for validation purposes in a fully or semi-automated manner?
Yes, TopBraid Composer (TBC) offers a converter. Open a file you want to convert and select the converter under the Model menu in TBC.
The initial version of the converter became available in TopBraid Release 5.3.2, but we recommend waiting for the upcoming 5.4 Release since it offers a more mature and extended converter. You should be able to get the beta of 5.4 by early October, 2017. If you want to see a preview and help us fine-tune the converter, feel free to send us your OWL ontologies.
Q10: Why does the sh:pattern “^http://…” start with “^”?
In regular expressions, the ^ symbol serves as an anchor for the start of the string. So, for example, the pattern “^http” would not match the value “testhttp://”, but it would match “http://test”.
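The effect of the anchor can be tried out with any regular expression engine. The sketch below uses Python's re.search, which, like the SPARQL REGEX function that sh:pattern is based on, looks for a match anywhere in the string unless the pattern is anchored. The helper name matches is just for illustration.

```python
import re


def matches(pattern, value):
    """True if the pattern is found in value (unanchored unless it says so)."""
    return re.search(pattern, value) is not None


anchored_ok = matches("^http", "http://test")      # value starts with "http"
anchored_bad = matches("^http", "testhttp://")     # "http" is not at the start
unanchored = matches("http", "testhttp://")        # without ^ it is found
```

Without the leading ^, the value "testhttp://" would pass the pattern, which is usually not what is intended when checking that a value is an HTTP URL.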
Q11: Can I specify that the weight of every dog must be smaller than the weight of any person?
Yes, with SPARQL-based constraints you can. See this tutorial on creating your own constraint components. In this case, you will probably use as a target all subjects of the ex:weight property that are of rdf:type ex:Dog.
When both properties you want to compare are attached to the same focus node (e.g. a passenger may take a dog with them on board an aircraft, but the dog must weigh less than the passenger: ex:Passenger1 ex:personWeight X and ex:Passenger1 ex:dogWeight Y), then you could use sh:lessThan to specify constraints for these two properties.
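The pairwise comparison behind sh:lessThan can be sketched in a few lines of Python. This is only an illustration of the idea, not an engine: focus nodes are modelled as hypothetical dictionaries from property name to a list of values, and the passengers and weights are made up.

```python
def less_than_violations(node, smaller_prop, larger_prop):
    """Return every (x, y) pair where a value x of smaller_prop is not < y."""
    return [
        (x, y)
        for x in node.get(smaller_prop, [])
        for y in node.get(larger_prop, [])
        if not x < y
    ]


# Hypothetical focus nodes with both properties attached to the same node.
passenger1 = {"ex:dogWeight": [12.5], "ex:personWeight": [70.0]}
passenger2 = {"ex:dogWeight": [80.0], "ex:personWeight": [65.0]}

ok = less_than_violations(passenger1, "ex:dogWeight", "ex:personWeight")
bad = less_than_violations(passenger2, "ex:dogWeight", "ex:personWeight")
```

Every value of the first property is compared with every value of the second, so a node with several values for either property can produce several violations.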
Q12: You mentioned that SHACL can be used for data integration. Can you please elaborate more on this as to whether we can use SHACL to integrate data from data silos?
This is a complex question that may benefit from examples and a longer discussion depending on your specific use cases. Scratching the surface, let’s assume you have two different databases, one about persons and one about customers. You could define two ontologies for them, with shared superclasses and/or shared properties, and we assume there is a way to turn your data into instances of these classes. Then you could use SHACL to define constraints that go across those silos, because ontologies can be mixed together, e.g. by matching by full name or social security numbers. For mapping from one RDF ontology to another, SHACL includes rules, either written in SPARQL (CONSTRUCT) or via SHACL Node Expressions. The latter has been particularly designed for mapping scenarios and visual diagrams. SHACL rules can use SHACL constraints/shapes to limit the preconditions and select specific instances to which the rules apply. Visit this page.
Q13: The shapes creation demo shows implicit class target. Does the TopBraid Enterprise Data Governance tool support explicit class target when creating a shape?
The EDG Ontology editor is class-centric and optimized for that design. However, in the upcoming TopBraid Release 5.4, you can also create stand-alone node shapes and edit their sh:targetClass etc. individually.
Q14: What are the choices of SHACL engines?
Please see the answer to Q2 above about commercial support.
Q15: How do we reference one/multiple ontologies in the SHACL?
SHACL supports owl:imports statements. You can use them to reference ontologies. These statements can be included in shapes graphs and in data graphs. If you will be relying on rdfs:subClassOf statements for targeting and these statements are in an ontology, you need to make sure to reference it from your data graph.
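Why the import matters for targeting can be shown with a small Python sketch. It assumes a processor that follows owl:imports from the data graph; graphs are modelled as hypothetical named sets of triples, and ex:Dog, ex:Animal and ex:Rex are made-up names.

```python
# Hypothetical named graphs; the data graph imports the ontology.
graphs = {
    "ex:data": {
        ("ex:data", "owl:imports", "ex:ontology"),
        ("ex:Rex", "rdf:type", "ex:Dog"),
    },
    "ex:ontology": {
        ("ex:Dog", "rdfs:subClassOf", "ex:Animal"),
    },
}


def imports_closure(graph_name, graphs):
    """Merge a graph with everything it reaches through owl:imports."""
    seen, stack, merged = set(), [graph_name], set()
    while stack:
        name = stack.pop()
        if name in seen or name not in graphs:
            continue
        seen.add(name)
        merged |= graphs[name]
        stack.extend(o for s, p, o in graphs[name] if p == "owl:imports")
    return merged


def instances_of(cls, triples):
    """Instances of cls or of any transitive rdfs:subClassOf subclass."""
    classes = {cls}
    changed = True
    while changed:
        changed = False
        for s, p, o in triples:
            if p == "rdfs:subClassOf" and o in classes and s not in classes:
                classes.add(s)
                changed = True
    return sorted(s for s, p, o in triples if p == "rdf:type" and o in classes)


without_import = instances_of("ex:Animal", graphs["ex:data"])
with_import = instances_of("ex:Animal", imports_closure("ex:data", graphs))
```

A shape targeting ex:Animal only reaches ex:Rex once the subclass statement from the imported ontology is part of the data being validated.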
Q16: Is the constraints violations report also in RDF?
Yes, it is in RDF. Learn more on its structure here.
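As an informal illustration of that structure, the Python sketch below builds the triples of a report from a list of violations. The sh: terms are taken from the SHACL vocabulary; the helper build_report, the blank node labels and the sample violation are hypothetical, and a real report may carry further properties such as sh:value or sh:resultMessage.

```python
def build_report(results):
    """Build report triples from dicts with 'focus', 'path' and 'component'."""
    report = "_:report"
    triples = [
        (report, "rdf:type", "sh:ValidationReport"),
        (report, "sh:conforms", not results),  # True only if nothing failed
    ]
    for i, r in enumerate(results):
        node = f"_:result{i}"
        triples += [
            (report, "sh:result", node),
            (node, "rdf:type", "sh:ValidationResult"),
            (node, "sh:resultSeverity", "sh:Violation"),
            (node, "sh:focusNode", r["focus"]),
            (node, "sh:resultPath", r["path"]),
            (node, "sh:sourceConstraintComponent", r["component"]),
        ]
    return triples


# Hypothetical violation: ex:Alice has too many values for ex:parent.
report_triples = build_report([
    {
        "focus": "ex:Alice",
        "path": "ex:parent",
        "component": "sh:MaxCountConstraintComponent",
    }
])
empty_report = build_report([])
```

Because the report is itself a graph, it can be stored, queried with SPARQL or serialized like any other RDF data.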
Q17: What levels of SHACL support are there in TBC versions, e.g. 5.2 or 5.3?
TBC 5.3.2 offers full support as it was released at the time the standard was finalized. TBC 5.4, as stated above, will offer more capabilities (e.g., auto-conversion of OWL), but these are convenience features.
Versions prior to 5.3.2 tracked the standard development process, so their SHACL support will be somewhat different from the final version.
Q18: Could this be extended to do data modeling outside the semantic web space? Modeling and validation of data exchange formats (e.g., xml, json, json-ld, edn, …)
JSON-LD is RDF, so it works natively. Many other JSON models can be turned into JSON-LD using a suitable @context. XML structures have a straightforward mapping to RDF.
Q19: Are there any anti-patterns to be aware of? Where are discussions of best practices happening?
The SHACL syntax offers more than one way to express the same things. It is up to the target audience to decide which syntactic structures are not desirable given their use cases.
For example, a property shape can be specified in-line in a node shape as a blank node. We typically recommend giving it a URI, so it can be properly referred to across graphs. Another example: property shapes are normally referenced from node shapes and use the targets specified there, but they can also be used as top-level entities with their own targets. However, most other modeling languages use something like classes that have properties. So if the goal is to be close to those other modeling languages and target platforms, then stand-alone property shapes with their own targets may be an anti-pattern.
The SHACL Community Group is the right forum for discussing best practices. At this point the current charter of the Working Group has been fulfilled. It has been extended through next spring, but only so that we can deal with any issues that may come from the user community. Any new work in the near future will be happening in the Community Group, leading to potentially chartering a new Working Group for SHACL 1.1 or SHACL 2.0.
The Community Group is also a better forum because anyone can join it while Working Group participation is limited to only W3C member organizations.
Q20: And are the JS validations generated by the SHACL structures or do you have to carry explicit JS in the structure?
SHACL-JS engines produce the same validation results (in an RDF vocabulary that can be represented as JSON-LD) as the other variants, SHACL Core and SHACL-SPARQL. The SHACL-JS validators don't have to know about the details and may, for example, just return the string of an error message. The SHACL-JS shapes graphs do not carry JS code but point at JS functions stored in .js files. Not sure if this answers your question.
Q21: The standard says that SHACL has no formal semantics. Is this still the case?
SHACL definitely has formal semantics. These are written in the spec as textual definitions and, in some cases, SPARQL. The interpretation is unambiguous and standardized across platforms. A limited number of features (such as support for recursion) were left undefined, meaning that implementers can either not support the feature or create their own definition, because the SHACL WG could not agree on their semantics during the lifetime of the working group.
Q22: Are popular triple stores such as GraphDB and AllegroGraph compatible with SHACL data?
Yes, SHACL is RDF so it is fully compatible with any RDF store. SHACL processors can operate on any RDF database. If the database provides a Jena adapter then the TopBraid SHACL API can be used directly. The performance of those may, of course, not be ideal so you may want to inquire with your preferred vendor concerning their plans to support SHACL natively.
Q23: You mentioned support for other languages- does that include C#?
Q24: Any tools to leverage existing XML schema (xsd) by smartly importing it?
TopBraid products do provide importers from XML Schema or just XML (instance) files to RDF. The XSD import creates RDFS and OWL statements which can then be converted to SHACL. One step translation from XML Schema to SHACL (constraints) is planned.
Q25: How will this fit into linked data?
The whole architecture of SHACL is very linked data friendly. SHACL shapes graphs are represented as RDF and therefore can be shared on the web as linked data. Data graphs or instances can reference the shapes that they are supposed to obey, e.g. using the sh:shapesGraph property. Finally, even the constraint components (definitions of the constraint types such as sh:minCount) are represented in RDF, so that if someone defines new constraints or other extensions, then these can also be looked up on the web.
Q26: Can you say more about SHACL support for URI generation?
SHACL constraints and rules can use SPARQL, e.g. with built-in functions such as IRI and CONCAT. This means that SHACL rules can construct new URIs on the fly.
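The idea of minting URIs by string concatenation can be sketched in Python. This is only a rough analogue of a SPARQL expression such as IRI(CONCAT(...)) combined with ENCODE_FOR_URI, not SHACL itself; the namespace, the ex:Person and ex:fullName terms and the sample record are hypothetical.

```python
from urllib.parse import quote

NAMESPACE = "http://example.org/person/"  # hypothetical namespace


def mint_uri(namespace, local_name):
    """Rough analogue of IRI(CONCAT(namespace, ENCODE_FOR_URI(local_name)))."""
    return namespace + quote(local_name, safe="")


def person_rule(records):
    """Rule-like step: create triples about newly minted person URIs."""
    triples = set()
    for rec in records:
        uri = mint_uri(NAMESPACE, rec["name"])
        triples.add((uri, "rdf:type", "ex:Person"))
        triples.add((uri, "ex:fullName", rec["name"]))
    return triples


new_uri = mint_uri(NAMESPACE, "Jane Doe")
new_triples = person_rule([{"name": "Jane Doe"}])
```

Encoding the local name before concatenation keeps characters such as spaces from producing an invalid URI.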
Q27: Does that mean rules can involve multiple fields while constraint is single field?
The difference between rules and constraints is that rules create new triples. Constraints can work with multiple fields. For example sh:lessThan can be used to relate ex:startDate and ex:endDate. Also, shape definitions can group together multiple property shapes, and then these shape definitions can be referenced with sh:node, sh:property, sh:qualifiedValueShape etc. Finally, SHACL-SPARQL or SHACL-JS can express almost arbitrary relations including those that look up values elsewhere and compare them with local values, or walk property path expressions.
Q28: You mentioned that the community may take on other features/work–what is the governance mechanism of the standard?
The governing mechanism is the W3C process. More details here.
Q29: I really don't see how rules can be related with SHACL, can you elaborate on this a little?
It is not surprising that there is interest in rules. However, it would be too ambitious to cover this in the intro webinar. We are planning to present follow-up webinars and will include one on rules. In the meantime, you can learn more about SHACL rules here.