Updating RDF Graphs with GraphQL

This document is part of the TopQuadrant GraphQL Technology Pages

This document complements Querying RDF Graphs with GraphQL and describes the features supported by TopBraid to perform updates (aka mutations) on RDF databases. In a nutshell, the system uses SHACL shape definitions to automatically generate suitable GraphQL mutations to create, update or delete objects. When changes are submitted, they are validated against the constraints defined by the shape definitions and a report with violations and suggested fixes can be returned.

Note that this design has not yet been fully implemented and will likely change.


Introduction

TopBraid takes SHACL shape definitions or GraphQL schemas as input and automatically generates an enhanced GraphQL schema that includes mutations that can be used to create, update or delete data in an underlying RDF graph database.

In the following example, we assume that the type Human is used as input schema to the system:

type Human {
	id: ID!
	name: String!
	height: Float
	friends: [Human]
	father: Human
}

The system internally converts this schema to SHACL and produces an enhanced GraphQL schema with the following mutations:

schema {
	...
	mutation: RootRDFMutation
}

type RootRDFMutation {
	createHuman (input: Human_Input!, graph: ID): Boolean
	updateHuman (input: Human_Input!): Boolean
	delete (uri: ID, uris: [ID]): Boolean
	report: _MutationReport
	commit (ignoreViolations: Boolean, message: String): String
}

type _MutationReport {
	addedCount: Int
	deletedCount: Int
	conforms: Boolean!
	results: [_MutationResult]
}

type _MutationResult {
	message: String
	severity: String!
	focusNode: _RDFNode!
	path: String
	value: _RDFNode
	suggestions: [_MutationResultSuggestion]
}

type _RDFNode {
	label: String!
	uri: ID
}

type _MutationResultSuggestion {
	message: String!
	confidence: Float
	applyData: String
}

input Human_Input {
	uri: ID!
	label: String
	id: ID
	name: String
	height: Float
	friends: [Human_Input]
	father: Human_Input
}

An example mutation request may look as follows:

mutation {
	createHuman (input: {
		uri: "http://example.org/Humans/123",
		id: "123",
		name: "Darth Vader"
	})
	updateHuman (input: {
		uri: "http://example.org/Humans/456",
		father: {
			uri: "http://example.org/Humans/123"
		}
	})
	report {
		addedCount
		deletedCount
	}
	commit (message: "Added Luke's dad")
}

The above request produces the JSON result below and, as a side effect, creates one instance of Human in the database and makes it the father of an existing instance. The report object states how many RDF triples were added and deleted.

{
	"data": {
		"createHuman": true,
		"updateHuman": true,
		"report": {
			"addedCount": 4,
			"deletedCount": 0
		},
		"commit": "Added Luke's dad"
	}
}

For those familiar with RDF, the new triples would be:

<http://example.org/Humans/123>
	a ex:Human ;
	ex:id "123" ;
	ex:name "Darth Vader" .

<http://example.org/Humans/456>
	ex:father <http://example.org/Humans/123> .

The mutations include the following fields:

  • createXY operations to create a new instance, for each published data shape
  • updateXY operations to modify an existing instance, for each published data shape
  • delete operation to delete one or more instances
  • report to produce information about the operations
  • commit to trigger the actual commit of the changes, instead of doing just a "dry run"

Create

For each published data shape in a schema, TopBraid generates a createXY mutation. These mutations take a mandatory input object that has one field for each property shape declared on the shape and that must also have a uri.

An optional parameter is graph, which can be the URI of the named graph in which the new instance shall be created. The system will pick a suitable default if no such graph is specified.

When executed, the create mutation produces the RDF triples mentioned in the input object, including an rdf:type triple if the shape is also a class. The value of the uri field identifies the newly created instance.

Create operations only ever add new triples, while update operations may also remove triples (if fields are null). Note that the uri may refer to an object that is already in the database, for example because it also has other shapes (an object may be both Writer and Human).

While SHACL property shapes may use path expressions of almost arbitrary complexity, the only complex property paths that can be used in create and update operations are one-step inverse paths. For other paths, filling in the intermediate triples would be too difficult and ambiguous.
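The triple generation described above can be sketched in a few lines of Python. This is a minimal model, not TopBraid's implementation: the dictionary layout of HUMAN_SHAPE, the function name create_triples and the inverse-path field fatherOf are assumptions made for illustration, nested input objects are only referenced through their uri, and URIs and literals are both represented as plain strings.

```python
RDF_TYPE = "rdf:type"

def create_triples(shape, input_obj):
    """Return the set of triples a create mutation would add for input_obj."""
    focus = input_obj["uri"]
    triples = set()
    # The rdf:type triple is only produced if the shape is also a class.
    if shape.get("class"):
        triples.add((focus, RDF_TYPE, shape["class"]))
    for name, value in input_obj.items():
        if name == "uri" or value is None:
            continue
        predicate, inverse = shape["fields"][name]
        values = value if isinstance(value, list) else [value]
        for v in values:
            # Nested input objects are referenced through their uri only.
            node = v["uri"] if isinstance(v, dict) else v
            if inverse:
                # One-step inverse path: the focus node becomes the object.
                triples.add((node, predicate, focus))
            else:
                triples.add((focus, predicate, node))
    return triples

# Hypothetical shape description: field name -> (predicate, is_inverse).
HUMAN_SHAPE = {
    "class": "ex:Human",
    "fields": {
        "id": ("ex:id", False),
        "name": ("ex:name", False),
        "father": ("ex:father", False),
        "fatherOf": ("ex:father", True),  # assumed inverse-path field
    },
}

vader = {
    "uri": "http://example.org/Humans/123",
    "id": "123",
    "name": "Darth Vader",
    "fatherOf": [{"uri": "http://example.org/Humans/456"}],
}
triples = create_triples(HUMAN_SHAPE, vader)
```

With the inverse-path field, the single create call yields the same four triples as the create and update pair in the introductory example, because the ex:father triple is written with the focus node in the object position.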

Update

Similar to create operations, TopBraid generates an updateXY mutation for each published shape in a schema. This takes a single mandatory input parameter, which must specify a uri to identify the object that is being modified.

Only the values of the properties represented by the given fields are modified; properties whose fields are absent from the input are left untouched. For example, if the input lists three values for the multi-valued property friends, then the object will have exactly those three values after the modification, regardless of how many it had before. If the value is given as null or the empty array [] then all values of the corresponding property will be deleted.
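The replacement semantics of update can be modelled as a set difference per field. The following Python sketch is an illustration under assumptions: the function name apply_update, the field-to-predicate mapping FIELDS and the ex:friends predicate are invented for the example, values are plain strings rather than nested input objects, and counting only net changes as added or deleted is one plausible reading of addedCount and deletedCount.

```python
def apply_update(graph, field_predicates, input_obj):
    """Return (new_graph, added_count, deleted_count) for an update input."""
    focus = input_obj["uri"]
    added, deleted = set(), set()
    for name, value in input_obj.items():
        if name == "uri":
            continue
        predicate = field_predicates[name]
        old = {t for t in graph if t[0] == focus and t[1] == predicate}
        if value is None:
            new_values = []            # null removes all values
        elif isinstance(value, list):
            new_values = value         # [] also removes all values
        else:
            new_values = [value]
        new = {(focus, predicate, v) for v in new_values}
        deleted |= old - new
        added |= new - old
    # Fields that are absent from the input leave their triples untouched.
    return (graph - deleted) | added, len(added), len(deleted)

FIELDS = {"name": "ex:name", "friends": "ex:friends"}
LUKE = "http://example.org/Humans/456"
graph = {
    (LUKE, "ex:name", "Luke"),
    (LUKE, "ex:friends", "ex:Han"),
    (LUKE, "ex:friends", "ex:Leia"),
}

# Replace the friends with exactly three values; name is not mentioned.
after, added, deleted = apply_update(
    graph, FIELDS, {"uri": LUKE, "friends": ["ex:Leia", "ex:Chewie", "ex:R2D2"]}
)
# A null value deletes all values of the property.
cleared, _, cleared_count = apply_update(after, FIELDS, {"uri": LUKE, "friends": None})
```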

Delete

Use the delete mutation, providing one or more URIs of the RDF resources that shall be deleted. Clients must specify uri (a single ID) or uris (an array of IDs).

The following example deletes the resource with URI http://example.org/myObject, assuming it was mentioned in three triples.

mutation {
	delete (uri: "http://example.org/myObject")
	report {
		deletedCount
	}
}

{
	"data": {
		"delete": true,
		"report": {
			"deletedCount": 3
		}
	}
}

Note that in contrast to the other operations, deleting a resource is independent of any given shape. All triples that have the resource as subject, predicate or object will be deleted.

If a deleted object is used in an RDF triple involving blank nodes (as RDF subject or object) then these blank nodes are considered to depend on the deleted object. Any dependent triples, including their (possibly recursive) further dependents, will be deleted alongside the originally deleted URI resources.
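One way to read the blank node rule is as a worklist traversal: every triple that mentions a deleted node is removed, and any blank node at the other end of such a triple is queued for deletion in turn. The Python sketch below follows that reading. The function names, the "_:" prefix convention for blank nodes and the sample data are assumptions for illustration, not TopBraid's actual algorithm.

```python
def is_blank(node):
    """Blank nodes are written with a leading '_:' in this sketch."""
    return node.startswith("_:")

def delete_resources(graph, uris):
    """Return (new_graph, deleted_count) after deleting the given resources."""
    to_delete = set(uris)   # nodes whose triples must go
    queue = list(uris)
    removed = set()
    while queue:
        node = queue.pop()
        for triple in graph:
            # Tuple membership covers subject, predicate and object.
            if node in triple and triple not in removed:
                removed.add(triple)
                # Blank nodes on either end depend on the deleted node.
                for other in (triple[0], triple[2]):
                    if is_blank(other) and other not in to_delete:
                        to_delete.add(other)
                        queue.append(other)
    return graph - removed, len(removed)

graph = {
    ("ex:myObject", "rdf:type", "ex:Thing"),
    ("ex:myObject", "ex:address", "_:b1"),
    ("_:b1", "ex:city", "Berlin"),
    ("_:b1", "ex:geo", "_:b2"),
    ("_:b2", "ex:lat", "52.5"),
    ("ex:other", "ex:name", "Other"),
}
remaining, deleted_count = delete_resources(graph, ["ex:myObject"])
```

In this sample, deleting ex:myObject also removes the address blank node and the nested geo blank node, so five triples are deleted and only the unrelated ex:other triple remains.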

Result Reports

The report field should be present in every mutation request, and must appear after any create, update or delete fields. It produces a result object with fields that contain details about the changes that were requested.

The fields addedCount and deletedCount return the number of triples that have been added and deleted, respectively.

The field conforms returns true if no constraint violations have been found. Any modified RDF node will be validated. For example, if a resource is being deleted, this may cause violations in objects that are pointing at this resource (for example, violating a sh:minCount constraint). The operation will only be performed if no violations have been reported or ignoreViolations has been set to true in the commit field, as mentioned later.

The field results returns an array of objects with details about any constraint violations produced by the SHACL engine. This array is empty if conforms is true. Further details will be provided later.
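On the client side, the report object can be turned into readable messages by walking the fields of _MutationReport and _MutationResult shown in the generated schema. The following Python sketch assumes the report has already been parsed from the JSON response. The function name summarize_report and the sample values, including the severity string "Violation", are illustrative assumptions rather than documented output.

```python
def summarize_report(report):
    """Turn a parsed _MutationReport dictionary into a list of text lines."""
    lines = [
        f"added {report.get('addedCount', 0)}, "
        f"deleted {report.get('deletedCount', 0)} triples"
    ]
    if report["conforms"]:
        lines.append("no constraint violations")
        return lines
    for result in report.get("results") or []:
        node = result["focusNode"]
        # Prefer the URI, fall back to the label (e.g. for blank nodes).
        where = node.get("uri") or node["label"]
        path = result.get("path")
        location = f"{where} at {path}" if path else where
        message = result.get("message") or "no message"
        lines.append(f"{result['severity']}: {location}: {message}")
        for suggestion in result.get("suggestions") or []:
            lines.append(f"  suggestion: {suggestion['message']}")
    return lines

# Illustrative report data, not actual TopBraid output.
sample_report = {
    "addedCount": 0,
    "deletedCount": 3,
    "conforms": False,
    "results": [
        {
            "message": "Fewer than 1 value",
            "severity": "Violation",
            "focusNode": {"label": "Luke", "uri": "http://example.org/Humans/456"},
            "path": "ex:father",
            "value": None,
            "suggestions": [],
        }
    ],
}
summary = summarize_report(sample_report)
```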

Commits and Dry Runs

If a mutation request does not have a commit field, it will only be performed as a "dry run", without causing the changes to be written to the actual database. In that case, clients would merely request the report field to learn whether the change would cause any constraint violations. This can be used to validate an input form before it is sent to the real database.

The commit field can only be used as the last field of a mutation request. If present, the system triggers a write of the actual changes to the database. The client has the option to pass in a message parameter, which may be used to record the change in a change history. If no message is provided, the system will auto-generate a suitable message by looking at the changes that were submitted. That message is also the result value of the field.

The commit field may have the value true for the parameter ignoreViolations. In this mode, the validation of the data against the shapes will not happen (unless explicitly requested in the report object). This allows updates that are known to violate constraints, for example because they are required as part of a larger set of changes, or because the user has confirmed that the updates should go ahead regardless. This flag should be used sparingly because it risks follow-up problems. For example, if a sh:maxCount 1 violation is ignored and an object has two values, the engine would deliver an arbitrary one of them in queries. In some configurations, such modifications will still be rejected.
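The interplay of dry runs, commits and ignoreViolations can be summarised as a small decision procedure. The Python sketch below is a model under assumptions: run_mutation and max_count_one are invented names, the validator stands in for the SHACL engine, the report is always computed (as if the request had asked for it), result objects are simplified to three string fields, and both the wording of the auto-generated message and the None result for a rejected commit are guesses.

```python
def max_count_one(predicate):
    """Build a toy validator that flags subjects with more than one value."""
    def validate(graph):
        counts = {}
        for subject, pred, _ in graph:
            if pred == predicate:
                counts[subject] = counts.get(subject, 0) + 1
        return [
            {"focusNode": s, "path": predicate,
             "message": f"More than 1 value for {predicate}"}
            for s, n in sorted(counts.items()) if n > 1
        ]
    return validate

def run_mutation(database, added, deleted, validate, commit=None):
    """Return (database, report, commit_message); commit=None is a dry run."""
    candidate = (database - deleted) | added
    results = validate(candidate)
    report = {
        "addedCount": len(added),
        "deletedCount": len(deleted),
        "conforms": not results,
        "results": results,
    }
    if commit is None:
        return database, report, None      # dry run: nothing is written
    if results and not commit.get("ignoreViolations", False):
        return database, report, None      # rejected: violations not ignored
    # Assumed format for the auto-generated commit message.
    message = commit.get("message") or (
        f"Added {len(added)} and deleted {len(deleted)} triples"
    )
    return candidate, report, message

LUKE = "http://example.org/Humans/456"
db = {(LUKE, "ex:name", "Luke")}
extra_name = {(LUKE, "ex:name", "Luke Skywalker")}
check = max_count_one("ex:name")

dry_db, dry_report, dry_message = run_mutation(db, extra_name, set(), check)
forced_db, _, forced_message = run_mutation(
    db, extra_name, set(), check,
    commit={"ignoreViolations": True, "message": "Force second name"},
)
```

The dry run leaves the database untouched and reports the sh:maxCount style violation, while the forced commit writes the second name despite it.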

Read-only fields

Some fields are read-only and cannot be directly updated using GraphQL. If read-only fields are part of an input object, their values are ignored.

The built-in field label is read-only because its value is dynamically derived from other property values such as rdfs:label and may produce different results per user language in each request. To modify the values of label, mutate the underlying properties directly, e.g. using the field rdfs_label, or prefLabel for instances of skos:Concept.
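The effect of ignoring read-only fields can be pictured as filtering the input object before any triples are computed. The Python sketch below assumes that label is the only read-only field and that nested input objects are filtered as well. The names READ_ONLY_FIELDS and strip_read_only are invented for illustration.

```python
# Assumed set of read-only field names; the text only names label.
READ_ONLY_FIELDS = {"label"}

def strip_read_only(value):
    """Return a copy of an input value without read-only fields."""
    if isinstance(value, dict):
        return {
            key: strip_read_only(item)
            for key, item in value.items()
            if key not in READ_ONLY_FIELDS
        }
    if isinstance(value, list):
        return [strip_read_only(item) for item in value]
    return value

human_input = {
    "uri": "http://example.org/Humans/456",
    "label": "Luke",         # ignored: derived from other properties
    "rdfs_label": "Luke",    # underlying property, kept
    "friends": [{"uri": "http://example.org/Humans/789", "label": "Han"}],
}
effective_input = strip_read_only(human_input)
```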