Neo4j Integration
Starting with version 8.4, TopBraid EDG includes an integration with the Neo4j property graph database. In contrast to the RDF-based databases that TopBraid can connect to using SPARQL (see, for example, Working with Remote Data Sources), Neo4j is not based on RDF: it uses a somewhat different internal data model (a property graph) and a different query language (Cypher).
The integration currently includes:
An ADS API to execute Cypher queries and updates against a Neo4j database (see IO.neo4j() as a starting point)
A Push data integration that can be used to export parts of an asset collection into a Neo4j graph
Configuring Neo4j Connections
Before anyone can connect to Neo4j, a Power User or Administrator needs to set up a connection on the Product Configuration Parameters Admin Page. There, go to the Neo4j section and add an item to the list of Neo4j databases using the + button on the right hand side. In the dialog, enter any suitable Label for the connection, and let the system fill in the ID.
As Neo4j database URL enter the URL of the database.
For a local installation of Neo4j Desktop, this would typically be bolt://localhost.
Also enter the user name for the database, such as neo4j.
Once you have closed the create dialog and saved the changes, click on the new connection instance and enter and save the Neo4j database password.
Optionally, you can enter the Neo4j database name in case you do not want to map to the default database.
Finally, you should add the names of the TopBraid user accounts that will get access to the Neo4j instance through TopBraid. This access will include read and write access, so make sure you trust these users.
Once the connection has been created, note the local name of the connection, which is the part of the ID right after product_config.
Exporting Assets from TopBraid Collections to Neo4j
If you want to enrich (or populate) a Neo4j graph from assets edited in TopBraid EDG, use the Data Integration Buttons to select Link to Neo4j Database…. This will open a dialog in which you can select one of the Neo4j databases that you have been given access to.
Once you have linked the asset collection, click on the resulting Data Integration instance to visit its form. The form has most properties already filled in, including the database id. Most importantly though, you should add one or more classes to the included classes list. The instances of these classes (and their subclasses) will be exported to Neo4j.
If you only want to export a subset of those instances, you can specify an instance filter, which needs to be a SPARQL expression that evaluates to true for any given instance (variable $focusNode) that should be exported.
Note
The mapping from an asset collection to Neo4j covers all instances of the included classes from both the current asset collection and all its includes. For any given set of collections, you should only have a single Neo4j integration. If you want to map multiple asset collections into the same Neo4j instance, create a dedicated container collection that includes everything that shall be exported.
On the form of the data integration you can also select excluded properties that shall be ignored by the exporter.
You may need to reload the browser page after you have configured the data integration. Then you should see a Push button (cloud arrow up icon) from which you can start the export.
To overcome the differences between the RDF/SHACL model used by TopBraid and the property graph model used by Neo4j, the following transformations apply:
Each instance of the included classes becomes one node in Neo4j with a unique property uri and a human-readable label as value of name.
These Neo4j nodes will have one label for each rdf:type of the instance and all its superclasses.
The node labels are by default the local names of these classes, e.g. Concept for skos:Concept, and always include Resource.
To control the selection of labels, for example to avoid conflicts with labels used elsewhere in the graph, set a value for Neo4j label at the class in TopBraid.
Each Neo4j node keeps its identity until the asset with that URI is deleted. However, each push replaces all of the node's properties and relationships.
This means that it is perfectly fine to create relationships from other Neo4j nodes that are not managed by TopBraid to those that are.
Hint
Exporting asset collections to Neo4j means that you can combine data from high-quality sources such as Taxonomies with rather transactional data that may get sent to Neo4j from 3rd party processes.
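The node label rules listed above can be illustrated with a small sketch. This is not the exporter's actual code: the helper names localName and nodeLabels are hypothetical, and it assumes that a Neo4j label value set at a class replaces the default local name for that class.

```javascript
// Hypothetical helper, not part of TopBraid or ADS.
// Returns the part of a URI after the last '#' or '/'.
function localName(uri) {
  const idx = Math.max(uri.lastIndexOf("#"), uri.lastIndexOf("/"));
  return uri.substring(idx + 1);
}

// classes: URIs of the instance's rdf:types and all their superclasses.
// labelOverrides: assumed map from class URI to its "Neo4j label" value.
function nodeLabels(classes, labelOverrides = {}) {
  const labels = new Set(["Resource"]); // every exported node carries Resource
  for (const cls of classes) {
    labels.add(labelOverrides[cls] || localName(cls));
  }
  return [...labels].sort();
}

const SKOS = "http://www.w3.org/2004/02/skos/core#";

// Default: local names of the classes plus Resource
console.log(nodeLabels([SKOS + "Concept"]));

// With an override, to avoid a clash with labels used elsewhere in the graph
console.log(nodeLabels([SKOS + "Concept"], { [SKOS + "Concept"]: "TaxonomyConcept" }));
```

The first call yields the labels Concept and Resource, matching the skos:Concept example above. The second call shows how an explicit label could take the place of the default one.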
The properties that are exported to Neo4j nodes must be declared by SHACL property shapes that have a sh:datatype or an sh:or of datatypes.
The names of these properties are based on the local name of the predicate, but this can be overridden by declaring a graphql:name for the property shapes.
The following mapping rules apply:
integer-based numeric values such as xsd:integer become Neo4j long numbers
numeric values with decimal points become Neo4j double numbers
xsd:date, xsd:dateTime and xsd:time literals become corresponding Neo4j values
rdf:langString literals (with language tag) by default become just the simple strings unless keep language tags has been set to true
all other datatype literals are mapped to their lexical form
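As a rough illustration of the mapping rules above, the following sketch classifies a literal's datatype into the kind of Neo4j value it would become. The function neo4jValueKind and the returned category strings are hypothetical. The lists of integer-based and decimal-point datatypes are assumptions, since the rules above only name xsd:integer explicitly, and the sketch does not show how language tags are represented when they are kept.

```javascript
const XSD = "http://www.w3.org/2001/XMLSchema#";
const RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#";

// Builds a set of full datatype URIs from XSD local names
const xsd = (names) => new Set(names.map((n) => XSD + n));

// Assumed groupings; only xsd:integer is named in the rules above
const INTEGER_TYPES = xsd(["integer", "int", "long", "short", "byte"]);
const FRACTIONAL_TYPES = xsd(["decimal", "double", "float"]);
const TEMPORAL_TYPES = xsd(["date", "dateTime", "time"]);

// Returns a descriptive category, not an actual Neo4j type object
function neo4jValueKind(datatype, keepLanguageTags = false) {
  if (INTEGER_TYPES.has(datatype)) return "long";
  if (FRACTIONAL_TYPES.has(datatype)) return "double";
  if (TEMPORAL_TYPES.has(datatype)) return "temporal";
  if (datatype === RDF + "langString") {
    return keepLanguageTags ? "string with language tag" : "string";
  }
  return "lexical form"; // all other datatypes
}

console.log(neo4jValueKind(XSD + "integer"));
console.log(neo4jValueKind(XSD + "dateTime"));
console.log(neo4jValueKind(RDF + "langString"));
console.log(neo4jValueKind(XSD + "anyURI"));
```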
The relationships that are exported to Neo4j must be declared by SHACL property shapes that have a sh:class or an sh:or of classes.
The names of the relationships follow Neo4j naming conventions, with all-caps and underscores.
There is no support yet for reified values - these could become relationship properties in the future.
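To give an idea of the all-caps-with-underscores convention for relationship names, here is a minimal sketch. The function relationshipName is hypothetical, and the exact conversion used by the exporter may differ. The sketch assumes that camelCase boundaries and non-alphanumeric characters are turned into underscores.

```javascript
// Hypothetical conversion from a predicate's local name to a
// Neo4j-style relationship type (upper case with underscores).
function relationshipName(name) {
  return name
    .replace(/([a-z0-9])([A-Z])/g, "$1_$2") // split camelCase boundaries
    .replace(/[^A-Za-z0-9]+/g, "_")         // normalize other separators
    .toUpperCase();
}

console.log(relationshipName("broader"));     // BROADER
console.log(relationshipName("isCapitalOf")); // IS_CAPITAL_OF
console.log(relationshipName("has-part"));    // HAS_PART
```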
Note
After the initial export/push to Neo4j, TopBraid will by default only update the nodes that have been the subject or object of a change in the change history since the last push. This means that updating will be incremental and typically very fast. To overwrite everything, navigate to the Data Integration instance (for example by clicking on the name of the integration in the Push dialog), then either set always overwrite to true or delete the last push start time and last push end time.
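The incremental behavior described in this note can be pictured roughly as follows. This is a simplified sketch, not the exporter's implementation: the change record shape ({subject, object, timestamp}), the function nodesToUpdate and the use of a single lastPushTime value are all assumptions made for illustration.

```javascript
// changes: hypothetical change history records {subject, object, timestamp}
// lastPushTime: time of the previous push, or null if none is recorded
// managedUris: Set of URIs of all assets covered by the integration
function nodesToUpdate(changes, lastPushTime, managedUris, alwaysOverwrite = false) {
  // Full export when forced or when no previous push is recorded
  if (alwaysOverwrite || lastPushTime === null) {
    return new Set(managedUris);
  }
  const touched = new Set();
  for (const change of changes) {
    if (change.timestamp <= lastPushTime) continue; // already pushed
    if (managedUris.has(change.subject)) touched.add(change.subject);
    if (managedUris.has(change.object)) touched.add(change.object);
  }
  return touched;
}

const managed = new Set(["ex:A", "ex:B", "ex:C"]);
const history = [
  { subject: "ex:A", object: "ex:B", timestamp: 50 },  // before the last push
  { subject: "ex:C", object: "ex:B", timestamp: 150 }, // after the last push
];

console.log([...nodesToUpdate(history, 100, managed)].sort());       // incremental
console.log([...nodesToUpdate(history, 100, managed, true)].sort()); // always overwrite
console.log([...nodesToUpdate(history, null, managed)].sort());      // push times deleted
```

In this sketch, clearing the recorded push time has the same effect as always overwrite, which mirrors the two options mentioned in the note.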
Example Cypher Queries
MATCH (r:Resource)
RETURN r
MATCH (c:Country {uri: "http://topquadrant.com/ns/examples/geography#Nicaragua"})
RETURN c.name, c.callingCode
Working with Neo4j from ADS
Once a Neo4j connection has been configured (see further above), any user with access permissions to that connection can use the function IO.neo4j to interact with the Neo4j instance. The API is intended to be self-descriptive when explored with auto-complete.
Note
The push data integration above has been completely implemented in JavaScript using the ADS framework.
You can actually find the source code of the mapping rules in the included script neo4j:Neo4jFunctions.