Many of our customers use TopBraid EDG to capture information about their data and computer technology ecosystem. This includes data sources, business applications, stakeholders, and the processes they support. To make this possible, TopQuadrant developed a suite of interconnected models, the TopBraid EDG ontologies. In this blog post, we will take a look at them.
The models are extensive. They contain hundreds of classes and an even greater number of properties. Classes and properties are described using SHACL, the W3C standard for describing knowledge graph schemas. This post is an overview to get you oriented.
EDG models contain two key types of classes, Asset classes and Aspect classes:
- Asset classes are easy to understand – they define all the asset types, e.g., Database, Dataset, Report, Software System and so on. These are concrete things you will find easy to relate to.
- Aspect classes are more of an abstract notion – they have names like Narratable, Identifiable, Processable, etc. They are used to organize different qualities (aspects) that some of the assets may inherit as “traits”. Using aspect classes makes it easier to manage and maintain large and complex ontologies. Normally, you would not want to create direct instances of the aspect classes.
This will become clearer if we look at the example below.
Asset is a class of type Asset Class. It is a subclass of Status Aspect (perhaps a better name would be “Statusable”, but that is not a word), Narratable and Identifiable. This means that all assets are:
- “Statusable” – can have a status and associated dates
- Narratable – can have description, purpose, etc.
- Identifiable – can have acronyms, labels, identifiers, etc.
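As a loose analogy only, the way an asset class picks up "traits" from aspect classes resembles mixin-style multiple inheritance in a programming language. The sketch below uses Python dataclasses; the field names (`status_date`, `acronym` and so on) are illustrative assumptions and are not the actual EDG property names.

```python
from dataclasses import dataclass, fields
from typing import Optional


# Aspect classes: by convention they are not instantiated directly.
@dataclass
class StatusAspect:
    status: Optional[str] = None
    status_date: Optional[str] = None  # assumed name for an "associated date"


@dataclass
class Narratable:
    description: Optional[str] = None
    purpose: Optional[str] = None


@dataclass
class Identifiable:
    label: Optional[str] = None
    acronym: Optional[str] = None
    identifier: Optional[str] = None


# Asset class: declares nothing itself here, it only combines the aspects.
@dataclass
class Asset(StatusAspect, Narratable, Identifiable):
    pass


asset = Asset(label="Customer Report", status="Approved",
              description="Monthly summary of customer activity")

# Every property available on an asset, gathered from all three aspects.
asset_properties = {f.name for f in fields(asset)}
print(sorted(asset_properties))
```

In the real models the same idea is expressed with subclass relationships between SHACL-described classes rather than with Python classes.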
Subclasses of Asset are more specific types of assets – as shown below:
Each subclass of Asset has its own set of properties in addition to the properties it inherits from Asset. For example, let’s take a look at the class Requirement.
Since this class is a subclass of Asset, a requirement has all the properties defined for the Asset class. Additionally, the Requirement class defines a number of properties that are specific to requirements. It also has a number of subclasses for different types of requirements.
As you might guess, if we were to look at other classes, we would see that many have a connection to requirements, because most assets we keep information about have some associated requirements. For example, as shown below, Enterprise Asset is a subclass of another Aspect class: Traceable.
Traceable things are all the things that can be connected to requirements and mapped to business glossary terms. Enterprise assets are things like forms, reports and processes. Technical assets and data assets are also traceable.
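To make the inheritance concrete, here is a minimal sketch that walks a small, hand-written fragment of the hierarchy described above and collects the properties a class ends up with. The subclass links come from this post; the property names are hypothetical placeholders, and the fragment is far from the complete EDG model.

```python
# Direct superclasses, as described in this post (a small fragment only).
SUPERCLASSES = {
    "Asset": ["Status Aspect", "Narratable", "Identifiable"],
    "Requirement": ["Asset"],
    "Enterprise Asset": ["Asset", "Traceable"],
}

# Properties declared directly on each class (hypothetical names).
OWN_PROPERTIES = {
    "Status Aspect": ["status"],
    "Narratable": ["description", "purpose"],
    "Identifiable": ["label", "acronym", "identifier"],
    "Traceable": ["related requirement", "mapped glossary term"],
    "Requirement": ["priority"],
}


def ancestors(cls):
    """Return all direct and indirect superclasses, nearest first."""
    found = []
    queue = list(SUPERCLASSES.get(cls, []))
    while queue:
        parent = queue.pop(0)
        if parent not in found:
            found.append(parent)
            queue.extend(SUPERCLASSES.get(parent, []))
    return found


def effective_properties(cls):
    """Own properties followed by everything inherited from ancestors."""
    props = list(OWN_PROPERTIES.get(cls, []))
    for ancestor in ancestors(cls):
        for prop in OWN_PROPERTIES.get(ancestor, []):
            if prop not in props:
                props.append(prop)
    return props


print(effective_properties("Requirement"))
print(effective_properties("Enterprise Asset"))
```

A requirement picks up the Asset-level properties on top of its own, while an enterprise asset additionally gets whatever Traceable contributes.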
Let’s come back to the immediate subclasses of Asset. With one exception, each of these classes corresponds to a type of asset collection. In other words, each class is the main entity for a certain category of asset collections. Going from left to right in the first diagram, these classes are:
Requirement – the main entity for the Requirements asset collections. These are catalogs of requirements. There can be multiple Requirements asset collections. The separation can be along the subject areas or along the type of requirements.
Technical Asset – this can be a software asset or a hardware asset. Software assets further partition into a number of subclasses such as applications, systems, software modules and software executables. It is the main entity for the Technical Assets asset collections. These are catalogs of technical assets. There can be multiple Technical Assets asset collections. The separation can be along the types of assets (e.g., hardware versus software) and/or along the subject areas.
Governance Asset – assets used to describe an organization’s governance framework. They further partition into subclasses such as Metric, Policy and Governance Process. It is the main entity for the Governance Model asset collection. This is a special asset collection in TopBraid EDG in that there is only one Governance Model collection for a given installation of TopBraid EDG.
Glossary Term – the parent class for terms that are specialized as Business Term, Industry Term or Technical Term. It is the main entity for the Glossary asset collections. There can be multiple glossaries.
Enterprise Asset – includes such things as Business Activities, Business Functions, Business Capabilities, Job Roles, Organizations, Parties, and Information Assets. It is the main entity for the Enterprise Assets asset collections. There can be multiple collections of this type.
Datatype – the parent class for all datatypes: scalars, enumerated values including scales, and structured types. It is the main entity for the Datatypes asset collections. There can be multiple collections of this type.
Data Asset – any data item that is of value to the enterprise. It can be a Database, a Dataset or a Data Element. It is the main entity for the Data Assets asset collections. There can be multiple collections of this type. Some Data Assets collections can be dataset catalogs, others can hold information about logical models, and so on.
Big Data Asset – the parent class for things such as Big Data Data Assets, Big Data Configuration Assets, Big Data Jobs and Big Data Files. It is the main entity for the Big Data Assets asset collections. There can be multiple collections of this type.
Lineage Model – establishes a context for determining how enterprise capabilities, business functions and information assets are dependent on data flows, applications, software executables and data transformations across data sources and sinks. It is the main entity for the Lineage Model asset collections. There can be multiple collections of this type.
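The list above can be summarized as a small lookup table. The sketch below records, for each collection type, its main entity and whether more than one collection of that type can exist. The `CollectionType` structure and the `can_create` helper are illustrative assumptions and not part of any EDG API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CollectionType:
    name: str               # asset collection type
    main_entity: str        # immediate subclass of Asset it is built around
    multiple_allowed: bool  # can an installation hold several of these?


COLLECTION_TYPES = [
    CollectionType("Requirements", "Requirement", True),
    CollectionType("Technical Assets", "Technical Asset", True),
    CollectionType("Governance Model", "Governance Asset", False),
    CollectionType("Glossary", "Glossary Term", True),
    CollectionType("Enterprise Assets", "Enterprise Asset", True),
    CollectionType("Datatypes", "Datatype", True),
    CollectionType("Data Assets", "Data Asset", True),
    CollectionType("Big Data Assets", "Big Data Asset", True),
    CollectionType("Lineage Model", "Lineage Model", True),
]


def can_create(type_name, existing_count):
    """Hypothetical check: may another collection of this type be added?"""
    ctype = next(c for c in COLLECTION_TYPES if c.name == type_name)
    return ctype.multiple_allowed or existing_count == 0


singletons = [c.name for c in COLLECTION_TYPES if not c.multiple_allowed]
print(singletons)
```

Governance Model is the only type flagged as a singleton here, matching the description above.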
TopBraid EDG ontology models are ready for use out of the box. They are partitioned: “EDG Schema – Base” and “EDG Schema – Core” contain commonly used classes, and further ontologies correspond to each asset collection type, e.g., “EDG Schema – Data Assets”. You can see these models below in the Includes dialog in EDG. “Other” in the Collection Type column means that the graph is in a file in the workspace and not in an asset collection.
When you create a new asset collection, an ontology model for that type of collection is automatically included. You can always view the included classes and property definitions by adding the Class Hierarchy panel to the Editor view.
Models are open for extension and modification. You can look at the underlying models from any asset collection, but you cannot modify them from within, say, a Glossary or a Data Assets collection: those collections contain the business terms and data assets respectively, and the ontologies are simply included by reference. To modify one of the EDG ontologies, create a new Ontology asset collection in EDG and include in it the EDG ontology you want to extend or modify. After doing this, you will be able to:
- Create new classes as subclasses of EDG classes
- Add new property definitions
- Deactivate the pre-built property definitions – this may be necessary if you do not want to use some of the pre-defined properties
- “Hide” classes you don’t want to use
- Use SHACL node shapes to build role-specific views for some of the classes
- And more …
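To give a feel for what such an extension might look like at the SHACL level, the sketch below generates a small Turtle snippet that declares a new subclass with one property and deactivates an existing property shape. The `ex:` and `base:` namespaces, the class and property names, and the `Class-property` naming of property shapes are all placeholders chosen for illustration; the real EDG namespaces and identifiers will differ, and in practice you would normally make these changes through the EDG user interface rather than by hand.

```python
PREFIXES = """\
@prefix ex:   <http://example.org/ext#> .
@prefix base: <http://example.org/edg-placeholder#> .
@prefix owl:  <http://www.w3.org/2002/07/owl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix sh:   <http://www.w3.org/ns/shacl#> .
@prefix xsd:  <http://www.w3.org/2001/XMLSchema#> .
"""


def subclass_with_property(new_class, parent, prop, datatype="xsd:string"):
    """Turtle for a new class under `parent` with one new property shape."""
    shape = f"ex:{new_class}-{prop}"
    return (
        f"ex:{new_class}\n"
        f"    a owl:Class, sh:NodeShape ;\n"
        f"    rdfs:subClassOf base:{parent} ;\n"
        f"    sh:property {shape} .\n"
        f"\n"
        f"{shape}\n"
        f"    a sh:PropertyShape ;\n"
        f"    sh:path ex:{prop} ;\n"
        f"    sh:datatype {datatype} ;\n"
        f"    sh:maxCount 1 .\n"
    )


def deactivate(property_shape):
    """Turtle that switches off an existing property shape."""
    return f"{property_shape} sh:deactivated true .\n"


turtle = "\n".join([
    PREFIXES,
    # Hypothetical subclass of Requirement with one extra property.
    subclass_with_property("SecurityRequirement", "Requirement", "threatLevel"),
    # Hypothetical pre-built property shape we do not want to use.
    deactivate("base:Requirement-someProperty"),
])
print(turtle)
```

The terms `rdfs:subClassOf`, `sh:property`, `sh:path`, `sh:datatype`, `sh:maxCount` and `sh:deactivated` are standard RDFS and SHACL vocabulary; everything in the `ex:` and `base:` namespaces is made up for this example.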