This blog discusses approaches for using and extending the FIBO Vocabulary (FIBO-V) released by the EDM Council. FIBO contains definitions for many commonly used financial terms, making it a great resource for financial services firms that want to establish a business glossary or enrich one they already have.
FIBO is available as a full ontology and as a simpler taxonomy of terms. Recently, we have been hearing from some financial services firms that are not interested in the full logic of the FIBO ontology. Instead, they prefer to use FIBO as a business glossary or a controlled vocabulary of terms. FIBO-V was created to address this use case.
FIBO-V has a little over 1,700 concepts. Any pre-built vocabulary, no matter how large, can’t fully address users’ needs without some modifications. In adopting FIBO-V, firms will typically need to extend it by adding new terms specific to them and/or by adding information to already existing terms. The design of FIBO-V, however, presents certain usability challenges for business users needing to extend or update it. We will discuss these challenges and describe three approaches for addressing them.
At the end of the blog, you will see the download link for the file containing the work we did to conveniently support editing of FIBO-V. Alternatively, you could follow the steps described here yourself to produce the same result. Everything we did for the third and final approach is supported by the user interface of TopBraid EDG. There is no need to write queries or to manually wrangle with files.
With this, let’s start by taking a look at FIBO and its different distributions.
FIBO and FIBO Vocabulary (FIBO-V)
FIBO stands for Financial Industry Business Ontology. It was created by the members of the EDM Council in collaboration with Object Management Group (OMG). To quote its creators, FIBO was developed to help financial firms meet regulatory requirements as a:
<…> standard ontology language that enables a common understanding of data among global financial institutions and regulators. FIBO describes the structure of financial instruments, legal entities and pricing and financial processes, enabling more efficient and flexible on-demand and scenario-based reporting across the full data lifecycle.
FIBO is available for download from the EDM Council’s website in two separate RDF distributions: RDFS/OWL as “FIBO OWL” and in SKOS as “FIBO Vocabulary”. In SKOS, every class in FIBO OWL is turned into an instance of the skos:Concept class. FIBO Vocabulary is delivered as a SKOS Taxonomy.
Let’s look at one of the FIBO OWL classes – Securities Offering:
The class is defined using the so-called OWL and RDFS axioms. These are statements that describe what properties should be present for members of a class and what values they should have. For example, one of the definitions above says that any member of the Securities Offering class must have at least one value in its has governing jurisdiction property that is a member of the Jurisdiction class. The statement is a logical restriction describing how members of two classes may be connected. These restrictions are displayed on the form above in the Definitions section.
The diagram below depicts the Securities Offering class.
If we were to create an instance of the Securities Offering class, we would be providing values for the properties defined in the model. For example, we would connect the offering with a specific jurisdiction. Some of the properties such as applies to and confers would be available for a securities offering because they are defined for the parent classes of the Securities Offering class e.g., Agreement and Offering.
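To make the restriction concrete, here is a minimal Python sketch of what "at least one has governing jurisdiction value that is a member of Jurisdiction" means when it is read as a data check. It assumes triples are held as plain tuples. The `ex:` identifiers and the function name `satisfies_restriction` are illustrative placeholders, not actual FIBO IRIs or TopBraid EDG APIs. The check is a closed-world simplification, not OWL reasoning.

```python
# Hypothetical instance data as (subject, property, object) tuples.
triples = [
    ("ex:offering1", "rdf:type", "ex:SecuritiesOffering"),
    ("ex:offering1", "ex:hasGoverningJurisdiction", "ex:delaware"),
    ("ex:delaware", "rdf:type", "ex:Jurisdiction"),
    ("ex:offering2", "rdf:type", "ex:SecuritiesOffering"),
]

def satisfies_restriction(instance, prop, filler_class, triples):
    """True if `instance` has at least one `prop` value typed as `filler_class`."""
    members = {s for s, p, o in triples if p == "rdf:type" and o == filler_class}
    return any(
        s == instance and p == prop and o in members for s, p, o in triples
    )

ok_1 = satisfies_restriction(
    "ex:offering1", "ex:hasGoverningJurisdiction", "ex:Jurisdiction", triples)
ok_2 = satisfies_restriction(
    "ex:offering2", "ex:hasGoverningJurisdiction", "ex:Jurisdiction", triples)
print(ok_1, ok_2)
```

In this sketch the first offering is connected to a jurisdiction and passes, while the second has no such value and fails.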
Now, let’s see how the Securities Offering is represented in FIBO-V. FIBO-V transforms all classes into concepts i.e., instances of the skos:Concept class.
Logical restrictions relating pairs of FIBO classes are transformed into simple relationships between corresponding concepts in FIBO-V. This information is displayed on the form above in the Undeclared Properties section. We see, for example, that the concept Securities Offering is connected to the concept Jurisdiction via has governing jurisdiction property.
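The transformation described above can be pictured with a short Python sketch. It assumes a deliberately simplified representation in which each OWL restriction is reduced to a (class, property, filler class) tuple. The `ex:` identifiers and the function `to_skos` are hypothetical, and this is not the conversion code used to produce FIBO-V.

```python
# Hypothetical, simplified restrictions: (class, property, filler class).
restrictions = [
    ("ex:SecuritiesOffering", "ex:hasGoverningJurisdiction", "ex:Jurisdiction"),
]
owl_classes = ["ex:SecuritiesOffering", "ex:Jurisdiction"]

def to_skos(owl_classes, restrictions):
    """Turn classes into skos:Concept instances and restrictions into plain links."""
    triples = [(c, "rdf:type", "skos:Concept") for c in owl_classes]
    # Each restriction becomes a direct concept-to-concept statement.
    triples += [(cls, prop, filler) for cls, prop, filler in restrictions]
    return triples

skos_triples = to_skos(owl_classes, restrictions)
for triple in skos_triples:
    print(triple)
```

The result contains no logic about cardinality or class membership. It only records that one concept is linked to another through a named property.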
You may wonder why this information is displayed under undeclared properties.
Declared properties are those that have been defined for a given class using property shapes. SKOS defines about two dozen properties for its Concept class. For example, preferred label, alternative label, definition, hierarchical relationships, etc. FIBO-V uses some of these properties to describe concepts. It also uses over 450 of its own, “custom” properties without explicitly declaring that a concept may have such properties. Overall, FIBO defines close to 600 properties, but more than 100 of them are never used.
RDF is a flexible graph data model. We can connect any resource to any other resource (or to a literal value) using any property without specifying what properties are available for a resource of a given type. However, not defining the model/schema for our data brings many disadvantages. For example, while simply looking at a given concept is not an issue, editing data becomes complicated.
Each concept in FIBO-V has at most 20 fields (properties) that contain data. Most concepts have 10 or fewer fields with values. If a user just browses the pre-built vocabulary, the data looks fine. To see values of the “undeclared properties” on forms in TopBraid EDG, select this as an option in the form’s Settings menu.
However, a user who needs to create or update information will not get a form that supports editing for the “undeclared properties”. Instead, the user will have to manually identify each property for which they want to create a value. Without the model, there is no way for a tool to know what data should be there. When the list of available properties is over 600, always having to pick out the few properties you actually need can be frustrating and error-prone.
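The distinction between declared and undeclared properties can be sketched in a few lines of Python. The set of declared properties below is a small illustrative subset of SKOS, and `ex:hasGoverningJurisdiction`, `ex:Offering` and `split_by_declaration` are hypothetical names. This illustrates the idea only and does not describe how TopBraid EDG builds its forms.

```python
# Properties declared for the class via property shapes (illustrative subset).
declared = {"skos:prefLabel", "skos:definition", "skos:broader"}

# Hypothetical statements about one concept: (property, value).
concept_statements = [
    ("skos:prefLabel", "securities offering"),
    ("skos:broader", "ex:Offering"),
    ("ex:hasGoverningJurisdiction", "ex:Jurisdiction"),
]

def split_by_declaration(statements, declared):
    """Partition statements into declared ones and undeclared ones."""
    shown, undeclared = [], []
    for prop, value in statements:
        (shown if prop in declared else undeclared).append((prop, value))
    return shown, undeclared

shown, undeclared = split_by_declaration(concept_statements, declared)
print("declared:", shown)
print("undeclared:", undeclared)
```

A tool can offer edit fields for everything in the declared set, including properties that have no value yet. For the undeclared group it can only display values that happen to exist.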
Let’s consider how this issue may be addressed.
Creating the Vocabulary Editing Environment for FIBO-V
Importing FIBO Vocabulary Download “as-is”
Note that the download file contains not only information about concepts, but also some limited information about properties. For example, here is what it contains for the is legally recorded in property:
fibo-v-fbc:isLegallyRecordedIn
  rdf:type owl:ObjectProperty ;
  rdfs:isDefinedBy fibo-fbc-fi-fi:isLegallyRecordedIn ;
  rdfs:label "is legally recorded in" ;
  rdfs:subPropertyOf skos:related ;
  dct:isPartOf <https://spec.edmcouncil.org/fibo/FBC/AboutFBC/FBCSpecification> ;
  skos:definition "jurisdiction (country, county, state, province, city) in which the financial instrument is legally recorded for regulatory and/or tax purposes" ;
  fibo-vocabulary:hasDomain "Financial Business and Commerce" ;
  fibo-vocabulary:hasSubDomain "FinancialInstruments" .
Since the file defines properties (i.e., resources of type owl:ObjectProperty and owl:AnnotationProperty), TopBraid EDG will detect that the file mixes data with the model/schema statements. If you try to load the file into a taxonomy in EDG, depending on the import option you have used, EDG may ask you to import this data into an ontology. The requirement to keep information about properties and classes separate from the information about data is not unique to TopBraid EDG. It is a good practice common to many vocabulary management tools. This validation check can be bypassed if you use the fast (streaming) RDF importer.
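As a tool-independent illustration of this separation, the following Python sketch splits a mixed set of triples into schema statements (about properties) and data statements (about concepts). The tuples, the `ex:` identifiers and the function `split_schema_from_data` are assumptions made for the example. In EDG itself, the equivalent work is done through the user interface.

```python
# Types that mark a resource as part of the model rather than the data.
SCHEMA_TYPES = {"owl:ObjectProperty", "owl:AnnotationProperty"}

# Hypothetical mixed content, similar in spirit to the download file.
triples = [
    ("ex:isLegallyRecordedIn", "rdf:type", "owl:ObjectProperty"),
    ("ex:isLegallyRecordedIn", "rdfs:label", "is legally recorded in"),
    ("ex:SecuritiesOffering", "rdf:type", "skos:Concept"),
    ("ex:SecuritiesOffering", "skos:prefLabel", "securities offering"),
]

def split_schema_from_data(triples):
    """Return (ontology triples, taxonomy triples)."""
    # Subjects typed as properties are treated as schema resources.
    schema_subjects = {
        s for s, p, o in triples if p == "rdf:type" and o in SCHEMA_TYPES
    }
    ontology = [t for t in triples if t[0] in schema_subjects]
    taxonomy = [t for t in triples if t[0] not in schema_subjects]
    return ontology, taxonomy

ontology, taxonomy = split_schema_from_data(triples)
print(len(ontology), "schema statements,", len(taxonomy), "data statements")
```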
To separate statements about properties from data, we took the following steps:
- Created an ontology in EDG and included SKOS shapes in it – we called it FIBO-V ontology
- Created a taxonomy in EDG and included the FIBO-V ontology in it.
- Used the fast (streaming) RDF importer to load FIBO-V download into our taxonomy, then moved properties (object and annotation) into the FIBO-V ontology.
We used Transform>Move instances to perform this operation.
Alternatively, you could load the file into the FIBO-V ontology and then move concepts from it into a taxonomy. This is arguably a better sequence since the prefix declarations from the file are added to the ontology rather than a taxonomy. However, we noticed that some of the concepts in the file are missing types. Thus, they would not be moved from the ontology. More on this later in the blog when we talk about quality and consistency.
- Selected SKOS Concept class in the FIBO-V ontology and used Modify>Derive property shapes from instances.
This auto-creates shapes that add property definitions to the SKOS Concept class based on our concept data in the taxonomy.
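Conceptually, deriving property definitions from instances amounts to collecting the distinct properties that instances of a class actually use. The Python sketch below shows that idea under simplifying assumptions: triples are plain tuples, `ex:` identifiers are placeholders, and `derive_properties` is a hypothetical function rather than the EDG implementation.

```python
# Hypothetical taxonomy content as (subject, property, object) tuples.
triples = [
    ("ex:SecuritiesOffering", "rdf:type", "skos:Concept"),
    ("ex:SecuritiesOffering", "skos:prefLabel", "securities offering"),
    ("ex:SecuritiesOffering", "ex:hasGoverningJurisdiction", "ex:Jurisdiction"),
    ("ex:Jurisdiction", "rdf:type", "skos:Concept"),
    ("ex:Jurisdiction", "skos:prefLabel", "jurisdiction"),
    # No rdf:type statement: this resource is invisible to the derivation.
    ("ex:Untyped", "ex:someProperty", "ex:Jurisdiction"),
]

def derive_properties(triples, target_class):
    """Collect the distinct properties used by instances of `target_class`."""
    instances = {
        s for s, p, o in triples if p == "rdf:type" and o == target_class
    }
    return sorted(
        {p for s, p, o in triples if s in instances and p != "rdf:type"}
    )

derived = derive_properties(triples, "skos:Concept")
print(derived)
```

In this sketch, a property that is never used does not appear in the result, and a resource without a type statement contributes nothing.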
We can’t show the resulting class diagram for the SKOS Concept class because, with 500 properties, it would not fit into a readable diagram. As a result of these operations, we can now get an editable form in TopBraid EDG that will fully support adding new or modifying existing FIBO concepts.
However, the form for any concept would contain roughly 500 fields – a couple of dozen properties from SKOS and the rest from FIBO itself. This does not create the user-friendly environment we want. Note that since we auto-generated shapes, shapes for properties that are not used did not get created. Otherwise, there would be 100 more property shapes.
To improve on this, we tried three approaches described below.
1. Adding Classes
Our goal was to find a way to “break up” the properties so that the edit and search forms are not as busy. Instead of associating all FIBO properties with the SKOS Concept class, we attempted to create a light hierarchy of classes in the hope that this could sufficiently narrow the properties available for members of each class. With this strategy, the obvious question is “what classes to create?”
Each FIBO-V concept has two properties: domain and subdomain. Their values are strings. For example, subdomain for Securities Offering is “Securities”. We decided to create classes corresponding to values of these properties plus a parent class FIBO Concept as a subclass of SKOS Concept. We wanted a root class because we did not want to be adding properties specific to FIBO to the generic SKOS Concept class.
Below is the resulting class hierarchy of about 40 classes:
We then changed the type of each concept in the taxonomy to the corresponding subdomain class and ran the shape derivation process. This time, we did this for the entire ontology by clicking on the “home” icon to navigate to the asset collection resource and using Modify>Derive missing classes. This operation adds new classes if any are missing. It also adds property shapes for the pre-existing or new classes.
This did narrow the properties available for each type of concept, but not enough. Each concept type now had between 150 and 200 properties. This is fewer than 500, but it is still too many. And, of course, this does not include properties that are not being used since we do not know where they should be used. The screenshot below partially shows how busy the edit form for the Securities Offering concept still looked after this change. The screenshot shows only 20 properties – more are available as you scroll down.
Most of the 200 available fields would not be filled out for this (or any other) concept. We did not feel that presenting users with such busy forms was a workable option.
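The reason the forms stay busy is that each subdomain class receives the union of all properties used by any of its members. The following Python sketch shows this effect on a toy scale. The concept names, the properties and the function `properties_per_class` are hypothetical. Only the subdomain strings "Securities" and "FinancialInstruments" are taken from the text above.

```python
from collections import defaultdict

# Hypothetical concepts: (subdomain value, set of properties the concept uses).
concepts = {
    "ex:SecuritiesOffering": (
        "Securities", {"ex:hasGoverningJurisdiction", "ex:confers"}),
    "ex:Share": ("Securities", {"ex:confers", "ex:isIssuedBy"}),
    "ex:FinancialInstrument": (
        "FinancialInstruments", {"ex:isLegallyRecordedIn"}),
}

def properties_per_class(concepts):
    """Union the properties of all concepts that share a subdomain class."""
    per_class = defaultdict(set)
    for subdomain, props in concepts.values():
        per_class[subdomain] |= props
    return dict(per_class)

per_class = properties_per_class(concepts)
for name, props in sorted(per_class.items()):
    print(name, len(props))
```

Each concept in the "Securities" group uses two properties, yet the class ends up with three. With hundreds of concepts per class, the same effect produces forms with far more fields than any single concept needs.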
We then tried a somewhat different approach. We created a class for each of the “top level” concepts i.e., for each concept that is a direct child of the FIBO Concept Scheme. We then made the corresponding concept an instance of the class. Each hierarchical child of this concept also became an instance of the same class.
This approach created about 130 classes. Still, the number of properties for many of the classes was larger than we believed would be acceptable. Further, since the shapes are derived “bottom up” from the data about all children, too often the properties presented did not necessarily make sense for a given concept. Fixing this would require creating additional new classes that are more specific to the concepts. Thus, we felt that, by itself, this was not the right solution.
A class-based approach works well when one can identify large groups of similar concepts that all have the same kinds of properties. For example, in medical vocabularies, these can be classes like Disease, Symptom, Drug, Chemical Substance or Medical Procedure, each containing a hierarchy of instances with the same properties. The broad content and structure of FIBO does not align well with this approach. We would end up with too many classes relative to the number of instances.
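Assigning every concept to the class of its top-level ancestor can be sketched as a walk up the skos:broader links. The Python example below assumes, for simplicity, that each concept has at most one broader concept. The hierarchy shown is hypothetical and may not match the actual FIBO-V hierarchy, and `top_level_ancestor` is an illustrative name.

```python
# Hypothetical single-parent hierarchy: child -> parent (skos:broader).
broader = {
    "ex:SecuritiesOffering": "ex:Offering",
    "ex:Offering": "ex:Agreement",
}

def top_level_ancestor(concept, broader):
    """Follow skos:broader links until a concept with no parent is reached."""
    seen = set()  # guards against accidental cycles in the data
    while concept in broader and concept not in seen:
        seen.add(concept)
        concept = broader[concept]
    return concept

for c in ["ex:SecuritiesOffering", "ex:Offering", "ex:Agreement"]:
    print(c, "->", top_level_ancestor(c, broader))
```

All three concepts land in the same class, so that class collects the properties of the whole branch.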
2. Consolidating Properties
Since our goal was to have a more manageable number of properties for a given concept, we considered whether a combination of adding classes and consolidating some properties could work. For example, FIBO has applies to and applies to account properties. The object of applies to account is always Account. Thus, it would seem that having just applies to should be enough – at least, in the context of FIBO-V.
Here are a few additional examples.
In the screenshot above, we see properties like has posting date and has transaction date with values Posting Date and Transaction Date, which raises the question of whether a more generic date property would be sufficient. The nature of the dates is already conveyed by the properties’ values.
Below is another example – the Interest Payment Terms concept with has initial interest accrual date, has initial interest payment date and has interest payment frequency properties.
The next screenshot shows the Principal Payment Terms concept with mirroring properties, except that “interest” is replaced by “principal”. However, it is clear from the term itself that in one case we are describing interest and in another case we are describing principal payment terms.
We believe there are opportunities to rationalize and consolidate FIBO properties. Combining creation of some classes with consolidation of properties may yield a workable solution for editing. This would mean removing some properties from FIBO and updating statements that are using them. Some organizations may decide to do this, but we did not feel comfortable making the judgement calls required for property consolidation.
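For organizations that do decide to consolidate, the update itself is mechanical once the judgement calls are made. The Python sketch below rewrites statements using a mapping from specific to generic properties. The mapping entry mirrors the applies to account example above, but the `ex:` identifiers, the sample concept and the function `consolidate` are assumptions made for illustration.

```python
# Hypothetical consolidation map: specific property -> generic property.
CONSOLIDATE = {"ex:appliesToAccount": "ex:appliesTo"}

# Hypothetical statements before consolidation.
triples = [
    ("ex:AccountFee", "ex:appliesToAccount", "ex:Account"),
    ("ex:AccountFee", "skos:prefLabel", "account fee"),
]

def consolidate(triples, mapping):
    """Replace each specific property with its generic counterpart."""
    return [(s, mapping.get(p, p), o) for s, p, o in triples]

consolidated = consolidate(triples, CONSOLIDATE)
for triple in consolidated:
    print(triple)
```

The hard part is deciding what belongs in the mapping, which is exactly the judgement call discussed above.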
3. Different Treatment For Different Properties
Looking into property consolidation caused us to analyze how often each property is actually used. To do this, we ran the query shown below. Note that properties that are shown in the SPARQL Results as long URIs are used in the FIBO-V data, but have no property definition. Thus, there are no labels to show.
Our analysis revealed that out of 486 properties that are being used, nearly 300 are used only for a single concept. Another hundred properties are used only for two or three concepts. Only 21 properties are used for more than 18 concepts. Of these 21, only 10 are specific to FIBO; the rest come from SKOS or RDF/RDFS. Except for a handful of annotation properties, all FIBO properties are defined to be subproperties of skos:related.
The image below shows how quickly the property usage dwindles beyond a few very broadly used properties. The table is sorted by count, from the largest to the smallest number. The highlighted row identifies the “last” property that is used more than 18 times.
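The same kind of usage analysis can be approximated outside of SPARQL. The Python sketch below counts, for each property, the number of distinct concepts that use it, sorts the result from most to least used, and applies the cutoff of 18. The data is synthetic, and the function names `usage_counts` and `broadly_used` are hypothetical. It is not the query we ran.

```python
from collections import defaultdict

# Synthetic data: 20 concepts with a label, one with a rarely used property.
triples = [
    (f"ex:concept{i}", "skos:prefLabel", f"concept {i}") for i in range(20)
]
triples.append(("ex:concept0", "ex:rareProperty", "ex:concept1"))

def usage_counts(triples):
    """Count, per property, the number of distinct subjects that use it."""
    subjects_by_property = defaultdict(set)
    for s, p, o in triples:
        subjects_by_property[p].add(s)
    counts = [(p, len(subs)) for p, subs in subjects_by_property.items()]
    # Largest count first; ties are broken alphabetically.
    return sorted(counts, key=lambda item: (-item[1], item[0]))

def broadly_used(counts, threshold=18):
    """Keep the properties used by more than `threshold` concepts."""
    return [p for p, n in counts if n > threshold]

counts = usage_counts(triples)
print(counts)
print(broadly_used(counts))
```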
Based on this analysis, we decided on the following approach:
- In the ontology, create class FIBO Concept as a subclass of skos:Concept so that FIBO-specific property definitions could be added to it.
- Add to the class property shapes defining a dozen of the broadly used FIBO properties, including all annotations, plus the shapes for rdfs:isDefinedBy and dct:isPartOf properties.
The cutoff to include any relationship that is used more than 18 times (i.e., for more than 1% of concepts) is arbitrary. Alternatively, we could have created shapes only for properties that are always used plus the annotations. This would result in creating just 7 property shapes instead of 14.
- Create class FIBO Relationship as a subclass of owl:ObjectProperty.
We are doing this to support selecting these properties as annotations for skos:related.
- Add a FIBO Relationship type statement to all the object properties defined by FIBO except for the commonly used relationships we created property shapes for.
The easiest way to do this is to use the RDFS/OWL Properties List panel. Select all object properties in the panel, then deselect the few for which property shapes have been defined and use Edit assets from the <…> (more) menu to add the type.
- Add a view node shape reifying skos:related – to let users annotate this relationship by selecting a more specific FIBO property and adding other information, if they need to.
See our recent RDF-star blog on how to create view shapes to support adding facts about facts.
Operations above completed creation of the ontology for FIBO-V. Here is the diagram showing the class FIBO Concept.
The only change we made to the taxonomy was to change the concept types to the FIBO Concept from the SKOS Concept. We accomplished this by running simple update queries. As an alternative to writing a query, you could select all SKOS Concepts in the Concept Search panel and use Edit assets.
If you do not want to change types of the FIBO concepts, you could add property definitions directly to skos:Concept. However, generally speaking, we recommend adding a custom class when extending SKOS.
Let’s take a look at the results. Below is a concept that, like most FIBO concepts, only has a few broadly used properties. These FIBO specific properties are shown in the FIBO Properties section of the form.
Now, let’s take a look at a concept that is described using some of the less common properties for which we have not defined shapes. Values of these properties show under the Undeclared Properties section.
To avoid further changes to FIBO-V, we kept already existing values of the less commonly used properties as-is. In other words, we did not convert them into the skos:related links with annotations. For the purpose of this exploration, we assumed that there will be no need to edit these values. Having said this, for consistency, we could run a query to turn them into values of skos:related.
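The conversion mentioned above, which we did not perform, could look roughly like the following Python sketch. It assumes that an annotated link can be modeled as a small record holding the skos:related value together with the more specific relationship. `AnnotatedRelated`, `to_annotated` and the `ex:` identifiers are hypothetical and do not reflect how EDG stores reified statements.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class AnnotatedRelated:
    subject: str
    related: str        # the value of skos:related
    relationship: str   # the more specific FIBO property, kept as annotation
    note: str = ""      # optional free-text note

def to_annotated(triples, rare_relationships):
    """Rewrite rarely used relationships as annotated skos:related links."""
    kept, annotated = [], []
    for s, p, o in triples:
        if p in rare_relationships:
            annotated.append(AnnotatedRelated(s, o, p))
        else:
            kept.append((s, p, o))
    return kept, annotated

# Hypothetical statements about one concept.
triples = [
    ("ex:SecuritiesOffering", "skos:prefLabel", "securities offering"),
    ("ex:SecuritiesOffering", "ex:hasGoverningJurisdiction", "ex:Jurisdiction"),
]
kept, annotated = to_annotated(triples, {"ex:hasGoverningJurisdiction"})
print(kept)
print(annotated)
```

No information is lost in the rewrite: the original property survives as the annotation on the generic link.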
We will now add a relationship using our new approach. Right now, the form only shows declared properties that have values. (This is a setting that could be changed under the Settings menu.) When we press the Edit button, all applicable properties show up. We can pick a value for skos:related and annotate the value with additional information.
Note that in selecting annotations for skos:related, users can pick any of the FIBO relationships – including those that have not yet been used. Since the commonly used properties all have pre-defined shapes, selecting among other properties for annotation does not present much inconvenience. It will be done relatively seldom.
We provided two more fields for additional information:
- free text note for any relevant information
- field to type in the relationship name as text – in case an appropriate relationship does not exist yet or a user can’t find it.
This input can be used by the vocabulary curation team to decide, for example, to create a new relationship or give a better name or definition to an already existing relationship.
Let’s return to our original concept – Securities Offering, and add a new relationship to it. Since this concept already has a lot of information, for convenience of taking screenshots and editing, we switched to the Tab-oriented display of the form (another option under the Settings menu) and added a new relationship. Here is how it looks when saved.
The only item left to show is creating a new concept. This is done by selecting a parent and clicking on the option to add a child. As you can see, the Create New dialog includes the required fields – the fields that we, in the model, specified as needing at least one value.
We feel that the approach presented in this option addresses all the requirements for being able to adopt and evolve FIBO-V. It is our preferred and recommended approach for extending FIBO-V.
Ensuring Quality and Consistency of the Vocabulary
We prepared an export of the RDF file implementing the third and final approach.
Prior to exporting, we ran a consistency check on both the ontology and the taxonomy. We found over 100 issues. Issues were primarily of two kinds:
- Missing human readable labels for resources – this was true for many FIBO properties and for a few concepts
- Missing any type for a resource e.g., a resource has the skos:broader relationship making it possible to guess that it should be a concept, but there is no type statement
TopBraid EDG makes it easy to not only identify such problems, but also to fix them by applying suggested fixes one by one or all at once. The screenshot below shows the output of the validation prior to applying fixes.
This is another example of why having a model is important. It defines what data should look like, making it much easier to deliver high quality controlled vocabularies that comply with the enterprise standards.
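The two kinds of issues described above can be illustrated with a simple Python check. It flags resources without a label and resources without a type, and for the latter it suggests skos:Concept when a skos:broader link is present. The data, the choice of label properties and the function `find_issues` are assumptions made for the example. EDG's validation is based on the model and is considerably richer.

```python
LABEL_PROPERTIES = {"rdfs:label", "skos:prefLabel"}

# Hypothetical content with one unlabeled property and one untyped concept.
triples = [
    ("ex:SecuritiesOffering", "rdf:type", "skos:Concept"),
    ("ex:SecuritiesOffering", "skos:prefLabel", "securities offering"),
    ("ex:someRelationship", "rdf:type", "owl:ObjectProperty"),
    ("ex:Orphan", "skos:prefLabel", "orphan"),
    ("ex:Orphan", "skos:broader", "ex:SecuritiesOffering"),
]

def find_issues(triples):
    """Return (resource, problem, suggested type or None) tuples."""
    subjects = sorted({s for s, p, o in triples})
    typed = {s for s, p, o in triples if p == "rdf:type"}
    labeled = {s for s, p, o in triples if p in LABEL_PROPERTIES}
    has_broader = {s for s, p, o in triples if p == "skos:broader"}
    issues = []
    for s in subjects:
        if s not in labeled:
            issues.append((s, "missing label", None))
        if s not in typed:
            # A skos:broader link hints that the resource is a concept.
            suggestion = "skos:Concept" if s in has_broader else None
            issues.append((s, "missing type", suggestion))
    return issues

issues = find_issues(triples)
for issue in issues:
    print(issue)
```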
Two issues were left in the taxonomy as we were not sure of the best way to resolve them. To see the issues, switch to the Problems and Suggestions layout and run validation.
In this blog, we demonstrated how TopBraid EDG can be configured (just through modeling) to support a rather unusual use of SKOS taxonomies. If you would like to test the recommended approach described in this blog, download the TRIG file containing:
- FIBO-V Ontology
- Modified FIBO-V Taxonomy that does not contain properties and uses the FIBO Concept class
There is one remaining question that is interesting to raise:
Should the seldom used properties even remain as properties in FIBO Vocabulary?
An alternative would be to create a subclass of FIBO Concept called FIBO Relationship Concept. Looking at the information about the fibo-v-fbc:isLegallyRecordedIn property (shown earlier in the blog), you see that it represents a business term. With the approach defined here, it does not need to be a property. It can simply be another concept used to annotate values of skos:related property. If FIBO-V is to be used as a business glossary, the terms it contains have to support mapping data assets to business terms. This brings up questions like:
- What term do you map the data element containing posting dates – to the Posting Date FIBO Concept or to the has posting date FIBO Relationship?
- When users need to understand the meaning of the “posting date” concept, which concept and definition do they look at?
These use cases need to be taken into account when thinking about consolidation of properties.
Further to this topic, some users may want to represent FIBO-V as a Glossary in TopBraid EDG instead of the SKOS Taxonomy. If you are using TopBraid EDG Metadata Management package, this would be the preferred route to take. It can be accomplished by making FIBO concepts members of the edg:BusinessTerm class. This class borrows some of the most commonly used SKOS properties such as skos:broader. As a result, little to no additional transformation would be required. edg:BusinessTerm also has a number of other useful properties, commonly needed when creating a business glossary to support data governance initiatives.
One final thought is that, when using the approach presented here, statements like Securities Offering is governed by Security Regulation can be generated for display if necessary. The edg:BusinessTerm class has an edg:businessRule property that is designed to hold “Verbal description of the operations, definitions and constraints that apply to the glossary term”. Using a SHACL Property Value Rule, text strings such as “Securities Offering is governed by Security Regulation” can be dynamically generated as values of this property from the skos:related values and their annotations.
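The string generation itself is straightforward. The Python sketch below is a stand-in for what such a rule would compute and is not SHACL. It assumes each annotated link is available as a (subject, related concept, relationship) tuple and that labels can be looked up in a dictionary. The `ex:` identifiers and the function `business_rules` are hypothetical.

```python
# Hypothetical labels for the resources involved.
labels = {
    "ex:SecuritiesOffering": "Securities Offering",
    "ex:isGovernedBy": "is governed by",
    "ex:SecurityRegulation": "Security Regulation",
}

# Hypothetical annotated links: (subject, skos:related value, relationship).
annotated_links = [
    ("ex:SecuritiesOffering", "ex:SecurityRegulation", "ex:isGovernedBy"),
]

def business_rules(links, labels):
    """Render each annotated link as a readable sentence."""
    return [
        f"{labels[subject]} {labels[relationship]} {labels[related]}"
        for subject, related, relationship in links
    ]

rules = business_rules(annotated_links, labels)
print(rules)
```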