Q&A from the WEBINAR: “What are Knowledge Graphs? Why are they Important for Data Governance?”

by Robert Coyne | Mar 18, 2019 | Uncategorized, The Semantic Ecosystems Journal, Linked Data, Data Governance, TopBraid EDG, Semantic, metadata management, Governance, Knowledge Graphs

We fielded several questions as part of our recent webinar, “What are Knowledge Graphs? Why are they Important for Data Governance?” (recording and slides available here).

Questions from the webinar included:

Q1: Can we manage discrete separate knowledge graphs?

Yes. Each asset collection (e.g., a specific glossary) is created as its own graph. You can then flexibly include these collections/graphs into each other, creating larger assemblies of graphs.
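
To illustrate how included graphs assemble into a larger one, here is a minimal Python sketch. It is not EDG's implementation: the collection names and the INCLUDES table are hypothetical, and each graph is reduced to just its name.

```python
from collections import deque

# Hypothetical asset collections: each graph name maps to the graphs it includes.
INCLUDES = {
    "enterprise-glossary": ["itil-glossary", "sec-glossary"],
    "itil-glossary": [],
    "sec-glossary": ["finance-taxonomy"],
    "finance-taxonomy": [],
}

def assembled_graphs(root, includes):
    """Return root plus every graph reachable through includes, breadth-first."""
    seen = [root]
    queue = deque([root])
    while queue:
        current = queue.popleft()
        for child in includes.get(current, []):
            if child not in seen:  # guards against cycles and repeated includes
                seen.append(child)
                queue.append(child)
    return seen

print(assembled_graphs("enterprise-glossary", INCLUDES))
```

Each graph stays a separate unit, and the assembly is computed by following the include links.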

Q2: Do you have a way to select one of the vocabulary terms from the different sources as a “preferred term” within the context of the enterprise?

We see this as possible additional metadata on the terms. There is a pre-built property for Terms called “used by organization”. It can indicate which group within the enterprise uses a specific term, e.g., the sales department, the support department, etc. For example, we could have said that “Client” from the ITIL glossary was used by the “Information Technology Division” and “Client” from the SEC glossary was used by “Investment Services”.

This property could potentially be used to say “Enterprise-wide” – meaning that this is the preferred term. Or an additional property can be created as a Boolean to indicate if the term is preferred in the enterprise. Preference can also be contextual, so this may require an additional structure to capture it. With EDG, users can freely extend the underlying models to suit their needs.

Q3: When was the term ‘knowledge graph’ coined or introduced? Any idea of the first usage of the term?

We believe that Google first used the term in 2012, specifically to describe their knowledge base. Today, this term and the related term “Enterprise Knowledge Graph” are used to describe an interconnected (linked) set of information that meaningfully brings together data and metadata silos. For more information on Knowledge Graphs, take a look at this white paper.

Q4: Is client always a Person? Party? Role?

It is up to each enterprise and/or a group in the enterprise to define what they mean by the term Client. In the example we used from ITIL, client is defined as “A generic term that means a Customer, the Business or a Business Customer.” The US Securities and Exchange Commission defines client as “Any of your firm’s investment advisory clients. This term includes clients from which your firm receives no compensation, such as family members of your supervised persons. If your firm also provides other services (e.g., accounting services), this term does not include clients that are not investment advisory clients”.

Human language is highly contextual, and the same word is often used to mean different things. In the EDG knowledge graph, we use the RDF data model and standard, in which each resource has a unique URI. Thus, the identity and meaning of a resource are not determined by its label. There can be two or more different resources called “client” that mean different things, and each will have its own unique, unambiguous identity. You could then specify how these resources relate to each other and to other things in your enterprise.
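
As a minimal sketch of this idea in Python, triples can be modeled as plain tuples. The example.org URIs and the usedBy property are hypothetical stand-ins, not EDG's own identifiers. Two resources share the label “Client” yet remain distinct:

```python
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"

# Hypothetical URIs, invented for this illustration.
ITIL_CLIENT = "http://example.org/itil#Client"
SEC_CLIENT = "http://example.org/sec#Client"
USED_BY = "http://example.org/demo#usedBy"

# Each triple is (subject, predicate, object).
triples = {
    (ITIL_CLIENT, RDFS_LABEL, "Client"),
    (SEC_CLIENT, RDFS_LABEL, "Client"),
    (ITIL_CLIENT, USED_BY, "Information Technology Division"),
    (SEC_CLIENT, USED_BY, "Investment Services"),
}

def resources_labelled(label, graph):
    """Return the URIs of all resources carrying the given label."""
    return sorted(s for (s, p, o) in graph if p == RDFS_LABEL and o == label)

def used_by(resource, graph):
    """Return the organizations recorded as using the given resource."""
    return sorted(o for (s, p, o) in graph if s == resource and p == USED_BY)

for uri in resources_labelled("Client", triples):
    print(uri, "is used by", used_by(uri, triples))
```

The label lookup returns two URIs, and each one carries its own metadata.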

Q5: Who does the mapping between graphs? Can this be automated?

Mapping can be and is automated. For example:
• EDG automatically creates crosswalks between different taxonomies and ontologies and between private and public information such as Wikidata. Click here for an example.
• It can also auto-map data elements to glossary terms based on rules that describe the meaning of a term. Click here for a demo.

Other types of reasoning could also be used to automate the mapping process.
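
To give a feel for rule-based mapping of data elements to glossary terms, here is a deliberately simple Python sketch. It does not reproduce EDG's rules: the column names and regular-expression patterns are hypothetical, and a name pattern is only a crude proxy for a rule that describes a term's meaning.

```python
import re

# Hypothetical rules: glossary term -> pattern that a matching column name must contain.
RULES = {
    "Client": re.compile(r"(client|customer|cust)", re.IGNORECASE),
    "Country Code": re.compile(r"(country|ctry).*(code|cd)", re.IGNORECASE),
}

def map_elements(columns, rules):
    """Map each column name to the glossary terms whose rule matches it."""
    mapping = {}
    for column in columns:
        mapping[column] = [
            term for term, pattern in rules.items() if pattern.search(column)
        ]
    return mapping

for column, terms in map_elements(["CUST_ID", "CTRY_CD", "ORDER_TOTAL"], RULES).items():
    print(column, "->", terms or "no match, needs manual review")
```

Columns that match no rule are left unmapped, so a person can still review them.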

Q6: How far would I be, from a correct understanding of TopBraid EDG, if I call it “real-time semantic enterprise data aggregator”?

EDG is semantic and it crosses silos of enterprise information.

Having said this, “real-time aggregation” of data from multiple sources may mean real-time answering of queries that require on-demand combining of information that lives in different sources. An example is a request for a 360-degree customer view, where a system gets as input some information about a customer and needs to go to multiple databases, get information about this customer, merge it together and present it to the requestor. EDG “out of the box” is a data governance platform. It has capabilities for converting data to RDF, as well as rules and workflows, so one could potentially use it to create a 360-degree view of a customer, but this would require system integration work.

Q7: Does the “edg” namespace contain the key Classes and Properties that EDG provides? Are there any other edg-related namespaces?

Yes, correct.

EDG also uses SKOS for Taxonomies. RDFS, SHACL and OWL are used for ontologies.

There are some “utility” models in the server.topbraidlive.org project, especially under the /dynamic folder in this project. There are also some in the TopBraid project. For example, the /TBC folder contains models used to support conversion from different formats to RDF.

Q8: No exports to OWL?

All information in the EDG knowledge graph is stored as RDF. Information can be exported in any of the standard RDF serializations, at the user’s choice.

In the webinar, we used mostly RDF data and only showed ontologies tangentially. TopBraid EDG supports OWL. OWL is defined in RDF. Thus, one could have an OWL ontology in EDG and export its RDF serialization. Having said this, for a variety of reasons, we are increasingly using the W3C standard SHACL for modeling and offer a transformer from OWL to SHACL.

In this webinar we only showed a small percentage of EDG capabilities. There are a number of export options, ranging from exporting each graph or a set of graphs in RDF to highly focused exports of a subset of graph information. The latter options are typically results of queries and are commonly exported as either JSON (using GraphQL) or tabular result sets (using SPARQL).

Q9: Can temporal effectivity be stated on a term or relation?

Terms and most other EDG assets have pre-built metadata for effective start and end dates. These are defined in the ontologies underlying TopBraid EDG. EDG is fully model driven and additional properties can be defined by users as/if needed. Pre-built properties can be deactivated.

Adding temporal effectivity to a relationship requires reifying the relationship. EDG provides support for this as well.
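
For readers unfamiliar with reification, the following Python sketch shows the general idea using the standard RDF reification vocabulary (rdf:Statement, rdf:subject, rdf:predicate, rdf:object). The example.org namespace, the effectiveStart and effectiveEnd property names, and the sample relationship are hypothetical. EDG's own support for reification may be modeled differently.

```python
from datetime import date

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
EX = "http://example.org/demo#"  # hypothetical namespace for this illustration

def reify(statement_id, subject, predicate, obj, start, end):
    """Describe one relationship as a resource so that dates can be attached to it."""
    node = EX + statement_id
    return [
        (node, RDF + "type", RDF + "Statement"),
        (node, RDF + "subject", subject),
        (node, RDF + "predicate", predicate),
        (node, RDF + "object", obj),
        (node, EX + "effectiveStart", start),  # hypothetical property
        (node, EX + "effectiveEnd", end),      # hypothetical property
    ]

def is_effective(statement, on):
    """Check whether the reified relationship is in effect on the given date."""
    values = {p: o for (_, p, o) in statement}
    return values[EX + "effectiveStart"] <= on <= values[EX + "effectiveEnd"]

stmt = reify("s1", EX + "AcmeCorp", EX + "clientOf", EX + "SalesDivision",
             date(2019, 1, 1), date(2019, 12, 31))
print(is_effective(stmt, date(2019, 6, 1)))
print(is_effective(stmt, date(2020, 1, 1)))
```

The relationship itself becomes a node in the graph, which is what allows start and end dates to be stated about it.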

Q10: There have been different senses of the term used, e.g., interchangeably with ‘ontology’. How would you describe a knowledge graph as distinct from an ontology? For example, when developing an ontology in TopBraid Composer or in Protege, one can have class-level terms, but also instance-level terms. The so-called instance graph can be found in an ontology built in Protege by asserting instances and relations between the instances or between instances and classes (types). So how would you say (if at all) that an ontology product is different from a knowledge graph?

TopBraid EDG lets you work with ontologies (classes and properties) and data based on these ontologies. One of the key advantages of using RDF and associated languages for knowledge graphs is that models and rules are just as much a part of the knowledge graph as the data facts. They are not maintained separately in some programs. We advocate separating schema sub-graphs from data sub-graphs as a best practice, while keeping them connected in the overall knowledge graph. EDG facilitates, but does not mandate, this separation.

If you are, on the other hand, asking how EDG is different from Protégé, there are many differences. EDG is a highly scalable, enterprise-grade server product with role-based access control and workflows. EDG packages a scalable RDF store and stores data in RDF, while Protégé does not and does not fully support RDF. EDG has a lot of features not present in Protégé, such as support for SHACL, GraphQL and data ingestion from many sources, to name just a few. Further, EDG focuses on governance of information and offers functionality targeted to governance use cases, e.g., data lineage.

Q11: Can EDG be offered as a Cloud Service?

Yes, some of our customers run EDG on AWS, Azure or a private cloud.

Q12: How do taxonomies relate to or fit into knowledge graphs? How do ontologies relate to or fit into knowledge graphs?

EDG treats all information it manages as part of a knowledge graph. We agree that some graphs are different from other graphs. This is why EDG supports the concept of an “asset collection type” or a “graph type”.

Taxonomies and Ontologies are types of graphs in EDG. EDG offers some specialized capabilities for certain graph types. For example:

• For data asset collections (AKA data dictionaries) there are import capabilities for ingesting DDL and for connecting to the source via JDBC to ingest metadata and do profiling. Such imports would not make sense for a taxonomy.
• For taxonomies, on the other hand, there is an import from MultiTes – a tool that is used for managing taxonomies. Such an import would not make sense in the context of data assets.

Ontologies are quite special in EDG because they define the underlying models for data that is part of the knowledge graph. They are used to richly define the meaning of the data facts. An ontology may define that Geopolitical Regions contain Countries. Each country has one or more official languages and it has a single ISO alpha-2 country code. Countries have a ‘capital’ relationship to Cities. And so on.

A taxonomy may contain a hierarchical breakdown of regions, countries, and cities with some cities identified as the capital of each country. It can contain some other facts defined by the model, such as official languages for a country, its ISO code, etc. Both the ontology and the taxonomy would be part of a knowledge graph. Other information can also be part of the knowledge graph, connecting and referring to the information in the ontology or taxonomy. An example is a dataset containing trading agreements between different countries.
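
The split between a model and the facts that conform to it can be sketched in a few lines of Python. This is only an illustration under stated assumptions: COUNTRY_MODEL, its property names and the sample records are hypothetical, and in EDG such constraints would be expressed in an ontology language such as SHACL rather than in a Python dictionary.

```python
# Hypothetical model: property -> (minimum count, maximum count or None for unbounded).
COUNTRY_MODEL = {
    "officialLanguage": (1, None),  # one or more official languages
    "isoAlpha2": (1, 1),            # exactly one ISO alpha-2 country code
}

def violations(record, model):
    """Return a message for every property whose value count breaks the model."""
    problems = []
    for prop, (minimum, maximum) in model.items():
        count = len(record.get(prop, []))
        if count < minimum:
            problems.append(f"{prop}: expected at least {minimum}, found {count}")
        if maximum is not None and count > maximum:
            problems.append(f"{prop}: expected at most {maximum}, found {count}")
    return problems

# Data facts, as they might appear in a taxonomy of countries.
france = {"officialLanguage": ["French"], "isoAlpha2": ["FR"]}
incomplete = {"isoAlpha2": ["XA", "XB"]}  # made-up record with two codes and no language

print(violations(france, COUNTRY_MODEL))
print(violations(incomplete, COUNTRY_MODEL))
```

The model describes what a country must look like, and the records are the data facts checked against it.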

Q13: Irene, can you demo a SPARQL query on the graph? (does it support geosparql?)

Sorry, we were not able to demo SPARQL in this session. You can request a private EDG demo by writing to edg-info@topquadrant.com. You can also request an EDG evaluation account using this form.

EDG supports SPARQL 1.1. It also supports incorporation of property functions and ships with many pre-built property functions. Users can add their own property function implementations. We have not done functions for GeoSPARQL yet, but this is definitely possible.

Q14: Can EDG be hooked up to python or R?

Yes, it can be.

For example, R has a SPARQL package that allows you to directly import results of SPARQL SELECT queries into the statistical environment of R. EDG provides a SPARQL endpoint. Python also has tools for working with SPARQL. R and Python programs could also query GraphQL. EDG can generate any kind of exports, so there are options.
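
As a rough sketch of the Python side, the standard library is enough to send a query using the SPARQL 1.1 Protocol and read the standard SPARQL JSON results format. The endpoint URL below is a placeholder rather than a real EDG address, the query is a generic example, and authentication is not shown.

```python
import json
import urllib.parse
import urllib.request

ENDPOINT = "https://edg.example.org/sparql"  # hypothetical URL, replace with your own

QUERY = """
SELECT ?term ?label WHERE {
  ?term <http://www.w3.org/2000/01/rdf-schema#label> ?label .
} LIMIT 10
"""

def build_request(endpoint, query):
    """Build a SPARQL 1.1 Protocol POST request with a form-encoded query."""
    body = urllib.parse.urlencode({"query": query}).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        headers={
            "Content-Type": "application/x-www-form-urlencoded",
            "Accept": "application/sparql-results+json",
        },
        method="POST",
    )

def rows(results_json):
    """Flatten a SPARQL JSON results document into a list of plain dictionaries."""
    doc = json.loads(results_json)
    names = doc["head"]["vars"]
    return [
        {name: binding[name]["value"] for name in names if name in binding}
        for binding in doc["results"]["bindings"]
    ]

# To run against a real endpoint:
# with urllib.request.urlopen(build_request(ENDPOINT, QUERY)) as response:
#     print(rows(response.read().decode("utf-8")))
```

The resulting list of dictionaries can be passed on to whatever analysis tooling you already use.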

Q15: I see the value of knowledge graphs, but my company is small and doesn’t have the financial ability to build our own knowledge graph, can you suggest some alternative methods/solutions?

It depends on what you want to accomplish – the specific value that you see for your company.

Q16: Does TopBraid EDG have an RDF db store and editor similar to Protege, and also an inferencing engine?

Yes, TopBraid EDG packages a scalable RDF database. Information in it can be edited using any modern browser.
During the webinar, we showed RDF data in tables and in forms. These are editable views, provided that a user has the right permissions. Any value can be edited and new values can be added. While we did not show the creation of classes, properties, etc. in this webinar, this capability is certainly there. If interested, you can take a look at one of our videos. Inferencing capabilities are also provided through rules and integration of machine learning.

Q17: Also what offerings from TopBraid are available as SAAS or PAAS models on the cloud as pay per use options?

TopBraid EDG can be hosted on a public cloud (e.g., AWS) or a private cloud. TopQuadrant offers either perpetual licenses or subscription licenses. The minimum duration of a subscription is 1 year.

Q18: Does the TopBraid Composer have any mechanism to determine that a specific TTL file is a glossary or taxonomy or crosswalk, etc? Or does it treat all the TTL files the same way?

TopBraid Composer does not distinguish between different graphs/files based on the content of a graph. It treats all files containing RDF serialized data the same way.

TopBraid EDG, on the other hand, has this distinction and provides different view/edit applications for different types of graphs (asset collections). It also provides, in some cases, different capabilities depending on the type of collection. For example:
• For data asset collections (AKA data dictionaries) there are import capabilities for ingesting DDL and for connecting to the source via JDBC to ingest metadata and do profiling. Such imports would not make sense for a taxonomy.
• For taxonomies, on the other hand, there is an import from MultiTes – a tool that is used for managing taxonomies. Such an import would not make sense in the context of data assets.

Q19: Is there a good reason to store the data I have in a relational database in a graph database? When does it become critical to have a graph instead of a relational database?

Yes, there are some good reasons to do so. Relational databases impose a very rigid data model that is expensive to evolve. One reason to move to a graph database would be that you need more flexibility than a relational database can provide. Another reason would be having highly networked relational data and needing queries to traverse the relationships extensively. This works much better in a graph database than in a relational database. There are other reasons as well.
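
To make the traversal point concrete, here is a small Python sketch of a “within N hops” question over networked data. The KNOWS data is made up, and an in-memory dictionary stands in for a graph store. In a relational schema, each additional hop would typically mean another self-join on the relationship table, whereas here the depth is just a parameter.

```python
from collections import deque

# Made-up networked data: each node maps to the nodes it is directly related to.
KNOWS = {
    "Ann": ["Bob", "Cid"],
    "Bob": ["Dee"],
    "Cid": ["Dee"],
    "Dee": ["Eve"],
    "Eve": [],
}

def within_hops(graph, start, max_hops):
    """Return every node reachable from start in at most max_hops steps, with its distance."""
    distance = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if distance[node] == max_hops:
            continue  # do not expand beyond the requested depth
        for neighbour in graph.get(node, []):
            if neighbour not in distance:
                distance[neighbour] = distance[node] + 1
                queue.append(neighbour)
    del distance[start]
    return distance

print(within_hops(KNOWS, "Ann", 2))
print(within_hops(KNOWS, "Ann", 3))
```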

Copyright 2020 TopQuadrant, Inc. All Rights Reserved.