Querying the Knowlege Graph

Did you model your knowledge graph yet?

Before querying your knowledge graph, you must make sure its already modeled and mapped. If you haven't modeled your knowledge graph yet, its recommended you review first:

Summary of Timbr schemas

Timbr creates five virtual schemas to represent the ontology in a relational way. For more information about Timbr schemas please follow the links below:

timbr schema: Intrinsic definition - explicit ontology/relational mapping.
etimbr schema: Exhaustive definition - Timbr schema and implicit derived properties.
dtimbr schema: Dereferenced definition - etimbr schema with graph traversals.
vtimbr schema: A schema that expose views saved in a given ontology.
gtimbr schema: A schema to run Graph Algorithms on the ontology concepts.

The gtimbr schema is additional and is not included in the basic Timbr license.

Query Timbr schema

Used when no graph traversals or derived properties are required in the results. Direct and inherited properties of a concept are shown as table columns.

Who are all the soccer managers who were also soccer players?

The concept (represented as a virtual table) we are querying is the concept: timbr.person. We used the Timbr schema since we don’t do graph traversal or ask for properties defined by the concepts derived from person (like the concepts soccerplayer and soccermanager). The following properties (represented as columns) were used for this query:

entity_id – this is instance id (in our case the dbpedia id) of the person we are querying.
entity_label – represents the entity label (rdfs:label) of a person.
birthdate – represents the birthdate of a person.
entity_type – represents the entity type of a person (rdf:type) and we want to filter all the persons with type of soccermananger and soccerplayer.

Lastly we want to order by the birthdate in an ascending order.

The SQL query is as follows:

SELECT entity_id AS `Datasource Link`, entity_label AS `Full_Name`, birthdate
FROM `timbr`.`person`
WHERE entity_type = 'soccermanager,soccerplayer'
ORDER BY birthdate

The results are:

Querying the knowledge graph

Which universities have the highest number of faculty members?

The concept (represented as a virtual table) we are querying is the concept: timbr.university. We used the schema of Timbr since we don’t do graph traversal or ask for properties defined by the concepts derived from university concept. The following properties (represented as columns) were used for this query:

entity_label – represents the entity label (rdfs:label) of the university.
facultysize – represents the faculty size in a university.
numberofstudents – represents the number of students in a university.
numberofundergraduatestudents – represents the number of undergraduate students in a university.

Lastly we want to order by the facultysize in a descending order.

The SQL query is as follows:

SELECT entity_label AS `University`,
    facultysize AS `Faculty Size`,
    numberofstudents AS `Number of Students`,
    numberofundergraduatestudents AS `Number of Undergraduate Students`
FROM `timbr`.`university`
ORDER BY facultysize DESC

The results are:

Querying the knowledge graph

What types of persons and how many of each exist in the DBpedia ontology?

We use Apache Spark as our back-end and we can enjoy the native Apache Spark SQL functions like format_number(), The concept we are querying is the concept: timbr.person. We used the schema of Timbr since we don’t do graph traversal or ask for properties defined by the concepts derived from person concept. The following properties (represented as columns) were used for this query:

entity_type – represents the entity type of a person (rdf:type).
This column represents the type of each instance (row)
We order by the COUNT(1) (number of people) in descending order.

The SQL query is as follows:

SELECT format_number(COUNT(1), 0) AS `Number of People`,
    entity_type AS `Belong to Category (Concepts)`
FROM `timbr`.`person`
GROUP BY entity_type
ORDER BY COUNT(1) DESC

The results are:

Querying the knowledge graph

Query etimbr schema

Used when properties from derived concepts are needed. The schema is used when querying a concept requiring properties that some of its derived concepts.

What are the things having a slogan which contain the word 'Best'?

The concept we are querying is the concept: etimbr.thing. Every concept inherits from the concept thing. Therefore when we query the concept using the etimbr schema, we subsequently query all the concepts and properties in the ontology. In this case, we query any concept containing the property slogan We used the schema of etimbr because we query for properties defined by the concepts derived from thing (concepts such as broadcastnetwork, media, company, and more). The following properties (represented as columns) were used for this query:

entity_label – represents the entity label (rdfs:label) of the thing we are querying.
entity_type – represents the entity type of the thing we are querying (rdf:type).
slogan – since the concept thing in the schema etimbr contains all the properties of all the concepts, we can filter on the slogan property to query for all concepts containing the property of slogan and filter them by the ones containing the word ‘Best’.

The SQL query is as follows:

SELECT entity_label AS `Label`,
    entity_type AS `Category`,
    slogan AS `Slogan`
FROM `etimbr`.`thing`
WHERE `slogan` LIKE '%Best%'

The results are:

Querying the knowledge graph

What types of works contain the word 'Batman' (including the release date, ISBN, or IMDB-ID if available)?

The concept we are querying is the concept: etimbr.work When we query the concept work using the etimbr schema, we subsequently query all the concepts and properties which derived from work. In this case, we query the derived concepts from the concept work containing the properties:

releasedate – any derived concept with a property releasedate
imdbid – a property indicating the id in the IMDB website. This property usually applies to tv series and movies
isbn – which is a global identifier for books, novels, and other written works

The following properties (represented as columns) were used for this query:

entity_id – this is instance id (in our case the dbpedia id) of the work we are querying.
entity_label – represents the entity label (rdfs:label) of the work.
entity_type – represents the entity type (rdf:type) of the work.
releasedate – If a derived concept from work has a property of releasedate it will be shown, otherwise it will be null.
imdbid – If a derived concept from work has a property of imdbid it will be shown, otherwise it will be null.
isbn – If a derived concept from work has a property of isbn it will be shown, otherwise it will be null.

The SQL query is as follows:

SELECT entity_id AS `Datasource Link`,
    entity_label AS `Title`,
    entity_type AS `Category`,
    releasedate AS `Release Date`,
    imdbid AS `IMDB-ID`,
    isbn AS `ISBN`
FROM `etimbr`.`work`
WHERE entity_label LIKE '%Batman%'

The results are:

Querying the knowledge graph

What are all the things (any concept with a date) that happened on January 1st, 2000?

The concept we are querying is the concept: etimbr.thing. Every concept inherits from the concept thing. Therefore when we query the concept using the etimbr schema, we subsequently query all the concepts and properties in the ontology. In this case, we query any concept containing the property date. Note – as date is the property we want to query and also a reserved SQL word, we need to quote it with the ` sign (in our case ` is used for Apache Spark but it depends on the underlying engine) as `date`. as a best practice, its always good to encompass properties with the ` sign We used the schema of etimbr because we want any concept with the date property. The following properties were used for this query:

resource_id – this is the primary key we use in dbpedia (also represented as entity_id in the ontology) and represents the instance unique id (in our case the dbpedia id).
entity_label – represents the entity label (rdfs:label) of the thing we are querying.
entity_type – represents the entity type (rdf:type) of the thing we are querying.
date – any derived concept from thing that has a property (represented as a column) of date will return. Then we filter the date by requesting only the dates happening on the 1st of January 2000 in (ISO 8601 calendar date format, i.e. – “YYYY-MM-DD”).

The SQL query is as follows:

SELECT resource_id AS `Datasource Link`,
    entity_type AS `Category`,
    entity_label AS `Label`
FROM `etimbr`.`thing`
WHERE `date` = '2000-01-01'

The results are:

Querying the knowledge graph

Query dtimbr schema

Used when we want to perform graph traversals between concepts (JOIN the data between concepts) The schema is for queries involving other concepts which are connected through relationships.

What entities are owners of banks in Europe?

The concept (represented as a virtual table) we are querying is the concept: dtimbr.bank We used the schema of dtimbr since we want to use the relationship between the concept of a bank to a concept of a country and subsequently to a concept of a continent in our query. The following properties (represented as columns) were used for this query:

entity_label – represents the entity label (rdfs:label) of the bank we are querying.
`owner[agent].entity_label` – represents the entity label (rdfs:label) of the owner of the bank we are querying. owner is a reference (ObjectProperty) to agent concept which may be a person, a company or an organisation.
`country[country].entity_label` – represents the entity label (rdfs:label) of the country of the bank we are querying. country is a reference (ObjectProperty) to country concept. in dtimbr we can use the country reference as a path in the graph.
`country[country].continent[continent].entity_label` – represents the entity label (rdfs:label) of the continent of the bank's country we are querying. continent is a path that jumps two hops. first from the current concept bank to the concept of country, then from the concept of country to the continent concept, and we request the entity_label (rdfs:label) of that continent concept.

We filter the results by which the continent of the company must be Europe. Finally, we order our results by owner in ascending order.

The SQL query is as follows:

select entity_label as `Bank`, 
`owner[agent].entity_label` as `Owner`, 
`country[country].entity_label` as `Bank_Country`,
`country[country].continent[continent].entity_label` as `Bank_Continent`
from dtimbr.bank
where `country[country].continent[continent].entity_label` = 'Europe'
order by `Owner`

The results are:

Querying the knowledge graph

What companies in Europe or Asia have the highest operating income per employee?

We answered this query using a back-end of Apache Spark and therefore we enjoy native Apache Spark SQL functions like format_number(). The concept (virtual table) we are querying is the concept: dtimbr.company We used the schema of dtimbr since we want to use the relationship between the concept of a company to a concept of a country and subsequently to a concept of a continent in our query. The following properties (represented as columns) were used for this query:

entity_label – represents the entity label (rdfs:label) of the company we are querying.
operatingincome – represents the operatingincome of a company.
numberofemployees – represents the numberofemployees of a company.
`country[country].entity_label` – represents the entity label (rdfs:label) of the country of the company we are querying.
`country[country].continent[continent].entity_label` – represents the entity label (rdfs:label) of the continent of the company's country we are querying.

Note – we also select for a column which divides the operating income by the number of employees (operatingincome/numberofemployees) to answer our query

We filter the results by the following parameters:

the operatingincome of the company must be over 150,000,000 USD
the continent of the company must be Europe or Asia

Finally, we order our results by operatingincome in descending order.

The SQL query is as follows:

SELECT entity_label AS `Company Name`,
    format_number(operatingincome, 0) AS `Operating Income`,
    format_number(numberofemployees, 0) AS `Number of Employees`,
    format_number(operatingincome/numberofemployees, 0) AS `Operating Income per Employee`,
    `country[country].entity_label` AS `Country`,
    `country[country].continent[continent].entity_label` AS `Continent`
FROM `dtimbr`.`company`
WHERE operatingIncome > 150000000

The results are:

Querying the knowledge graph

What are all the companies that their parent company is owned by an organisation containing the word 'Warner'?

The concept we are querying is: dtimbr.organisation Company concept is derived from Organisation concept. Organisation concept is derived from Agent concept. Person concept is also derived from Agent concept. We used the schema of dtimbr since we want to use the relationship between the concept of an organisation to a concept of another organisation and subsequently to another concept of an agent in our query. The following properties (represented as columns) were used for this query:

entity_label – represents the entity label (rdfs:label) of the organisation we are querying.
`owningcompany[company].entity_label` – An inherited property of the concept organisation (which is inherited by the concept agent). represents the entity label (rdfs:label) of an agent concept which may be a person, a company, an organisation, or any other concept deriving from agent.
`owningcompany[company].owner[agent].entity_label` – An inherited property of the concept organisation (which is inherited by the concept agent). represents the entity label (rdfs:label) of the owner of the owningcompany. This owner of an owningcompany is an agent which may be a person, a company, an organisation, or any other concept deriving from agent.
entity_type – represents the entity type (rdf:type) of the organisation we are querying.
`country[country].entity_label` – represented a path in the graph to the country concept. we select the entity label (rdfs:label) of the country of the organisation we are querying.

We filter the results by the following parameters:

the continent of the company must be Europe

Finally, we order our results by owner in ascending order.

The SQL query is as follows:

select entity_label as `Company Name`,
`owningcompany[company].entity_label` as `Parent Company`,
`owningcompany[company].owner[agent].entity_label` as `Owner of Parent Company`,
entity_type as `Company Category`,
`country[country].entity_label` as `Company Country`
from dtimbr.organisation
where `owningcompany[company].owner[agent].entity_label` like '%Warner%'

The results are:

Querying the knowledge graph

Who are the many callers and the recipinets of those calls?

We answered this query using the concept (virtual table): dtimbr.person. We used the schema of dtimbr since we want to use the Many-to-Many relationship between the many people who made calls, to the many people who received those calls. The following properties (represented as columns) were used for this query:

caller[person].person_id – represents the unique id of people who made calls to others.
caller[person].first_name – represents the name of people who made calls.
callee[person].person_id – represents the unique id of people who received calls from others.
callee[person].first_name – represents the name of people who received calls.

The SQL query is as follows:

SELECT `caller[person].person_id` as  `caller_id`,
`caller[person].first_name` as `caller_name`,
`callee[person].person_id` as `callee_id`,
`callee[person].first_name` as `callee_name`
FROM dtimbr.person

The results are:

Querying the knowledge graph

Who are the subsidiary companies of Intel as well as thier transitivty levels?

We answered this query using the concept (virtual table): dtimbr.company. We used the schema of dtimbr since we want to use the transitive relationship between the parnet company Intel and its subsidiaries. The Transitive relationship is used in the square parentheses by choosing the concept we want and conecting it to a * followed by the number of levels of transitivity we would like to see. In this query it will show not only who is part of the Intel hierarchy but also at what level in the hierarchy each company is, showing us up to four levels of transitivity as we declared in the query. The following properties (represented as columns) were used for this query:

`organization_id` – represents the unique id given to the main owner company Intel.
`organization_name` – represents the name of the owner company Intel.
`purchased[company*4].organization_id` – represents the unique id of up to 4 levels of subsidiary companies of Intel.
`purchased[company*4].organization_name` – represents the name of up to 4 levels of subsidiary companies of Intel.
`purchased[company*4]_transitivity_level` – represents the transitivity_level (hierarchy level) to up to 4 levels of subsidiary companies of Intel.

We filter the results by the following parameters:

the owning organization name must be Intel.

The SQL query is as follows:

SELECT `organization_id` as owner_id,
`organization_name` as owner_name,
`purchased[company*4].organization_id` as subsidiary_id,
`purchased[company*4].organization_name` as subsidiary_name,
`purchased[company*4]_transitivity_level` as company_transitivity_level
FROM dtimbr.company
WHERE `organization_name` = 'Intel'

The results are:

Querying the knowledge graph

Querying a relationship property

In order to query a relationship property in the SQL editor or from any other endpoint, users must use the dtimbr schema which exposes relationships. Once loading a concept's metadata, the relationship properties will appear with "_" (underscore) before the property name indicating that this is a property of the many-to-many relationship and not the property of the target concept which is identified with a dot before the property name.

The syntax of a relationship property appears as follows:

<relationship>[<target_concept>]_<relationship property>

Querying multiple Knowledge Grpahs

Querying multiple Knowledge Graphs allows users to combine data from different data models (Ontologies) in a single query. This feature is especially useful in complex environments where data is distributed across various domains or datasets, enabling a holistic view and interaction between disparate data sources.

Querying Template

Required information for Querying multiple Knowledge Grpahs (the curly brackets {} should not be an input, they are used only as a variable substitution):

{ontology_name} - The ontology model of the knowldge graph you'd like to query
{timbr_schema} - The virtual Timbr schema you'd like to query
{concept_name/view_name} - The name of the concept or view you'd like to query

SELECT ... FROM {ontology_name}.{timbr_schema}.{concept_name/view_name}

Supported Timbr schemas: timbr, dtimbr, vtimbr.

Example using the timbr schema

The following example demonstrates how to query concepts from different ontologies using the timbr schema.

SELECT name FROM calls.timbr.person
UNION ALL
SELECT customer_name FROM supply_chain.timbr.customer

Example using the dtimbr schema

This example shows how to perform a query involving the dtimbr schema, allowing for graph traversals.

SELECT name FROM calls.dtimbr.person
WHERE `has_device[device].device_id` IN (
  SELECT `made_order[order].includes_product[product].product_id` FROM supply_chain.dtimbr.customer
)

Requirements

Single datasource, multiple knowledge graphs:

This setup works if the Knowledge Graphs share the same datasource (e.g., MySQL). If different datasources are queried simultaneously, one of the sources (which is not being used in the query) will return null values (not an error), indicating no data available from that particular source.

Multiple datasources, multiple knowledge graphs:

When using a virtualization datasource (such as Spark or Databricks), it's possible to query multiple Knowledge Graphs and datasources, and recieve consolidated results, provided the user has the necessary access permissions to the different datasources and knowldge graph models.

ontology view

It’s important to mention that when querying multiple ontologies, users can create an ontology view that queries multiple ontologies.

Querying the Knowlege Graph

Summary of Timbr schemas​

Query Timbr schema​

Query etimbr schema​

Query dtimbr schema​

Querying multiple Knowledge Grpahs​

Summary of Timbr schemas

Query Timbr schema

Query etimbr schema

Query dtimbr schema

Querying multiple Knowledge Grpahs