Overview

The Timbr platform, combined with Databricks, offers a powerful solution for creating and utilizing scalable semantic graph data models in big data environments to efficiently integrate, manage, and share data across teams and business functions.

Timbr extends the Lakehouse architecture by adding semantics, relationships, and explicit business context to the data.

As data complexity grows, current data architectures are no longer enough to manage the ocean of data that organizations store. There is a clear need for a system that keeps data in order, maintains a clear business context for each dataset, and simplifies discovery so that engineers and analysts don't get lost in the sea of data.

The Semantic Lakehouse Architecture combines core capabilities from Databricks, data lakes, and semantic graphs. Together, these capabilities democratize access to data by giving new and experienced data practitioners a simple, explicit way to work with it while preserving all of the business context related to the data.

Semantic Lakehouse Architecture

Timbr lets users create semantic data models (ontologies) on top of Databricks. You can access the semantic data model directly in Timbr and also in Databricks.

The native integration exposes the semantic model to Databricks users directly in Unity Catalog / Hive metastore.

You can query Timbr business concepts directly from your Databricks notebooks (using SQL, Python, R, or Scala).

The following Semantic model in Timbr can be viewed and accessed directly in Databricks.

Timbr Semantic Model

Timbr Supply chain ontology

Timbr Customer 360 query

SELECT `customer_name`, `customer_segment`, 
`has_ordered[order].order_date` AS `order_date`,
`has_ordered[order].includes_product[product].product_name` AS `product_name`,
`has_ordered[order].includes_product[product].contains[material].supplier_name` AS `supplier_name`,
`has_ordered[order].includes_product[product].has_inventory[inventory].inventory_name` AS `inventory_name`,
`has_ordered[order].in_shipment[shipment].shipping_mode` AS `shipping_mode`,
`has_ordered[order].in_shipment[shipment].delivery_status` AS `delivery_status`
FROM `dtimbr`.`customer`

The explained Customer 360 query is the actual SQL that was sent to Databricks for execution:

Timbr Customer 360 query explained

The query goes from 8 lines of SQL in Timbr (with no JOINs or UNIONs) to 107 lines of SQL in Databricks (with 6 JOINs and 2 UNION statements). This shows how much simpler queries become once you've created a semantic model on top of Databricks.
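
To make the column notation in the Timbr query above easier to read, the following Python sketch splits a dotted path such as `has_ordered[order].in_shipment[shipment].shipping_mode` into its relationship hops and its final property. It is only an illustration of how the expressions read, not Timbr's parser: `Hop` and `parse_path` are hypothetical names introduced here, and the sketch assumes every hop has the form `relationship[concept]` with plain identifiers.

```python
import re
from typing import List, NamedTuple, Tuple


class Hop(NamedTuple):
    """One relationship traversal, e.g. has_ordered[order]."""
    relationship: str
    concept: str


# Assumption: a hop is written relationship[concept], both plain identifiers.
_HOP_PATTERN = re.compile(r"^(\w+)\[(\w+)\]$")


def parse_path(path: str) -> Tuple[List[Hop], str]:
    """Split a dotted path into its relationship hops and final property name."""
    # Drop surrounding backticks, then treat the last segment as the property.
    *hop_parts, prop = path.strip("`").split(".")
    hops = []
    for part in hop_parts:
        match = _HOP_PATTERN.match(part)
        if match is None:
            raise ValueError(f"not a relationship[concept] hop: {part!r}")
        hops.append(Hop(match.group(1), match.group(2)))
    return hops, prop


hops, prop = parse_path(
    "`has_ordered[order].includes_product[product].contains[material].supplier_name`"
)
for hop in hops:
    print(f"{hop.relationship} -> {hop.concept}")
print("property:", prop)
```

Each hop names a relationship that the semantic layer has to resolve when it generates the underlying SQL, which is where the JOINs in the explained query come from. The sketch does not attempt to reproduce that generated SQL.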

Databricks Catalog Explorer with Timbr semantic schemas

The semantic model is also available directly in Databricks Unity Catalog / Hive metastore after you complete the installation guide:

Timbr in Unity Catalog

Databricks SQL using Timbr SQL statements

Databricks Customer 360 query SQL on Timbr concepts

Customer 360 query

Databricks Python query on Timbr concepts

Shipment query using Python
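
A minimal sketch of what a shipment query in a Databricks Python notebook cell might look like. It assumes the Timbr schema is visible to Databricks under the name `dtimbr` (as in the Timbr query earlier on this page) and that the `shipment` concept exposes `shipping_mode` and `delivery_status` directly; the actual catalog and schema names depend on your installation. `build_shipment_query` and `run_query` are hypothetical helpers, and `spark` is the SparkSession that Databricks notebooks provide.

```python
def build_shipment_query(schema: str = "dtimbr", limit: int = 10) -> str:
    """Build a SQL statement that reads shipment properties from a Timbr schema."""
    if limit < 1:
        raise ValueError("limit must be a positive integer")
    return (
        "SELECT `shipping_mode`, `delivery_status` "
        f"FROM `{schema}`.`shipment` "
        f"LIMIT {int(limit)}"
    )


def run_query(spark, sql_text: str):
    """Run the statement on a SparkSession and return the resulting DataFrame."""
    return spark.sql(sql_text)


# In a Databricks notebook, `spark` is already defined, so a cell could run:
# df = run_query(spark, build_shipment_query())
# df.show()
print(build_shipment_query())
```

Keeping the SQL text in a separate function makes it easy to inspect the statement before it is sent to the cluster.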