Overview
The Timbr platform, combined with Databricks, lets you create and use scalable semantic graph data models in big data environments, so that data can be integrated, managed, and shared efficiently across teams and business functions.
Timbr extends the Lakehouse architecture by adding semantics, relationships, and explicit business context to the data.
As data complexity grows, current data architectures are no longer enough to manage the ocean of data that organizations store. There is a clear gap for a system that keeps data in order, maintains a clear business context for each dataset, and simplifies discovery so that engineers and analysts don't get lost in the big blue sea of data.
The Semantic Lakehouse Architecture combines core capabilities from Databricks, data lakes, and semantic graphs. These capabilities democratize access to data in a simple and explicit way for new and experienced data practitioners alike, while preserving all the business context related to the data.
Timbr allows users to create semantic data models (ontologies) on top of Databricks. You can access the semantic data model directly in Timbr and also in Databricks.
The native integration is seamless for Databricks users and exposes the semantic model directly in Unity Catalog / Hive metastore.
You can query the Timbr semantic model directly from your Databricks notebooks (using SQL, Python, R, or Scala).
The following semantic model in Timbr can be viewed and accessed directly in Databricks.
Timbr Semantic Model
Timbr Customer 360 query
SELECT `customer_name`, `customer_segment`,
`has_ordered[order].order_date` AS `order_date`,
`has_ordered[order].includes_product[product].product_name` AS `product_name`,
`has_ordered[order].includes_product[product].contains[material].supplier_name` AS `supplier_name`,
`has_ordered[order].includes_product[product].has_inventory[inventory].inventory_name` AS `inventory_name`,
`has_ordered[order].in_shipment[shipment].shipping_mode` AS `shipping_mode`,
`has_ordered[order].in_shipment[shipment].delivery_status` AS `delivery_status`
FROM `dtimbr`.`customer`
The explained Customer 360 query (the actual query sent to Databricks for execution):
The query goes from 8 SQL lines in Timbr (with no JOINs or UNIONs) to 107 SQL lines in Databricks (with 6 JOINs and 2 UNION statements). This shows how much simpler querying becomes once you've created your semantic model on top of Databricks.
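To make the relationship-path syntax of the Customer 360 query more concrete, here is a minimal Python sketch that assembles a shorter version of that query from (relationship, concept) hops. The helper names `path_column` and `build_query` are hypothetical and are not part of Timbr or Databricks; only the `dtimbr` schema, the concept and relationship names, and the `relationship[concept].property` column form are taken from the query above. Running the resulting string through `spark.sql` assumes a Databricks notebook with the Timbr integration already installed.

```python
def path_column(hops, prop, alias=None):
    """Build one SELECT item from a list of (relationship, concept) hops.

    Hypothetical helper: with no hops it returns a plain quoted property,
    otherwise a traversal column such as
    `has_ordered[order].order_date` AS `order_date`.
    """
    if not hops:
        return f"`{prop}`"
    path = ".".join(f"{rel}[{concept}]" for rel, concept in hops)
    return f"`{path}.{prop}` AS `{alias or prop}`"


def build_query(concept, columns, schema="dtimbr"):
    """Assemble a SELECT over one concept; no JOINs are written by hand."""
    select_list = ",\n       ".join(columns)
    return f"SELECT {select_list}\nFROM `{schema}`.`{concept}`"


# Relationship paths copied from the Customer 360 query.
ORDER = [("has_ordered", "order")]
PRODUCT = ORDER + [("includes_product", "product")]

query = build_query("customer", [
    path_column([], "customer_name"),
    path_column([], "customer_segment"),
    path_column(ORDER, "order_date"),
    path_column(PRODUCT, "product_name"),
])
print(query)

# Assumption: inside a Databricks notebook a `spark` session is predefined,
# so the string could be executed with:
#   df = spark.sql(query)
```

Each additional hop in the list only lengthens the column path; the relationships defined in the semantic model stand in for the JOINs that would otherwise be written by hand.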
Databricks Catalog Explorer with Timbr semantic schemas
The semantic model is also available directly in Databricks Unity Catalog / Hive metastore (once you have followed the installation guide):