
Install

Timbr integrates with Databricks in two ways:

  1. You can configure Databricks in Timbr as a datasource (with optional virtualization).
  2. You can install Timbr inside Unity Catalog / Hive metastore to model and query your ontology.

Create Databricks datasource in Timbr

  • Go to your Databricks cluster configuration.

Under Advanced Options -> select JDBC/ODBC -> copy the JDBC URL:

jdbc:databricks://<hostname>:443/default;transportMode=http;ssl=1;httpPath=<http_path>;AuthMech=3;UID=token;
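For illustration, a filled-in URL might look like the following (the hostname, org ID, and cluster ID below are hypothetical placeholders):

jdbc:databricks://adb-1234567890123456.7.azuredatabricks.net:443/default;transportMode=http;ssl=1;httpPath=sql/protocolv1/o/1234567890123456/0123-456789-abcdef12;AuthMech=3;UID=token;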

In your Timbr environment, under Manage -> Datasources, click on "Add new Datasource"


Fill in the following parameters:

  1. Datasource name
  2. Hostname: your Databricks hostname
  3. Port (default 443)
  4. Username: token
  5. Password: your Databricks personal access token
  6. Database name: default (optional)
  7. Additional Parameters: transportMode=http;ssl=1;httpPath=<http_path>;AuthMech=3;UID=token
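
For illustration, a filled-in configuration might look like this (all values below are hypothetical placeholders):

Datasource name: databricks_prod
Hostname: adb-1234567890123456.7.azuredatabricks.net
Port: 443
Username: token
Password: dapi1234567890abcdef
Database name: default
Additional Parameters: transportMode=http;ssl=1;httpPath=sql/protocolv1/o/1234567890123456/0123-456789-abcdef12;AuthMech=3;UID=token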

Alternatively, use the JDBC URL option in the "Add Datasource" window.

Copy and paste the Databricks JDBC URL:


Two additional optional settings:

  1. Description - describes the purpose/usage of the datasource/cluster.
  2. Active Virtualization - when this option is enabled, Timbr uses the Databricks cluster to virtualize and join all other datasources defined in the ontology.

Install Timbr in your Databricks cluster

To install Timbr in your Databricks cluster, you need to set up the Timbr "init script" in your cluster.

You can find more information about Databricks init scripts at: https://docs.databricks.com/en/init-scripts

When installing Timbr using the init script, you can use either a cluster-scoped or a global init script. This means you can install Timbr on a specific cluster (cluster-scoped) or, by default, on every cluster created in your workspace (global).

Timbr init script:

The Timbr init script installs the Timbr jar into your Databricks cluster. This enables Timbr to expose the ontology as part of the Unity Catalog / Hive metastore, and it lets users run Timbr ontology DDL statements to create and manage the ontology.

The Timbr init script can be stored directly in your Databricks Workspace/DBFS/Volumes or on S3/ABFSS. You can create and set it up yourself, or use the one provided by Timbr on S3/ABFSS (contact Timbr support to get access).

The Timbr init script:

#!/bin/bash
# Create a staging directory on DBFS, download the Timbr Spark parser jar,
# and copy it into the cluster's jar directory so Spark loads it on startup.
mkdir -p /dbfs/FileStore/timbr/
curl -o /dbfs/FileStore/timbr/timbrsparkparser.jar "https://<timbr_url>/timbrsparkparser.jar"
cp /dbfs/FileStore/timbr/timbrsparkparser.jar /databricks/jars

The init script's job is to copy the Timbr jar into the Databricks jars directory (/databricks/jars).

To set it up yourself, go to your Databricks cluster configuration, open Advanced Options, and select "Init Scripts".

ABFSS Example

DBFS Example

To set up the init script globally, follow the Databricks documentation: https://docs.databricks.com/en/init-scripts/global.html#add-a-global-init-script-using-the-ui
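
Alternatively, a global init script can be registered programmatically through the Databricks Global Init Scripts REST API. The following is a minimal Python sketch, assuming the init script above was saved locally as timbr-init.sh; the workspace URL and token values are hypothetical placeholders:

import base64
import requests

# Hypothetical values: replace with your workspace URL and an admin personal access token.
WORKSPACE_URL = "https://adb-1234567890123456.7.azuredatabricks.net"
TOKEN = "dapi1234567890abcdef"

# Read the Timbr init script shown above (saved locally as timbr-init.sh).
with open("timbr-init.sh", "rb") as f:
    script_bytes = f.read()

# The API expects the script body to be base64-encoded.
resp = requests.post(
    f"{WORKSPACE_URL}/api/2.0/global-init-scripts",
    headers={"Authorization": f"Bearer {TOKEN}"},
    json={
        "name": "timbr",
        "script": base64.b64encode(script_bytes).decode(),
        "enabled": True,
        "position": 0,
    },
)
resp.raise_for_status()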

Configure Timbr in Databricks

Once you've set up the init script, you can configure Timbr using the following Spark configuration (under Advanced options -> Spark config):

Config | Options | Description
spark.sql.extensions | timbr.spark.TimbrSparkSession | Mandatory to enable Timbr in your Databricks cluster
spark.timbr.parse | true/false (default: true) | Enable/disable the Timbr SQL parser
spark.timbr.url | <timbr_hostname>:<timbr_port> | The hostname and port of your Timbr environment (port 443 for SSL, 80/11000 without SSL)
spark.timbr.ssl | true/false (default: false) | Connect to Timbr using SSL
spark.timbr.ontology | <ontology_name> | The ontology you want to connect to
spark.timbr.user | token | The username to access the Timbr platform
spark.timbr.password | tk_12345678 | The Timbr token of the user
spark.timbr.sso | true/false (default: false) | Run queries in Timbr on behalf of the user authenticated in Databricks (the token used as the password should be set with special auth permissions in Timbr)
spark.sql.catalog.spark_catalog | timbr.spark.TimbrCatalog | Optional configuration to enable Timbr under the Hive metastore
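
For example, a complete Spark config might look like this (the hostname and ontology name are hypothetical placeholders, and the token is the example value from the table above):

spark.sql.extensions timbr.spark.TimbrSparkSession
spark.timbr.parse true
spark.timbr.url timbr.example.com:443
spark.timbr.ssl true
spark.timbr.ontology my_ontology
spark.timbr.user token
spark.timbr.password tk_12345678
spark.sql.catalog.spark_catalog timbr.spark.TimbrCatalog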

Once you configure Timbr, you can start querying your ontology directly from Databricks notebooks and explore the metadata in the Databricks Catalog explorer.
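
As a minimal sketch, assuming your ontology contains a concept named customer (hypothetical; substitute a concept from your own ontology), a notebook cell might look like:

# Query the Timbr ontology through Spark SQL; `spark` is the SparkSession
# that Databricks notebooks provide by default.
df = spark.sql("SELECT * FROM customer LIMIT 10")  # `customer` is a hypothetical concept
df.show()

# Concepts are exposed through the metastore, so standard metadata
# commands list them alongside regular tables:
spark.sql("SHOW TABLES").show()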