FAQ
What is Timbr?
Timbr enables integration of data sources by connecting them to a semantic data model that you create, powered by relationships. The modeled data can be accessed with JOIN-free SQL, Spark, Python, Scala, Java, and R, or consumed by business intelligence tools via ODBC/JDBC, by data science tools, and by web applications via REST. Modeled data can be explored graphically as a network and can also be analyzed with a library of graph algorithms run through simple SQL queries.
Timbr makes it fast and easy to connect any datastore, and securely share what you’ve built with other users – all in one place.
Who can use Timbr?
Anyone with knowledge of SQL can use Timbr. It is designed for data analysts, engineers, and scientists who want to work with knowledge graphs, but any data consumer in the organization benefits from the power of the knowledge graph.
How does Timbr work?
Timbr works by allowing users to model their data as a knowledge graph, and then query that graph using SQL-like syntax. Timbr supports various graph data models and exposes them as ontologies.
What are the requirements to deploy Timbr?
- Computing power, depending on the application size:
  - Small-Medium: 4 CPU, 16 GB RAM server.
  - Large: 8 CPU, 32 GB RAM server.
- Deployment options (Docker or Kubernetes): a Linux image, or automatic deployment via docker-compose, can be installed on any Linux server (you can extend our YAML to add your security protocols, configurations, and customizations).
- Platform requirements:
  - A list of databases with type and version information, to validate connectors.
  - Timbr MySQL metadata DB (mandatory – configurable as a container or managed externally).
Can I use Timbr to integrate data from different sources?
Yes, Timbr can integrate data from various sources, including databases, CSV files, and Web APIs. Users can model the integrated data as a knowledge graph and query it using SQL.
What interfaces are available to connect with Timbr?
Timbr connects to all popular data lakes, databases, BI tools, data science tools and notebooks, as well as various applications (APIs).
Once connected, the data can be queried in SQL, Python/R, dataframes, and natively in Apache Spark (SQL, Python, R, Java, Scala). GraphQL can be supported by integrating external open source projects that support the translation of GraphQL to SQL.
Can I use Timbr to visualize my knowledge graph data?
Yes, Timbr provides visualization tools that enable users to visualize their knowledge graph data and explore its structure. Timbr can also connect to all the popular BI tools, or be used via a REST API to integrate with other programming languages.
Can I share my Timbr projects with others?
Yes, users can share their Timbr projects with others, either by granting them access to their Timbr account or by exporting their data or embedding their graphs.
Does Timbr support machine learning?
Yes, Timbr integrates with popular machine learning libraries like TensorFlow and PyTorch through SQLAlchemy, enabling users to enrich their machine learning projects with knowledge graphs.
Is Timbr scalable?
Yes, Timbr is designed to be scalable and can handle large amounts of data. Timbr provides distributed query processing and supports parallel execution of queries.
Is Timbr secure?
Yes, Timbr is designed with security in mind and provides features like encryption, access control, and auditing to ensure the security of users' data.
Does Timbr work with graph algorithms?
Yes. Timbr’s default implementation for graph algorithms is NetworkX, and it runs automatically: when a user writes an SQL query, Timbr runs the algorithm behind the scenes. Timbr also supports NVIDIA’s cuGraph (GPU), enabling graph algorithms with advanced performance.
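To give a feel for the kind of computation Timbr delegates to NetworkX behind an SQL query, here is a minimal pure-Python sketch of one such algorithm (shortest path by breadth-first search). The edge data and names are illustrative; this is not Timbr’s actual SQL-to-algorithm interface:

```python
from collections import deque

def shortest_path(edges, source, target):
    """Breadth-first search over an unweighted, directed edge list --
    the kind of graph algorithm Timbr would delegate to NetworkX."""
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, []).append(b)
    queue = deque([[source]])
    seen = {source}
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            return path
        for nxt in adjacency.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None

edges = [("alice", "acme"), ("acme", "paris"), ("alice", "bob")]
print(shortest_path(edges, "alice", "paris"))  # ['alice', 'acme', 'paris']
```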
Can Timbr export ontologies as ERDs?
Yes, any tool that can create an ERD from a JDBC connection can also create an ERD from a Timbr ontology.
How does Timbr enable digital twins?
Timbr helps enterprises create digital twins by enabling the definition of the virtual model using SQL ontologies and by connecting the virtual model to data lakes that contain the sensors’ data.
Does Timbr integrate with data catalogs?
Timbr can turn any data catalog into a knowledge catalog of the business, automatically generating the semantic model. Timbr can also work with the data catalog’s business glossary and data mappings.
What is the correct way to query Timbr?
In Timbr, you query concepts the same way you would query tables, so you can use the underlying SQL engine’s functions seamlessly. Timbr, like many SQL engines, has a list of reserved keywords, so it is important to quote identifiers (column names and concept names) according to the underlying SQL engine.
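As an illustration, the quote character differs by engine (backticks for Hive/Spark/MySQL, double quotes for ANSI-style engines such as PostgreSQL). A minimal sketch of a quoting helper follows; the engine-to-character mapping is a general SQL convention, not Timbr configuration:

```python
# Quote an identifier with the character expected by the underlying
# SQL engine. The engine list here is illustrative.
QUOTE_CHARS = {
    "hive": "`",
    "spark": "`",
    "mysql": "`",
    "postgresql": '"',
}

def quote_identifier(name, engine):
    q = QUOTE_CHARS[engine]
    # Double any embedded quote character to escape it.
    return q + name.replace(q, q * 2) + q

print(quote_identifier("order", "spark"))       # `order`
print(quote_identifier("order", "postgresql"))  # "order"
```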
How does Timbr connect to business intelligence tools?
Timbr supports both JDBC and ODBC. We reuse the Thrift server protocol of Apache Hive and Spark. This means you can connect to Timbr’s Knowledge Graph using Hive/Spark JDBC/ODBC drivers (most BI tools already embed them, so no installation is needed).
Can you use GraphQL with Timbr?
Yes, GraphQL is supported by integrating external open source projects that support the translation of GraphQL to SQL.
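As a toy illustration of that translation (not any specific project’s implementation), a flat GraphQL selection over a single concept maps naturally onto a SELECT statement:

```python
import re

def graphql_to_sql(query):
    """Translate a flat, single-concept GraphQL query such as
    '{ person { name age } }' into a SELECT statement.
    A toy sketch; real translators handle nesting, arguments, etc."""
    match = re.match(r"\s*\{\s*(\w+)\s*\{\s*([\w\s]+?)\s*\}\s*\}\s*$", query)
    if not match:
        raise ValueError("unsupported query shape")
    concept, fields = match.group(1), match.group(2).split()
    return f"SELECT {', '.join(fields)} FROM {concept}"

print(graphql_to_sql("{ person { name age } }"))
# SELECT name, age FROM person
```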
Does Timbr support JSON and XML?
Timbr supports query of JSON and XML using Apache Spark.
Is it possible for SPARQL users to convert directly to Timbr’s SQL?
Yes, moving from SPARQL to Timbr’s simplified SQL is straightforward.
Can Python users connect to Timbr using SQLAlchemy?
Yes, Timbr works extensively with SQLAlchemy. Another valid option for Python users is DataFrames.
Is Timbr compatible with OWL-DL and OWL-2 inferences? Is there an option to add more inferences?
Yes, Timbr is compatible with OWL-DL and some OWL 2 inferences. If there is clear business value in adding more OWL 2 inferences, we can support them as well. Timbr’s inference engine is based on query-rewriting techniques. If Timbr encounters slow queries or poor performance, it can materialize the specific part of the knowledge that is required.
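To illustrate the query-rewriting idea in miniature (a simplified sketch, not Timbr’s actual rewriter): a query over a concept can be rewritten as a UNION ALL over the mappings of the concept and its subclasses, so inference happens at query time rather than by materializing data. The hierarchy and table names below are invented for the example:

```python
# Toy query rewriter: a SELECT over a concept expands into a
# UNION ALL over the tables mapped to the concept's hierarchy.
SUBCLASSES = {"person": ["person", "employee", "customer"]}
MAPPINGS = {
    "person": "crm.persons",
    "employee": "hr.employees",
    "customer": "sales.customers",
}

def rewrite(concept, columns):
    selects = [
        f"SELECT {', '.join(columns)} FROM {MAPPINGS[c]}"
        for c in SUBCLASSES.get(concept, [concept])
    ]
    return "\nUNION ALL\n".join(selects)

print(rewrite("person", ["id", "name"]))
```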
Does Timbr work in a Hybrid/multi-Cloud environment?
Yes. Timbr provides a comprehensive solution for integrating multiple databases in varied locations. In terms of deployment, Timbr is deployed on Kubernetes or Docker, at the user’s choice. Timbr also supports multi-cluster deployments, so users can deploy Timbr on Azure, Google Cloud, or AWS. In general, Timbr recommends the cloud because of the managed services, though Timbr can also run on-premise. The user can decide whether to run queries locally on-premise or in the cloud to benefit from Timbr’s multi-cluster deployment.
Can rules be used to filter data when mapping databases to SQL ontologies?
Timbr supports applying rules to concepts to classify the data or embed business logic in the ontology. For example:
- Adult: Person where age > 21
- ExpensiveProduct: Product where price > 1000
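A minimal sketch of what such a classification rule expresses, with plain Python standing in for the ontology rule (the rule comes from the Adult example above; the sample rows are invented):

```python
# Classify rows of the Person concept into the rule-defined
# subclass Adult (Person where age > 21).
people = [
    {"name": "dana", "age": 34},
    {"name": "omri", "age": 17},
]

def adults(rows):
    return [row for row in rows if row["age"] > 21]

print(adults(people))  # [{'name': 'dana', 'age': 34}]
```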
How do you specify physical keys to be used for joining data?
Timbr allows creating virtual PKs for concepts (used as unique identifiers) and FKs to PKs in the ontology (used as relationships between concepts). As long as the ontology author maps the physical tables’ PKs to the ontology PKs, client joins will follow these declarations. In the ontology, you create relationships between concepts using FK statements; in each relationship, you specify the properties in the ontology that represent the relationship (used for the JOIN).
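To illustrate how a declared FK-to-PK relationship determines the JOIN columns, here is a plain-Python sketch; the concepts, key names, and data are invented for the example:

```python
# A relationship declared as FK person.company_id -> company.id
# tells the engine which columns to JOIN on.
persons = [{"id": 1, "name": "dana", "company_id": 10}]
companies = {10: {"id": 10, "name": "acme"}}

def join_on_fk(rows, target_by_pk, fk):
    # Resolve each row's FK against the target concept's PK index.
    return [
        {**row, "company_name": target_by_pk[row[fk]]["name"]}
        for row in rows
    ]

print(join_on_fk(persons, companies, "company_id"))
```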
Is there any option to build or modify an ontology programmatically?
Yes, Timbr is accessible via JDBC/ODBC, and the ontology can be created programmatically using Timbr SQL DDL statements:
CREATE CONCEPT (extension of CREATE TABLE statement)
CREATE MAPPING (extension of CREATE VIEW statement)
In many cases, we build small scripts to generate parts of the ontology programmatically.
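A sketch of such a generator script follows. The CREATE CONCEPT statement name comes from the list above, but the column syntax inside the statement is an assumption for illustration, not verified Timbr DDL:

```python
# Generate CREATE CONCEPT statements from a simple schema dict.
# Statement body syntax is illustrative, not verified Timbr DDL.
schema = {
    "person": {"name": "string", "age": "int"},
    "company": {"name": "string"},
}

def create_concept_ddl(concept, columns):
    cols = ",\n  ".join(f"{col} {typ}" for col, typ in columns.items())
    return f"CREATE CONCEPT {concept} (\n  {cols}\n)"

statements = [create_concept_ddl(c, cols) for c, cols in schema.items()]
print(";\n\n".join(statements))
```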
What is the ontology definition format (binary, XML, etc.)?
The ontology definition is in SQL DDL statements and it is stored in Timbr’s internal metadata DB.
You can access the ontology definitions using Timbr system tables: SYS_CONCEPTS, SYS_RELATIONSHIPS, SYS_PROPERTIES, SYS_ONTOLOGY, SYS_MAPPINGS, etc.
What is the approach for promoting content from test to prod environment? Does it require platform’s downtime?
No, any change to the ontology is immediately reflected for all users. This means no downtime.
We plan to support Git-like behavior, so ontologies can be deployed in a similar way to code.
What is the correct way to do graph traversals in dtimbr?
Every relationship path you write in a dtimbr query must be surrounded by quotes in its entirety (the quote character depends on the underlying SQL engine).
The pattern of the most basic relationship is <relationship_name>[<target_concept_name>].<property_name>
For example: your knowledge graph has a concept named person with a relationship named works_at to another concept named company.
If you want to get:
- the name of the person
- the name of the company where the person works
You could write:
SELECT
name,
works_at[company].name
FROM dtimbr.person
You can also chain relationships to traverse more concepts.
For example: following from the previous example, let’s say the concept company has a relationship named located_in to another concept named city.
If you want to get:
- the name of the person
- the name of the company where the person works
- the name of the city where the company is located
You could write:
SELECT
name,
works_at[company].name,
works_at[company].located_in[city].name
FROM dtimbr.person
For more information, see Creating relationships in Timbr.
What happens when I write SELECT * from a dtimbr concept?
When you use SELECT * on a dtimbr concept, Timbr initially retrieves only the properties of the specified concept. It does not perform JOINs with other concepts unless explicitly required by the query’s path. This prevents the backend SQL engine from executing potentially large and unnecessary JOINs. However, this behavior is configurable: users can adjust settings to include relationships in the results, tailoring query execution to their preferences and requirements.
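A sketch of that expansion rule (the concept schema below is invented, not Timbr internals): SELECT * expands only to the concept’s own properties, while relationship paths must be named explicitly to trigger a JOIN:

```python
# Expand SELECT * for a concept: only the concept's own properties
# are included; relationship paths (e.g. works_at[company].name)
# appear only when explicitly listed. Schema is illustrative.
CONCEPTS = {
    "person": {"properties": ["id", "name", "age"],
               "relationships": ["works_at"]},
}

def expand_select(concept, columns):
    if columns == ["*"]:
        columns = CONCEPTS[concept]["properties"]
    return f"SELECT {', '.join(columns)} FROM dtimbr.{concept}"

print(expand_select("person", ["*"]))
print(expand_select("person", ["name", "works_at[company].name"]))
```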