Databricks is a unified, open analytics platform for building, deploying, sharing, and maintaining enterprise-grade data, analytics, and AI solutions at scale. You can use Databricks to process, store, clean, share, analyze, model, and monetize your datasets with solutions from BI to machine learning. With the Databricks platform, you can also build and deploy data engineering workflows, machine learning models, analytics dashboards, and more.
Direct integration with Databricks
With the direct integration of Blueshift and Databricks, you can easily import customer data, catalogs, and real-time customer interaction data (events) into your Blueshift account and build an up-to-date, 360-degree view of all your customers.
Complete the prerequisites and then set up the direct integration with Databricks. After you have set up the integration, you can create a task to import customer data, catalogs, or events data into Blueshift.
Prerequisites
Before you can integrate with Blueshift, you must have your Databricks account set up.
Obtain the following information required to set up integration with Blueshift:
- Workspace URL: The URL of your Databricks workspace. You can find this in the address bar when you open your Databricks instance in the browser. For more information, see the Databricks documentation for Identifiers for workspace objects.
- SQL Warehouse HTTP path: The HTTP path for the SQL warehouse that you will run queries on. You can find the HTTP path by clicking SQL Warehouses in the left navigation, selecting the required warehouse, and then navigating to the Connection details tab. For more information, see the Databricks documentation for Connection details for a SQL warehouse.
- Catalog: The name of the catalog in Databricks that contains the data you want to import.
- Schema name: The name of the schema from which you want to import data.
- Access token: Generate an access token for authentication. Copy the access token and save it.
  Important: If you do not copy and save the access token, you cannot access it again and must generate a new token.
  - The access token must have permissions to access the SQL warehouse.
  - The access token must have permissions to access the catalog and schema, and must have read access on the tables and views in them (a sketch of granting these privileges follows this list).
  For more information, see Databricks personal access tokens for workspace users.
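How you grant these permissions depends on how your workspace is governed. As a minimal sketch, assuming Unity Catalog and the Databricks SQL Connector for Python (pip install databricks-sql-connector), a workspace admin could grant the read access described above to the user who owns the token. All names and connection values below are placeholders:

```python
# Sketch: grant the minimum read privileges the Blueshift token needs.
# Assumes Unity Catalog; run by a workspace admin with their own token.
# All names and connection values are placeholders.
from databricks import sql

CATALOG = "your_catalog"
SCHEMA = "your_schema"
PRINCIPAL = "`integration-user@example.com`"  # user (or group) that owns the token

grants = [
    f"GRANT USE CATALOG ON CATALOG {CATALOG} TO {PRINCIPAL}",
    f"GRANT USE SCHEMA ON SCHEMA {CATALOG}.{SCHEMA} TO {PRINCIPAL}",
    # SELECT on the schema covers every table and view in it.
    f"GRANT SELECT ON SCHEMA {CATALOG}.{SCHEMA} TO {PRINCIPAL}",
]

with sql.connect(
    server_hostname="your-workspace.cloud.databricks.com",  # workspace URL, without https://
    http_path="/sql/1.0/warehouses/your-warehouse-id",
    access_token="dapi-admin-token",
) as connection:
    with connection.cursor() as cursor:
        for statement in grants:
            cursor.execute(statement)
```

Access to the SQL warehouse itself (the Can use permission) is granted separately, from the warehouse's Permissions dialog in the Databricks UI.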
Set up integration
To give Blueshift access to the data in Databricks, complete the following steps:
- Sign in to the Blueshift app, and click App Hub in the left navigation menu.
- Go to All Apps, search for Databricks, and select it. You can also go to Data Warehouse Apps and select Databricks.
- Click Configure to view all the configured adapters.
- Click +ADAPTER to add an adapter.
- Add a Name for the adapter. If you have multiple adapters, the adapter name helps you to identify the integration.
- Provide the details of the data warehouse and the access token to access the data warehouse (you can verify these values with the sketch after these steps):
  - Workspace URL: The URL of your Databricks workspace.
  - SQL Warehouse HTTP path: The HTTP path for the SQL warehouse that you will run queries on.
  - Catalog: The catalog in Databricks that contains your data.
  - Schema name: The name of the schema from which you want to import data.
  - Access token: The access token for authentication.
- Click Save.
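If the adapter fails to save, or a later import task cannot read your data, you can verify the same values independently of Blueshift. Here is a minimal sketch using the Databricks SQL Connector for Python (pip install databricks-sql-connector); every value is a placeholder for your own:

```python
# Sketch: verify the adapter values work end to end.
# All values are placeholders for your own.
from databricks import sql

CATALOG = "your_catalog"
SCHEMA = "your_schema"

with sql.connect(
    server_hostname="your-workspace.cloud.databricks.com",  # Workspace URL, without https://
    http_path="/sql/1.0/warehouses/your-warehouse-id",      # SQL Warehouse HTTP path
    access_token="dapi-your-token",                         # Access token
) as connection:
    with connection.cursor() as cursor:
        # Confirms the token can reach and run queries on the warehouse.
        cursor.execute("SELECT 1")
        print(cursor.fetchall())

        # Confirms the token can see the catalog and schema; an error here
        # usually points to a missing permission.
        cursor.execute(f"SHOW TABLES IN {CATALOG}.{SCHEMA}")
        for row in cursor.fetchall():
            print(row)
```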
Next steps
Create a task to import customer data, catalogs, or events data into Blueshift.