Databricks is a unified, open analytics platform for building, deploying, sharing, and maintaining enterprise-grade data, analytics, and AI solutions at scale. You can use Databricks to process, store, clean, share, analyze, model, and monetize your datasets with solutions ranging from BI to machine learning. Connect Databricks with Blueshift to sync customer data, catalogs, events, and campaign reports.
Integration capabilities
| Capability | Direct integration |
|---|---|
| Import customer data, events, and catalogs | ✓ |
| Export campaign reports | ✓ |
| Incremental import (Delta tables) | ✓ |
| Real-time sync | ✓ |
Direct integration with Databricks
With direct integration, Blueshift can import customer data, catalogs, and real-time events to build a 360-degree view of your customers. You can also export campaign activity reports to Databricks and analyze them with BI tools.
Prerequisites: before you begin
- Set up your Databricks account and SQL warehouse.
- Ensure you have the required privileges to access the catalog, schema, and tables/views.
- Generate a personal access token with read access to the data you plan to import.
- Collect the following connection details:
  - Workspace URL
  - SQL Warehouse HTTP path
  - Catalog and schema names
  - Access token
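The collected values map directly onto a SQL connection. As an illustrative sketch (the helper name and placeholder values below are assumptions, not part of Blueshift's setup), the workspace URL is usually copied with an `https://` scheme, but SQL clients such as the `databricks-sql-connector` package expect a bare hostname:

```python
from urllib.parse import urlparse

def build_connection_params(workspace_url, http_path, catalog, schema, token):
    """Normalize the collected Databricks details into connection parameters.

    SQL clients typically expect a bare hostname (no scheme, no trailing
    slash) as server_hostname, so strip the scheme if one is present.
    """
    parsed = urlparse(workspace_url)
    hostname = parsed.netloc or parsed.path  # tolerate a URL without a scheme
    return {
        "server_hostname": hostname.rstrip("/"),
        "http_path": http_path,
        "catalog": catalog,
        "schema": schema,
        "access_token": token,
    }

# Example with placeholder values:
params = build_connection_params(
    "https://adb-1234567890123456.7.azuredatabricks.net",
    "/sql/1.0/warehouses/abc123",
    "main",
    "analytics",
    "dapi-xxxx",
)
print(params["server_hostname"])  # adb-1234567890123456.7.azuredatabricks.net
```

Keeping the normalization in one place makes it easy to reuse the same details for both the Blueshift adapter and any ad-hoc verification queries.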
Warehouse configuration recommendations
- Use serverless SQL warehouses: We recommend serverless warehouses for faster startup times and improved cost efficiency with asynchronous workflows. Learn about serverless types.
- Configure Auto-stop: Set an idle timeout (Auto-stop) to automatically terminate the warehouse when connections and jobs are complete. Learn how to configure Auto-stop.
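Auto-stop can also be set programmatically. A minimal sketch, assuming the Databricks SQL Warehouses REST API (`POST /api/2.0/sql/warehouses/{id}/edit` with an `auto_stop_mins` field); the helper only builds the request so the payload can be inspected before sending it with `urllib.request.urlopen`:

```python
import json
import urllib.request

def build_auto_stop_request(workspace_url, warehouse_id, token, auto_stop_mins=10):
    """Build (but do not send) a request setting a warehouse's Auto-stop timeout.

    Assumes the SQL Warehouses REST API edit endpoint; send the returned
    Request with urllib.request.urlopen against a real workspace.
    """
    url = f"{workspace_url.rstrip('/')}/api/2.0/sql/warehouses/{warehouse_id}/edit"
    body = json.dumps({"auto_stop_mins": auto_stop_mins}).encode()
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )

# Placeholder workspace, warehouse ID, and token:
req = build_auto_stop_request(
    "https://adb-1234567890123456.7.azuredatabricks.net", "abc123", "dapi-xxxx", 15
)
print(req.full_url)
```

Because Blueshift restarts the warehouse on demand for imports and exports, a short idle timeout like this keeps compute costs down without breaking workflows.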
Set up integration
To give Blueshift access to the data in Databricks, complete the following steps:
Step 1: Create an adapter in Blueshift
- Sign in to the Blueshift app and click App Hub in the left navigation menu.
- Go to All Apps, then search for and select Databricks.
  - Alternatively, go to Data Warehouse Apps and select Databricks.
- Click Configure to view all configured adapters.
- Click +ADAPTER to add an adapter.
- Add a Name for the adapter. If you have multiple adapters, the name helps you identify the integration.
Step 2: Configure access & provide warehouse details
Provide the Databricks connection details required for Blueshift to access the data.
| Field | Details |
|---|---|
| Workspace URL | The URL of your Databricks workspace. |
| SQL Warehouse HTTP path | The HTTP path of the SQL warehouse on which queries run. |
| Catalog | The Databricks catalog that contains the data to import. |
| Schema name | The name of the schema from which you want to import data. |
| Access token | The personal access token that Blueshift uses for authentication. |
Step 3: Authenticate & save
- Paste your access token in the adapter configuration.
- Click Check Access Status to verify connectivity.
- Once verified, click Save to complete setup.
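A check similar to Check Access Status can be run from your own side by executing a trivial statement against the warehouse. A hedged sketch, assuming the Databricks SQL Statement Execution API (`POST /api/2.0/sql/statements/` with `warehouse_id`, `statement`, and `wait_timeout` fields); only the request payload is built here, so it can be examined without a live connection:

```python
import json

def build_access_check(workspace_url, warehouse_id, catalog, schema):
    """Build the URL and JSON payload for a minimal connectivity probe.

    SELECT 1 against the configured catalog/schema fails fast if the token
    lacks the privileges listed in the prerequisites.
    """
    url = f"{workspace_url.rstrip('/')}/api/2.0/sql/statements/"
    payload = {
        "warehouse_id": warehouse_id,
        "catalog": catalog,
        "schema": schema,
        "statement": "SELECT 1",
        "wait_timeout": "30s",
    }
    return url, json.dumps(payload)

# Placeholder values:
url, body = build_access_check(
    "https://adb-1234567890123456.7.azuredatabricks.net", "abc123", "main", "analytics"
)
print(url)
```

POST the payload with a `Bearer` token header; a successful response confirms the same connectivity that Blueshift verifies during setup.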
Warehouse must be running
To verify connectivity, your SQL warehouse must be in a Running state.
Note: This is a one-time setup requirement. For subsequent imports and exports, Blueshift automatically starts the warehouse via API. This ensures your workflows run smoothly even if auto-shutdown is enabled.
For instructions on starting and managing your warehouse, see the Databricks documentation.
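The start-and-wait behavior described above can be reproduced for a manual check. A minimal sketch, assuming the SQL Warehouses REST API (`POST /api/2.0/sql/warehouses/{id}/start` to start a stopped warehouse, and `GET /api/2.0/sql/warehouses/{id}`, whose response reports a `state` such as `RUNNING` or `STARTING`); the helpers only interpret the response and build the URL, so they can be exercised without a live workspace:

```python
def is_ready(warehouse_info):
    """Return True once a warehouse's reported state allows queries.

    `warehouse_info` is the parsed JSON from GET /api/2.0/sql/warehouses/{id};
    transitional states like STARTING mean "wait and poll again".
    """
    return warehouse_info.get("state") == "RUNNING"

def start_url(workspace_url, warehouse_id):
    """URL to POST (with a Bearer token header) to start a stopped warehouse."""
    return f"{workspace_url.rstrip('/')}/api/2.0/sql/warehouses/{warehouse_id}/start"

print(is_ready({"state": "STARTING"}))  # False
print(start_url("https://adb-1234567890123456.7.azuredatabricks.net", "abc123"))
```

During initial setup you would start the warehouse and poll until `is_ready` returns True before clicking Check Access Status; after setup, Blueshift performs this start step for you.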
Import data
Complete the prerequisites and set up the direct integration first. After the setup is complete, you can create tasks to import recommendation feeds, customer data, catalogs, and event data into Blueshift.
Export data
Export campaign activity reports from Blueshift to Databricks to analyze campaign performance using BI or advanced analytics tools.
Exporting data via Databricks
Configure exports from Account Settings → Campaign Activity Export. Once configured, your campaign reports will be delivered directly to your Databricks warehouse.
Data imports and exports
Blueshift supports importing and exporting customer, catalog, event, and campaign activity data through integrations with data warehouses, including BigQuery, Snowflake, and Databricks. For a comprehensive overview of supported data types and connection options, refer to Data in Blueshift.