Configure the steps as described in the following section.
Connecting to Databricks
Last updated on 2025-11-07
Overview
You can establish a direct connection to Databricks to pull data into xP&A.
This article contains a description of the prerequisites and the individual steps of the set-up process.
The fields to be pulled in must be defined during the set-up process using a database query. For detailed instructions on how to structure such a query, see Defining Database Queries.
Prerequisites for the Setup
Whitelist IP Address in Databricks
Before connecting Databricks with xP&A, you have to whitelist the following IP address in your Databricks database:
- 52.59.129.235
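How you whitelist the address depends on how your Databricks workspace is administered. As one possible illustration, the following sketch prepares a request for the Databricks IP access lists REST API. It assumes that the IP access list feature is enabled for your workspace and that you hold a workspace admin token. The function name, the label xpa-integration, and the example workspace URL are hypothetical. The request is only built, not sent. Check with your Databricks administrator before sending it, because allow lists affect all access to the workspace.

```python
import json
import urllib.request

# IP address that xP&A connects from, as stated in this article.
XPA_IP = "52.59.129.235"


def build_allowlist_request(workspace_url: str, admin_token: str) -> urllib.request.Request:
    """Build (but do not send) a request that adds XPA_IP to an ALLOW list.

    Assumption: the workspace exposes the IP access lists API at
    /api/2.0/ip-access-lists and admin_token belongs to a workspace admin.
    """
    payload = {
        "label": "xpa-integration",  # hypothetical label, choose your own
        "list_type": "ALLOW",
        "ip_addresses": [XPA_IP],
    }
    return urllib.request.Request(
        url=f"{workspace_url.rstrip('/')}/api/2.0/ip-access-lists",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {admin_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Example with placeholder values; urllib.request.urlopen(request) would send it.
request = build_allowlist_request("https://example.cloud.databricks.com", "<admin-token>")
```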
Connecting to Databricks
To connect to Databricks:
Choose one of the following options:
- Open the Data workspace from the overview on the start page and click + New.
- Open the model into which you would like to integrate the data, click the + sign next to Data in the overview, and choose New data source.
The New Data Source dialog is displayed as follows:
'New Data Source' dialog for Databricks
Click Create data source.
Set-up Steps
Step
Description
Choose a connection
Choose an existing connection, or, if you have not configured a connection yet, click New Connection and enter the following in the New Databricks connection dialog:
- Host name of the Databricks cluster you want to connect to
- Port of the Databricks cluster you want to connect to
- HTTP path of the workspace/warehouse you want to connect to
- Authentication type to be used when connecting to Databricks. Choose one of the following options:
  - Personal account token: If you choose this option, paste the personal access token used to access Databricks into the Token field. (For more information on how to get the access token, see Authenticate with Databricks personal access tokens.)
  - Machine-to-Machine OAuth: If you choose this option, enter the client ID and the client secret used in the machine-to-machine authentication workflow with Databricks. (For more information on the client ID and client secret, see Authenticate access to Databricks using OAuth token federation.)
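Before entering these values in the dialog, it can help to collect and sanity-check them. The following is a minimal sketch of such a check, not part of xP&A. The class ConnectionSettings and its field names are hypothetical, and the rule that the host name is entered without a scheme such as https:// is an assumption based on how Databricks usually displays server host names.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class ConnectionSettings:
    """Hypothetical container mirroring the New Databricks connection dialog."""

    host: str
    port: int
    http_path: str
    token: Optional[str] = None          # Personal account token option
    client_id: Optional[str] = None      # Machine-to-Machine OAuth option
    client_secret: Optional[str] = None  # Machine-to-Machine OAuth option

    def problems(self) -> List[str]:
        """Return a list of human-readable problems; empty means plausible."""
        found: List[str] = []
        if not self.host.strip():
            found.append("host name is empty")
        if "://" in self.host:
            # Assumption: the dialog expects a bare host name.
            found.append("host name should not include a scheme such as https://")
        if not 1 <= self.port <= 65535:
            found.append("port must be between 1 and 65535")
        if not self.http_path.strip():
            found.append("HTTP path is empty")

        uses_token = bool(self.token)
        uses_oauth = bool(self.client_id) or bool(self.client_secret)
        if uses_token == uses_oauth:
            # Either nothing was provided or both options were mixed.
            found.append("choose exactly one authentication type")
        elif uses_oauth and not (self.client_id and self.client_secret):
            found.append("Machine-to-Machine OAuth needs both client ID and client secret")
        return found


# Example with placeholder values.
settings = ConnectionSettings(
    host="dbc-example.cloud.databricks.com",
    port=443,
    http_path="/sql/1.0/warehouses/abc123",
    token="<personal-access-token>",
)
print(settings.problems())
```

An empty list means the values look plausible. It does not confirm that Databricks accepts them.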
Complete the query form
Enter the following:
- Data Source Name
- Query that defines the fields to be pulled in. For more information, see Defining Database Queries.
- Name of the date column, whose values must be in one of Databricks' date formats
- Names of the columns that contain variables (which must have a numeric data type)
Any remaining columns will be treated as dimensions, and must have a string data type.
An exception is the cohort dimension, which must be a date, with the column header explicitly labelled Cohort.
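The column rules above can be sketched as a small check that assigns each column of a query result to its role and verifies the data type. This is an illustration under assumptions, not the validation xP&A performs. The function check_columns and the example column names are hypothetical. It assumes that DATE, TIMESTAMP, and TIMESTAMP_NTZ count as date formats, that the numeric types listed below are accepted for variables, and that the Cohort header is matched exactly.

```python
from typing import Dict, List

# Assumed groupings of Databricks SQL type names; adjust to your needs.
DATE_TYPES = {"DATE", "TIMESTAMP", "TIMESTAMP_NTZ"}
NUMERIC_TYPES = {"TINYINT", "SMALLINT", "INT", "BIGINT", "FLOAT", "DOUBLE", "DECIMAL"}


def check_columns(schema: Dict[str, str], date_column: str, variables: List[str]) -> List[str]:
    """Check a column-name -> type-name mapping against the rules in this article."""
    errors: List[str] = []

    # Every column named in the form must exist in the query result.
    for required in [date_column, *variables]:
        if required not in schema:
            errors.append(f"column '{required}' is missing from the query result")

    for name, dtype in schema.items():
        # Strip parameters such as the precision in DECIMAL(18,2).
        base = dtype.upper().split("(")[0].strip()
        if name == date_column:
            if base not in DATE_TYPES:
                errors.append(f"date column '{name}' has non-date type {dtype}")
        elif name in variables:
            if base not in NUMERIC_TYPES:
                errors.append(f"variable '{name}' has non-numeric type {dtype}")
        elif name == "Cohort":
            # The cohort dimension is the one dimension that must be a date.
            if base not in DATE_TYPES:
                errors.append(f"cohort column '{name}' has non-date type {dtype}")
        elif base != "STRING":
            # All remaining columns are treated as dimensions.
            errors.append(f"dimension '{name}' has non-string type {dtype}")
    return errors


# Example with hypothetical column names.
schema = {"month": "DATE", "revenue": "DECIMAL(18,2)", "region": "STRING", "Cohort": "DATE"}
print(check_columns(schema, date_column="month", variables=["revenue"]))
```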