What is Databricks?

Databricks is a cloud-based data platform that serves as a centralized hub where companies store, organize, and analyze large volumes of data. Running on AWS, Azure, or Google Cloud, Databricks brings together data engineering, SQL analytics, and machine learning in one place—making it a common system of record for operational, transactional, and financial data across industries like retail, healthcare, financial services, and manufacturing.

Connecting Databricks to Parabola lets operations teams pull data directly from their company’s central data platform into automated workflows—without writing code or waiting on data engineering requests. Instead of manually exporting CSVs or building custom integrations, you can query your Databricks tables from Parabola, combine the results with data from other systems like Shopify, NetSuite, or your WMS, and push outputs wherever your team needs them—all automatically and on schedule.

Pull from Databricks

How to authenticate

The Pull from Databricks step uses personal access token authentication. You’ll need two things from your Databricks workspace: your workspace URL and a personal access token.

Find your workspace URL

Your workspace URL is the web address you see in your browser when you’re logged into Databricks. It looks different depending on your cloud provider:
  • AWS: https://dbc-a1b2345c-d6e7.cloud.databricks.com
  • Azure: https://adb-5555555555555555.19.azuredatabricks.net
  • GCP: https://1234567890123456.7.gcp.databricks.com
To find it, log into your Databricks workspace and copy the base URL from your browser’s address bar—everything before any ? or # characters.
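If you want to script that cleanup, the trimming described above can be sketched in a few lines of Python. The example uses the hypothetical Azure address from the list, with an extra path and query string of the kind your browser might show:

```python
from urllib.parse import urlsplit

def workspace_base_url(address_bar_url: str) -> str:
    """Return just the scheme and host of a copied Databricks URL,
    dropping any path, query string (?...), or fragment (#...)."""
    parts = urlsplit(address_bar_url)
    return f"{parts.scheme}://{parts.netloc}"

# A copied URL often carries extra path and query pieces:
print(workspace_base_url(
    "https://adb-5555555555555555.19.azuredatabricks.net/sql/dashboards?o=123#top"
))
# https://adb-5555555555555555.19.azuredatabricks.net
```

The result is the bare workspace URL you paste into Parabola.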

Generate a personal access token in Databricks

1. In your Databricks workspace, click your username in the top bar and select Settings.
2. Click Developer in the left sidebar.
3. Next to Access tokens, click Manage.
4. Click Generate new token.
5. Enter a description (e.g., “Parabola integration”) and set a lifetime in days, then click Generate.
6. Copy the token and save it somewhere safe—it’s only displayed once. If you don’t see the option to create a token, ask your Databricks workspace admin to enable personal access tokens for your account.
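If your team manages tokens programmatically instead of through the UI, Databricks also exposes a Tokens REST API (`POST /api/2.0/token/create`). The sketch below only builds the request that endpoint expects—it doesn’t send anything, and the workspace URL and credential are placeholders:

```python
import json

# Placeholder workspace URL -- substitute your own
WORKSPACE_URL = "https://dbc-a1b2345c-d6e7.cloud.databricks.com"

# The Tokens API takes a comment and a lifetime in seconds;
# 90 days matches a typical "lifetime in days" setting in the UI.
payload = {
    "comment": "Parabola integration",
    "lifetime_seconds": 90 * 24 * 60 * 60,
}

request = {
    "method": "POST",
    "url": f"{WORKSPACE_URL}/api/2.0/token/create",
    # Authenticate with an existing workspace credential (placeholder)
    "headers": {"Authorization": "Bearer <existing-credential>"},
    "body": json.dumps(payload),
}
print(request["url"])
```

Either way, the value you need for Parabola is the token string itself, so store it securely as soon as it’s created.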

Connect in Parabola

1. In your flow, add a Pull from Databricks step.
2. Click Authorize and enter your workspace URL and personal access token when prompted.
Parabola will securely store your credentials so you don’t need to re-authenticate each time.

Available data

The Pull from Databricks step lets you run any SQL query against your Databricks tables and pull the results directly into your Parabola flow. Because you write the query, you can access virtually anything stored in your Databricks workspace:
  • Any table or view in your data lakehouse: Query tables across catalogs and schemas using standard SQL—whether that’s order records, customer data, inventory levels, financial transactions, or anything else your data team has made available.
  • Aggregated and transformed data: Write queries with filters, joins, groupings, and calculations so you pull exactly the data you need rather than entire raw tables.
  • Saved queries and reports: Re-run any SQL query your analysts have already written to pull the same data they use for dashboards and reports.
When you configure the step, you’ll select which SQL Warehouse to run your query on. A SQL Warehouse is the compute engine in Databricks that processes your queries—it does the actual work of running the SQL you write. Your workspace may have one or several warehouses available, and the step will show you the ones you have access to. If you’re not sure which to use, ask your data team which SQL Warehouse is set up for reporting or analytics queries.
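Behind the scenes, queries against a SQL Warehouse run through Databricks’ SQL Statement Execution API (`POST /api/2.0/sql/statements/`). You never call it yourself from Parabola, but a sketch of the request shape can be useful when discussing access with your data team. The workspace URL, warehouse ID, and table name below are all hypothetical, and the code only assembles the request rather than sending it:

```python
import json

# Placeholder values -- substitute your workspace URL and warehouse ID
WORKSPACE_URL = "https://dbc-a1b2345c-d6e7.cloud.databricks.com"
WAREHOUSE_ID = "abc123def456"

# A focused query using catalog.schema.table naming plus a date filter,
# the same shape of SQL you would paste into the Parabola step.
statement = """
SELECT order_id, status, ordered_at
FROM   main.sales.orders
WHERE  ordered_at >= current_date() - INTERVAL 7 DAYS
"""

body = json.dumps({
    "statement": statement,
    "warehouse_id": WAREHOUSE_ID,
    "wait_timeout": "30s",
})
endpoint = f"{WORKSPACE_URL}/api/2.0/sql/statements/"
print(endpoint)
```

The key takeaway: every query needs both the SQL itself and a warehouse to run it on, which is why the step asks you to pick one.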

Common use cases

  • Reconcile orders across systems by pulling order data from Databricks and comparing it against records in Shopify, your WMS, or your ERP to catch discrepancies, missing orders, or status mismatches before they impact customers.
  • Automate recurring operational reports by pulling key metrics from Databricks on a schedule—daily, weekly, or hourly—then formatting the results in Parabola and sending them to Slack, email, or Google Sheets without manual exports.
  • Monitor fulfillment and supply chain performance by combining Databricks data with carrier tracking information, warehouse records, or logistics platforms to flag SLA breaches, shipment delays, or inventory shortfalls in real time.
  • Power financial reconciliation workflows by pulling transaction data, revenue figures, or cost records from Databricks and matching them against invoices, purchase orders, or GL entries from your accounting system.
  • Build cross-platform dashboards by merging Databricks analytics data with live operational data from tools like NetSuite, ShipBob, Loop, or Gorgias to give your team a single, unified view of performance across systems.
  • Feed downstream tools with clean data by querying and transforming Databricks data in Parabola, then pushing the results into Google Sheets, Airtable, or any system your team works in daily.

Tips for using Parabola with Databricks

  • Write focused queries with filters to pull only the data you need. Adding WHERE clauses for date ranges or statuses keeps your flow fast and avoids pulling large amounts of historical data unnecessarily.
  • Schedule your flows strategically based on how often your data changes—run hourly for time-sensitive operational monitoring, daily for standard reporting, or weekly for performance summaries.
  • Ask your data team for help with the first query if you’re not sure which tables or columns to use. Once the SQL is set up in your Parabola flow, it runs automatically without any ongoing data team involvement.
  • Combine Databricks data with other sources in Parabola to create complete operational views. Pull inventory data from Databricks and merge it with Shopify orders, carrier tracking from UPS, or support tickets from Gorgias—all in one flow.
  • Find your SQL Warehouse ID in the Databricks sidebar by navigating to SQL Warehouses, clicking your warehouse name, and opening the Connection Details tab.
  • Note your personal access token’s expiration date and set a reminder to generate a new one before it runs out, so your flows don’t stop running unexpectedly.
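For that last tip, the reminder date is simple arithmetic: the token’s creation date plus its lifetime, minus a safety buffer. A small sketch (the dates and buffer are example values):

```python
from datetime import date, timedelta

def reminder_date(token_created: date, lifetime_days: int,
                  buffer_days: int = 7) -> date:
    """Date to regenerate the token: expiration minus a safety buffer."""
    return token_created + timedelta(days=lifetime_days - buffer_days)

# Example: a token created 2026-01-01 with a 90-day lifetime
print(reminder_date(date(2026, 1, 1), 90))
# 2026-03-25
```

Set a calendar reminder for that date so you can generate and swap in a fresh token before your flows are interrupted.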
That’s it! Once connected, you can pull any data from your Databricks lakehouse into Parabola and combine it with your other systems to automate the workflows that keep your operations running smoothly.
Last modified on March 5, 2026