Databricks is a cloud data lakehouse that runs on AWS, Azure, or GCP and serves as a central store for operational, transactional, and financial data at companies in retail, healthcare, financial services, and manufacturing. Connecting Databricks to Parabola lets ops, finance, and supply-chain teams query lakehouse tables directly and feed the results into automated workflows, without writing code or filing a ticket with data engineering.

Pull from Databricks

The Pull from Databricks step runs a SQL query against your Databricks SQL Warehouse and loads the result into your flow. You write the query, pick the warehouse to run it on, and Parabola handles the rest.

How to authenticate

The step uses personal access token authentication. You need two things from your Databricks workspace: your workspace URL and a personal access token.

Find your workspace URL

Your workspace URL is the address you see in the browser when you’re logged into Databricks. The format depends on your cloud provider:
  • AWS: https://dbc-a1b2345c-d6e7.cloud.databricks.com
  • Azure: https://adb-5555555555555555.19.azuredatabricks.net
  • GCP: https://1234567890123456.7.gcp.databricks.com
Copy the base URL — everything before any ? or # characters.
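
As a rough illustration of the rule above, this Python sketch trims an address copied from the browser down to the base URL. The function name `workspace_base_url` is hypothetical, and dropping the path along with the query string and fragment is an assumption based on the example URLs shown above.

```python
from urllib.parse import urlsplit


def workspace_base_url(address: str) -> str:
    """Return scheme and host only, dropping any path, query string, and fragment."""
    parts = urlsplit(address.strip())
    if parts.scheme != "https" or not parts.netloc:
        raise ValueError(f"not an https workspace address: {address!r}")
    return f"{parts.scheme}://{parts.netloc}"


# Example: an Azure workspace address copied with extra query and fragment parts.
print(workspace_base_url(
    "https://adb-5555555555555555.19.azuredatabricks.net/?o=5555555555555555#setting/account"
))
```

The printed value is the form to paste into the workspace URL field.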

Generate a personal access token in Databricks

  1. In your Databricks workspace, click your username in the top bar and select Settings.
  2. Click Developer in the left sidebar.
  3. Next to Access tokens, click Manage.
  4. Click Generate new token.
  5. Enter a description (e.g., “Parabola integration”) and set a lifetime in days, then click Generate.
  6. Copy the token and save it somewhere safe. It’s only displayed once.
If you don’t see the option to create a token, ask your Databricks workspace admin to enable personal access tokens for your account.

Connect in Parabola

  1. In your flow, add a Pull from Databricks step.
  2. Click Authorize and enter your workspace URL and personal access token when prompted.
  3. Pick the SQL Warehouse you want to run your query against. The dropdown shows the warehouses your token has access to. If you’re not sure which to use, ask your data team which one is set up for reporting queries.
Parabola stores your credentials securely so you don’t need to re-authenticate on every run.
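
If the warehouse dropdown comes back empty, it can help to check the token outside Parabola. The sketch below builds, but does not send, a request to the Databricks REST endpoint that lists SQL Warehouses. It is an independent check, not a description of how Parabola connects. The function name and the token value are placeholders.

```python
import urllib.request


def build_list_warehouses_request(base_url: str, token: str) -> urllib.request.Request:
    """Build (without sending) a GET request for the SQL Warehouses list endpoint."""
    url = base_url.rstrip("/") + "/api/2.0/sql/warehouses"
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {token}"},  # personal access token as a bearer token
        method="GET",
    )


# Placeholder values; substitute your own workspace URL and token.
req = build_list_warehouses_request(
    "https://dbc-a1b2345c-d6e7.cloud.databricks.com", "dapi-EXAMPLE"
)
print(req.get_method(), req.full_url)
# To actually send it, pass the request to urllib.request.urlopen(req).
```

A successful response means the token is valid. A 401 or 403 response suggests the token is expired, mistyped, or lacks access.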

Available data

The Pull from Databricks step lets you run any SQL query against your Databricks tables and pull the results into your flow. Because you write the query, you can access anything stored in the workspace your token can read:
  • Any table or view in your data lakehouse — query tables across catalogs and schemas using standard SQL, whether that’s order records, customer data, inventory levels, financial transactions, or anything else your data team has made available.
  • Aggregated and transformed data — write queries with filters, joins, groupings, and calculations so you pull exactly the data you need rather than full raw tables.
  • Saved analyst queries — re-run a query your data team has already written for a dashboard, then route the same output through Parabola to a new destination.
A SQL Warehouse is the compute resource in Databricks that runs your query. Your workspace may have one or several warehouses, and the step lists the ones your token can access.
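
To make the "aggregated and transformed data" case concrete, here is a minimal sketch of a query that joins, filters, and groups across `catalog.schema.table` references. The SQL is wrapped in a small Python helper only so the date range can be filled in. All table and column names (`main.sales.orders`, `main.sales.customers`, and so on) are hypothetical, so substitute the ones your data team provides. The string that the helper returns is what you would paste into the step.

```python
from datetime import date


def build_order_summary_query(start: date, end: date) -> str:
    """Return a filtered, aggregated query over hypothetical order and customer tables."""
    return f"""
SELECT
  c.region,
  COUNT(*) AS order_count,
  SUM(o.total_amount) AS revenue
FROM main.sales.orders AS o          -- catalog.schema.table (hypothetical)
JOIN main.sales.customers AS c       -- hypothetical table
  ON o.customer_id = c.customer_id
WHERE o.order_date >= DATE '{start.isoformat()}'
  AND o.order_date <  DATE '{end.isoformat()}'
GROUP BY c.region
ORDER BY revenue DESC
""".strip()


print(build_order_summary_query(date(2026, 5, 1), date(2026, 6, 1)))
```

Because the dates are `date` objects rather than free text, the helper can only produce well-formed date literals.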

Common use cases

  • Reconcile orders across systems: Pull order data from Databricks and compare it against Shopify, Amazon Seller Central, or NetSuite to flag missing orders, status mismatches, or revenue gaps before they hit customers.
  • Automate recurring ops reports: Pull key metrics from Databricks on a schedule, format the result in Parabola, and route it to Slack, Google Drive, or Smartsheet without manual exports.
  • Monitor fulfillment and supply-chain performance: Combine Databricks data with carrier records from FedEx, UPS, or ShipStation, or with WMS records from ShipBob or ShipHero, to flag SLA breaches and inventory shortfalls.
  • Power financial reconciliation: Pull transaction data, revenue figures, or cost records from Databricks and match them against invoices, purchase orders, or GL entries from QuickBooks Online, NetSuite, or SuiteQL.
  • Build cross-platform dashboards: Merge Databricks analytics with live operational data from Gorgias, Zendesk, or HubSpot so teams work from a single view.
  • Push clean data to working tools: Query and transform Databricks data in Parabola, then push the result into Airtable, Smartsheet, or Google Drive for the team that actually uses it.
  • Sync to other warehouses: Move clean, filtered slices of Databricks data into Snowflake, BigQuery, or Redshift for downstream BI without copying full tables.

Tips for using Parabola with Databricks

  • Write focused queries with filters. Add WHERE clauses for date ranges or statuses to pull only the rows you need. Avoid SELECT * on multi-million row tables.
  • Match cadence to use case. Hourly for time-sensitive ops monitoring, daily for standard reporting, weekly for performance summaries.
  • Ask your data team for the first query. Once the SQL is in your flow, it runs on its own with no further data team involvement, so the up-front help pays off.
  • Combine Databricks with other sources. Pull inventory from Databricks and merge it with Shopify orders, UPS tracking, or Gorgias tickets in the same flow.
  • Find your SQL Warehouse ID. In the Databricks sidebar, go to SQL Warehouses, click the warehouse name, and open the Connection Details tab.
  • Track token expiration. Personal access tokens expire at the end of the lifetime you set. Note the expiration date and rotate the token before it lapses so flows don’t fail silently.
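
For the token expiration tip, the date arithmetic can be sketched as follows. The helper names and the 14-day warning window are assumptions for illustration, not Parabola or Databricks settings.

```python
from datetime import date, timedelta


def token_expiry(created: date, lifetime_days: int) -> date:
    """Expiry date of a token created on `created` with the given lifetime in days."""
    return created + timedelta(days=lifetime_days)


def needs_rotation(created: date, lifetime_days: int, today: date, warn_days: int = 14) -> bool:
    """True once `today` is within `warn_days` of the expiry date, or past it."""
    return today >= token_expiry(created, lifetime_days) - timedelta(days=warn_days)


created = date(2026, 5, 18)
print(token_expiry(created, 90))                      # 2026-08-16
print(needs_rotation(created, 90, date(2026, 8, 1)))  # False
print(needs_rotation(created, 90, date(2026, 8, 5)))  # True
```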

FAQ

Can I push data back into Databricks?

The Pull from Databricks step is read-only. To write to a Databricks table, use a Send to an API step pointed at the Databricks SQL Statement Execution API with your personal access token, or push the output into cloud storage that your Databricks job ingests.
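
As a sketch of what such a write request might look like, the Python below builds, but does not send, a POST to the SQL Statement Execution endpoint. The warehouse ID, the table `main.ops.order_flags`, and the statement are hypothetical. The same URL, headers, and JSON body would go into the Send to an API step. Check the Databricks API reference for the full set of request fields before relying on it.

```python
import json
import urllib.request


def build_statement_request(
    base_url: str, token: str, warehouse_id: str, statement: str
) -> urllib.request.Request:
    """Build (without sending) a POST request that submits one SQL statement."""
    payload = {
        "warehouse_id": warehouse_id,  # the warehouse that will run the statement
        "statement": statement,
        "wait_timeout": "30s",         # wait up to 30 seconds for a result
    }
    return urllib.request.Request(
        base_url.rstrip("/") + "/api/2.0/sql/statements/",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder values throughout; none of these identifiers are real.
write_req = build_statement_request(
    "https://dbc-a1b2345c-d6e7.cloud.databricks.com",
    "dapi-EXAMPLE",
    "abc123def4567890",
    "INSERT INTO main.ops.order_flags (order_id, reason) VALUES ('1001', 'missing in Shopify')",
)
print(write_req.get_method(), write_req.full_url)
print(json.loads(write_req.data)["warehouse_id"])
```

The token used here needs write permission on the target table, which a token set up only for reporting may not have.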

Which clouds are supported?

All three: AWS, Azure, and GCP. The workspace URL format differs by cloud (see above), but the auth and step configuration are the same.

Why is my query timing out?

SQL Warehouses stop automatically when idle and can take up to a few minutes to start back up, so the first query after a quiet period may be slow or time out. If that happens often, check the warehouse’s auto-stop setting with your data team, or pick a larger warehouse for queries that scan large tables. Also confirm you’re filtering on partitioned columns where applicable.

Do I need Unity Catalog?

No. The step works with both Unity Catalog and the Hive metastore. As long as your SQL Warehouse can resolve the table reference in your query, Parabola can pull the result.

Can I use service principals instead of a personal access token?

Today the step authenticates with personal access tokens. For a service-account-style setup, generate the token under a dedicated user in your Databricks workspace and rotate it on a schedule.
With Databricks and Parabola connected, the operational reports, reconciliations, and alerts that depend on lakehouse data run on a schedule, with output landing in the systems where your team actually works.
Last modified on May 18, 2026