What is Databricks?
Databricks is a cloud-based data platform that serves as a centralized hub where companies store, organize, and analyze large volumes of data. Running on AWS, Azure, or Google Cloud, Databricks brings together data engineering, SQL analytics, and machine learning in one place—making it a common system of record for operational, transactional, and financial data across industries like retail, healthcare, financial services, and manufacturing. Connecting Databricks to Parabola lets operations teams pull data directly from their company’s central data platform into automated workflows—without writing code or waiting on data engineering requests. Instead of manually exporting CSVs or building custom integrations, you can query your Databricks tables from Parabola, combine the results with data from other systems like Shopify, NetSuite, or your WMS, and push outputs wherever your team needs them—all automatically and on schedule.
Pull from Databricks
How to authenticate
The Pull from Databricks step uses personal access token authentication. You’ll need two things from your Databricks workspace: your workspace URL and a personal access token.
Find your workspace URL
Your workspace URL is the web address you see in your browser when you’re logged into Databricks. It looks different depending on your cloud provider:
- AWS: https://dbc-a1b2345c-d6e7.cloud.databricks.com
- Azure: https://adb-5555555555555555.19.azuredatabricks.net
- GCP: https://1234567890123456.7.gcp.databricks.com
Copy only the base URL, without any trailing path or any ? or # characters.
Generate a personal access token in Databricks
In Databricks, click your username in the top bar and open Settings, then go to Developer > Access tokens and click Manage > Generate new token. Enter a description (e.g., “Parabola integration”) and set a lifetime in days, then click Generate. Copy the token right away—Databricks won’t show it to you again.
Connect in Parabola
In your Parabola flow, add the Pull from Databricks step and enter your workspace URL and personal access token when prompted. Parabola will securely store your credentials so you don’t need to re-authenticate each time.
Available data
The Pull from Databricks step lets you run any SQL query against your Databricks tables and pull the results directly into your Parabola flow. Because you write the query, you can access virtually anything stored in your Databricks workspace:
- Any table or view in your data lakehouse: Query tables across catalogs and schemas using standard SQL—whether that’s order records, customer data, inventory levels, financial transactions, or anything else your data team has made available.
- Aggregated and transformed data: Write queries with filters, joins, groupings, and calculations so you pull exactly the data you need rather than entire raw tables.
- Saved queries and reports: Re-run any SQL query your analysts have already written to pull the same data they use for dashboards and reports.
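For example, a focused query can filter, join, and aggregate so only the rows you need ever reach Parabola. This is just a sketch—the catalog, schema, table, and column names below (main.sales.orders, main.sales.customers, etc.) are hypothetical placeholders, so swap in the names your data team uses:

```sql
-- Daily order totals by customer segment for the last 30 days
-- (table and column names are illustrative; adjust to your workspace)
SELECT
  c.segment,
  DATE(o.created_at) AS order_date,
  COUNT(*)           AS order_count,
  SUM(o.total)       AS revenue
FROM main.sales.orders AS o
JOIN main.sales.customers AS c
  ON o.customer_id = c.customer_id
WHERE o.created_at >= CURRENT_DATE() - INTERVAL 30 DAYS
GROUP BY c.segment, DATE(o.created_at)
ORDER BY order_date DESC;
```

Aggregating in Databricks like this keeps the result set small, which makes your Parabola flow faster than pulling the raw tables and summarizing afterward.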
Common use cases
- Reconcile orders across systems by pulling order data from Databricks and comparing it against records in Shopify, your WMS, or your ERP to catch discrepancies, missing orders, or status mismatches before they impact customers.
- Automate recurring operational reports by pulling key metrics from Databricks on a schedule—daily, weekly, or hourly—then formatting the results in Parabola and sending them to Slack, email, or Google Sheets without manual exports.
- Monitor fulfillment and supply chain performance by combining Databricks data with carrier tracking information, warehouse records, or logistics platforms to flag SLA breaches, shipment delays, or inventory shortfalls in real time.
- Power financial reconciliation workflows by pulling transaction data, revenue figures, or cost records from Databricks and matching them against invoices, purchase orders, or GL entries from your accounting system.
- Build cross-platform dashboards by merging Databricks analytics data with live operational data from tools like NetSuite, ShipBob, Loop, or Gorgias to give your team a single, unified view of performance across systems.
- Feed downstream tools with clean data by querying and transforming Databricks data in Parabola, then pushing the results into Google Sheets, Airtable, or any system your team works in daily.
Tips for using Parabola with Databricks
- Write focused queries with filters to pull only the data you need. Adding WHERE clauses for date ranges or statuses keeps your flow fast and avoids pulling large amounts of historical data unnecessarily.
- Schedule your flows strategically based on how often your data changes—run hourly for time-sensitive operational monitoring, daily for standard reporting, or weekly for performance summaries.
- Ask your data team for help with the first query if you’re not sure which tables or columns to use. Once the SQL is set up in your Parabola flow, it runs automatically without any ongoing data team involvement.
- Combine Databricks data with other sources in Parabola to create complete operational views. Pull inventory data from Databricks and merge it with Shopify orders, carrier tracking from UPS, or support tickets from Gorgias—all in one flow.
- Find your SQL Warehouse ID in the Databricks sidebar by navigating to SQL Warehouses, clicking your warehouse name, and opening the Connection Details tab.
- Note your personal access token’s expiration date and set a reminder to generate a new one before it runs out, so your flows don’t stop running unexpectedly.
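As a sketch of the first tip above, a scheduled flow might pull only the most recent day of data instead of the full table. The table and column names here (main.logistics.shipments and its columns) are hypothetical—replace them with your own:

```sql
-- On each scheduled run, pull only yesterday's problem shipments
-- rather than the entire shipments history
SELECT order_id, status, carrier, shipped_at
FROM main.logistics.shipments
WHERE shipped_at >= CURRENT_DATE() - INTERVAL 1 DAY
  AND status IN ('delayed', 'exception');
```

A narrow date window like this keeps each run fast and predictable, even as the underlying table grows over time.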