
Pull from email attachment

The Pull from email attachment step gives you the ability to receive file attachments (CSV, XLS, PDF, or JSON files) from an incoming email and pass them to the next step. The step also gives you the ability to pull an email subject and body into a Parabola Flow. Use this unique step to trigger Flows, using content from the email itself.

Note: PDF file support is currently offered to users on our Advanced Plan. Check out the Pricing Page for additional information.

Default attachment settings

To begin, take note of the generated email address that is unique to this specific flow. Copy the email address to your clipboard to start using this dedicated email address yourself or to share with others.

The File Type is set to CSV / TSV, though you can also receive XLS / XLSX, PDF, or JSON files.

The Delimiter is set to comma (,) by default, but can be changed to tab (\t) or semicolon (;). If needed, the default Quote Character, double quote ("), can be changed to single quote (').
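To make these two settings concrete, here is a minimal Python sketch using the standard csv module. It is not Parabola's implementation; the function name parse_delimited and the sample data are hypothetical and only illustrate how the delimiter and quote character change the way a line is split.

```python
import csv
import io


def parse_delimited(text, delimiter=",", quotechar='"'):
    """Parse delimited text into a list of rows (each row is a list of strings)."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter, quotechar=quotechar)
    return [row for row in reader]


# Hypothetical semicolon-delimited file whose values are wrapped in single quotes.
# The second value contains the delimiter, so the quote character matters.
sample = "sku;description\n'A-1';'Widget; large'\n"
rows = parse_delimited(sample, delimiter=";", quotechar="'")
print(rows)  # [['sku', 'description'], ['A-1', 'Widget; large']]
```

With the wrong quote character, the value "Widget; large" would be split into two cells, which is usually the first sign that these settings need adjusting.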

Custom settings

This step contains optional Advanced settings, where you can tell Parabola to skip a certain number of rows or columns when receiving the attached file.
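As a rough illustration of what skipping rows and columns means for the incoming data, here is a small Python sketch. It assumes rows are skipped from the top and columns from the left, which is an assumption rather than documented behavior; the function name skip_rows_and_columns is hypothetical.

```python
def skip_rows_and_columns(rows, skip_rows=0, skip_columns=0):
    """Drop the first `skip_rows` rows and the first `skip_columns` cells of each row.

    Assumption: skipping starts at the top-left corner of the file.
    """
    return [row[skip_columns:] for row in rows[skip_rows:]]


# Hypothetical export with a title row above the header and an index column on the left.
raw = [
    ["Report generated 2024-01-01", "", ""],
    ["#", "sku", "qty"],
    ["1", "A-1", "2"],
    ["2", "B-7", "5"],
]
table = skip_rows_and_columns(raw, skip_rows=1, skip_columns=1)
print(table)  # [['sku', 'qty'], ['A-1', '2'], ['B-7', '5']]
```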

Auto-forwarding to a Parabola flow

To auto-forward a CSV attachment to an email address outside of your domain, you may need to verify the @inbound.parabola.io email address. The example below shows how to set this up in Gmail.

  1. Start by copying the email address provided in the step configuration settings to your clipboard.
  2. In Gmail, head to your settings and select the Forwarding and POP/IMAP tab at the top of the page. Select Add a forwarding address and paste the email address into the form.
  3. A new modal will pop up letting you know a confirmation code has been sent to the @inbound.parabola.io email address. Click OK.
  4. Check your inbox to see a new email with the subject line "Sorry we were unable to process your email attachment". The body of the email will contain a confirmation code and a verification link, both of which can verify the email address.
  5. Click the link and a new window will appear. Click Confirm to start forwarding mail.
  6. Once complete, a Confirmation Success! page will confirm that the @inbound.parabola.io email is verified.
  7. Next, head to your settings and create a filter to target the emails you want to auto-forward. Select Create filter once complete.
  8. Lastly, select Forward it to: and choose the @inbound.parabola.io email address that was recently verified.

Auto-forwarding is now set up to trigger your flow! Please note, you will need to do this each time you create a new flow using this step.

Pull multiple file attachments

By default, Flows will run with the first valid attached file. If you want the Flow to run through multiple attached files (multiple attachments on one email), open the “Email trigger settings” modal and change the setting to “Run the Flow once per attachment.”

(Access these settings from the Pull from Email attachment step, or from the Flow trigger settings on the published Flow page.)

For emails with multiple files attached, the Flow will run once per file received, sequentially.

  • Files must be of the same type (CSV, XLS, PDF, or JSON) for the runs to process.
  • The file type is defined in the initial step settings (“File type” dropdown).
  • Any files received that are of a different type will cause a Flow run error.
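The rules above can be sketched in Python as follows. This is only an illustration of the described behavior: one run per attachment, in order, with an error for any file that does not match the configured type. Matching files by extension, the EXTENSIONS mapping, and the function name plan_runs are all assumptions made for the example.

```python
from pathlib import PurePath

# Hypothetical mapping from the "File type" setting to accepted file extensions.
EXTENSIONS = {
    "CSV": {".csv", ".tsv"},
    "XLS": {".xls", ".xlsx"},
    "PDF": {".pdf"},
    "JSON": {".json"},
}


def plan_runs(filenames, file_type):
    """Return one (filename, status) pair per attachment, in the order received."""
    allowed = EXTENSIONS[file_type]
    runs = []
    for name in filenames:
        matches = PurePath(name).suffix.lower() in allowed
        runs.append((name, "ok" if matches else "error"))
    return runs


attachments = ["orders.csv", "returns.CSV", "invoice.pdf"]
runs = plan_runs(attachments, file_type="CSV")
print(runs)  # [('orders.csv', 'ok'), ('returns.CSV', 'ok'), ('invoice.pdf', 'error')]
```

In practice this means senders should avoid mixing file types on a single email when the Flow is set to run once per attachment.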

Pull subject and body

We also support the ability to pull in additional information about an email, including:

  • Email Body
  • Subject Line
  • Sender email address
  • CC'd Emails
  • File Attachment Name

To access these fields, you can toggle the "Pull data from" field to pull in Email subject and body. If you'd like to pull both an attachment and the subject and body, you can use two separate steps to pull in both of these datasets.
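For readers who want to see where these fields live in a raw email, the sketch below builds a sample message with Python's standard email package and reads the same five fields from it. It is not how Parabola processes mail; the function name email_fields, the addresses, and the dictionary keys are hypothetical.

```python
from email.message import EmailMessage


def email_fields(message):
    """Collect subject, sender, CC, plain-text body, and attachment names."""
    body_part = message.get_body(preferencelist=("plain",))
    body = body_part.get_content().strip() if body_part is not None else ""
    return {
        "subject": str(message["Subject"] or ""),
        "sender": str(message["From"] or ""),
        "cc": str(message["Cc"] or ""),
        "body": body,
        "attachment_names": [part.get_filename() for part in message.iter_attachments()],
    }


# Build a hypothetical email with a text body and one CSV attachment.
message = EmailMessage()
message["Subject"] = "Daily orders"
message["From"] = "sender@example.com"
message["To"] = "flow@example.com"
message["Cc"] = "ops@example.com"
message.set_content("Orders attached.")
message.add_attachment(
    b"sku,qty\nA-1,2\n",
    maintype="application",
    subtype="octet-stream",
    filename="orders.csv",
)

fields = email_fields(message)
print(fields["subject"], fields["attachment_names"])
```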

Pull a sheet from an Excel file based on file position

Use the "position is" option when pulling in an attached Excel document to specify which sheet to pull data from by its position, rather than its name. This is great for files that have key data in consistent sheet positions, but may not always have consistent sheet names.

When using this option, only the number of sheets that are in the last emailed file will show in the dropdown. If a Flow using these settings is run and there is no sheet in the specified position, the step will error.
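The difference between selecting by position and selecting by name can be sketched in a few lines of Python. The example assumes positions are counted from 1 and represents a workbook simply as an ordered list of sheet names; both are assumptions for illustration, and select_sheet_by_position is a hypothetical name.

```python
def select_sheet_by_position(sheet_names, position):
    """Return the sheet at a 1-based position, or raise if no sheet exists there."""
    if not 1 <= position <= len(sheet_names):
        raise IndexError(
            f"No sheet in position {position}; file has {len(sheet_names)} sheet(s)"
        )
    return sheet_names[position - 1]


# Two hypothetical weekly files: the sheet names change, the positions do not.
week_1 = ["Summary", "Orders Jan 1-7"]
week_2 = ["Summary", "Orders Jan 8-14"]

print(select_sheet_by_position(week_1, 2))  # Orders Jan 1-7
print(select_sheet_by_position(week_2, 2))  # Orders Jan 8-14
```

Asking for position 3 in either file raises an error, which mirrors the step erroring when there is no sheet in the specified position.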

Helpful tips

  • This step will run every time the dedicated email address receives a new attached file. This is useful for triggering your flow to run automatically, outside of a dedicated schedule or webhook.
  • If your XLS file has multiple sheets, this step auto-selects the first sheet but can be set to look for a specific sheet.
  • This step can handle attached files that are up to 5MB.
  • Each run of a Flow uses one file. If your Flow has multiple Pull from Email Attachment steps, they will all access the same email / file.
  • What happens when multiple emails are received by your flow: If your flow is processing and another email (or multiple) comes in, then they will queue up to be pulled into your flow in the order they were received. All emails sent to a flow (up to 1,000 total) will be queued up and processed.
  • By default, emails that are sent to Flow email addresses must have a valid attachment. You can disable that, and allow emails without attachments, by accessing the email trigger management modal and disabling the checkbox.
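The queuing tip above describes first-in, first-out processing. A minimal Python sketch of that ordering is shown below. The class EmailQueue is hypothetical, and what happens to emails beyond the 1,000 limit is not stated in this article, so the sketch simply declines to queue them as a labeled assumption.

```python
from collections import deque

MAX_QUEUED_EMAILS = 1000  # limit mentioned in the tip above


class EmailQueue:
    """First-in, first-out queue of emails waiting to trigger a Flow run."""

    def __init__(self, limit=MAX_QUEUED_EMAILS):
        self._pending = deque()
        self._limit = limit

    def receive(self, email_id):
        """Queue an email. Returns False when the queue is full (assumed behavior)."""
        if len(self._pending) >= self._limit:
            return False
        self._pending.append(email_id)
        return True

    def next_run(self):
        """Return the oldest queued email, or None when nothing is waiting."""
        return self._pending.popleft() if self._pending else None


queue = EmailQueue()
for email_id in ["email-1", "email-2", "email-3"]:
    queue.receive(email_id)

order = [queue.next_run() for _ in range(3)]
print(order)  # ['email-1', 'email-2', 'email-3']
```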

Working with data from PDF files

See below for an overview of how to understand and configure your PDF data in Parabola

Understanding your PDF data

Parabola’s Pull from PDF file step can be configured to return Columns or Keys:

  • Columns are parts of tables that are likely to have more than one row associated with them
  • Keys are single pieces of data that are applicable to the entire document. For example, “Total” rows or fields like dates that only appear once at the top of a document are best expressed as keys
  • Sometimes AI can interpret something as a column or a key that a human might consider the other. If the tool is not correctly pulling a piece of information, you might try experimenting with columns versus keys for that data point
  • Both columns and keys can be given additional information from you to ensure the tool is identifying and returning the correct information - more on that below!

Step Configuration

Selecting PDF Data

Once you have a PDF file in your flow, you will see a prompt for the second step, “Select table columns,” where you will provide information to the tool to determine what fields it should extract from the file. Parabola offers three methods for this configuration:

  1. Use an auto-detected table (default)
  2. Define a custom table
  3. Extract all data

First, we’ll outline how these choices will impact your results and then we will discuss tips and best practices for fine tuning these results:

  1. Use an auto-detected table
    • This selection, which is the default, will send the file through our PDF parsing pipeline, where our LLM will identify tables within the document, identify possible columns, name them, and extract available values.
    • Once this step finishes its first calculation, you should see a table selected with a set of columns. You can always add more columns to this original list. See the Manual Inputs section below for more info!
    • Note that initial auto-detection does not provide any keys. However, there is an option to run a full-document or key-specific auto-detect to have the tool provide these values
  2. Define a custom table
    • If you don’t want the step to take a first pass at auto-detection, or if the auto-detection is excluding columns you were hoping to extract, you can manually define a specific table. This is an advanced feature that can extract data from tables that are not obvious to the AI. Auto-detected tables are easier to work with, but if the AI did not find your table, try defining it with this custom setting.
  3. Extract all data
    • This option will primarily use OCR instead of an LLM to process your file. As a result, it is discouraged for most use cases.
    • Should you want to use this option, however, we provide four options for how you’d like your data returned:
      1. All data: this will return all of the data in the PDF, listed as one value per row
      2. Table data: this will return only data from OCR-identified tables within the PDF file. If your file has multiple tables, each will have a unique ID (which you can use to later filter results, for example), and results will be returned sequentially (e.g. table 1, then table 2, and so on). Note: tables that span multiple pages will be broken into individual tables for each page
      3. Key-Value pairs: this will return all identifiable key/value pairs, meaning things that are clearly associated or labeled, such as “color: red” or “Customer name- Parabola”
      4. Raw text: this will return all of the PDF data, in a single cell (one cell per file page). This format is most useful if you plan to apply an AI step, like Extract or Categorize
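To give a feel for what "key/value pairs" means in the list above, here is a small Python sketch that picks labeled values out of plain text. It is a simplification for illustration only: the actual option relies on OCR, and the regular expression, the separators it accepts (":" and "-"), and the function name find_key_value_pairs are all assumptions.

```python
import re

# A label, then ":" or "-", then a value, all on one line (assumed pattern).
PAIR_PATTERN = re.compile(r"^\s*([^:\-]+?)\s*[:\-]\s*(.+?)\s*$")


def find_key_value_pairs(text):
    """Return a dict of label -> value for lines that look like labeled values."""
    pairs = {}
    for line in text.splitlines():
        match = PAIR_PATTERN.match(line)
        if match:
            pairs[match.group(1)] = match.group(2)
    return pairs


# Hypothetical text using the two examples from the list above, plus an unlabeled line.
sample = "color: red\nCustomer name- Parabola\nThank you for your order\n"
pairs = find_key_value_pairs(sample)
print(pairs)  # {'color': 'red', 'Customer name': 'Parabola'}
```

The unlabeled line is ignored, which is why the Key-Value pairs option returns less than the All data or Raw text options.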

Manual Inputs

  • Parabola’s ability to make informed predictions on what fields you might be expecting from any PDF file is incredibly powerful, but you can always provide additional inputs to the tool to increase accuracy and ensure the output is aligned with your needs. Generally speaking, the more input you provide, the more accurate the results!
  • Adding columns or keys is as easy as clicking the “+ Add column” or “+ Add key” button, which will open a prompt box. Here are a few tips for best results:
    1. Column and key names can be descriptive or instructive, and do not need to match exactly what the PDF says. However, you should try to ensure the name is something that the underlying AI can associate with the desired column of data
    2. Providing examples is the best way to increase the accuracy of column (or key) parsing
    3. The “Additional instructions to find this value” field is not required. However, you can use it to give further instructions on how to identify a value, as well as instructions on how to manipulate that value. For example, suppose you want to make two distinct columns out of a single value in the file, say an order number in the format “ABC:123”. You might use the prompt “Take the order ID and extract all of the characters before the “:” into a new column”
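The order number example in the last tip describes a simple split on the ":" character. Expressed as a Python sketch, the transformation that prompt asks for looks like the following; split_order_id is a hypothetical name, and in the step itself the AI performs the split from your written instructions rather than from code.

```python
def split_order_id(value, separator=":"):
    """Split a value such as "ABC:123" into the parts before and after the separator."""
    prefix, _, suffix = value.partition(separator)
    return prefix, suffix


prefix, suffix = split_order_id("ABC:123")
print(prefix)  # ABC
print(suffix)  # 123
```

Including a before-and-after example like this in the instructions field ("ABC:123" becomes "ABC") is one way to make the prompt unambiguous.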

For example, with a handwritten document, providing more instructions enables the tool to determine whether there is writing next to the word “YES” or “NO”.

Fine Tuning

  • In addition to column-specific or key-specific inputs, you can use “Fine Tuning” to explain more holistically the type of document you are providing, as well as additional information on the expected outcome. As always, if you have an expected outcome that is not being auto-generated, more examples and inputs should help to improve accuracy

Advanced Settings

Parabola’s Pull from PDF step has four additional configurations:

  1. Use advanced text extraction. Defaults to FALSE. If toggled on, this step will use a more sophisticated version of OCR text extraction that can be helpful for complex documents such as those with handwriting. This more advanced model also makes the step run slower, so we suggest only toggling this on if you are not satisfied with the results from simple text extraction. Note that if a run fails with simple text extraction, Parabola uses advanced extraction by default when retrying
  2. Retry step on error. Defaults to TRUE. LLMs can occasionally return unexpected errors, and re-running the step will often resolve the issue. When checked, this step will automatically attempt to re-run one time when encountering an unexpected error
  3. Auto-update prompt versions. Defaults to FALSE. Occasionally Parabola updates step prompts in order to make parsing results more accurate and reliable. These updates may change output results, so auto-updating is turned off by default. Enable this setting to always use the most recent prompt versions.
  4. Page Filtering. Defaults to FALSE. Allows you to define the specific pages of a document to run through the Pull from PDF step. If you only need specific values that are consistently on the same page(s), this can drastically improve run time.

Usage tips & Other Notes

  • This step can take many minutes to run! Grab a coffee and relax while the AI does the work for you. The more document pages that are needed for parsing, the longer it may take. To expedite this process, you can configure the step to only review certain pages from your file. The fewer the pages, the faster the results!
  • If you need to pull data across multiple tables (from a single file), you will likely need multiple steps – one per table.
  • File size: PDF files must be under 500 MB and 30 pages
  • PDFs cannot be password protected
  • We recommend always auditing the results returned in Parabola to ensure that they’re complete

Using child columns

Mark columns as “Child columns” if they contain rows that have values unique from the parent columns:

Before

After marking “Size” as a child column
