Use PDF file
Parabola’s PDF parsing lets you pull in data from a PDF file using either of two steps:
- Pull from PDF file step
- Email attachment step
Upload PDF file
Use the Pull from PDF file step to work with a single PDF file. Upload a file by either dragging it into the outlined box or selecting "Click to upload a file."
At launch, you can use this step at no extra charge to your team. If usage exceeds a reasonable threshold, we may contact you to upgrade to an automation package.
Email PDF file attachment
The Email attachment step can pull in data from attached PDF files. You can adjust how the PDF is parsed with the accompanying settings.
Pulling data from PDF files
You can adjust how PDF files are parsed with the step’s accompanying settings.
When working with a PDF file, you can choose the desired data format based on how you plan to transform the data in your Parabola flow. From the “Data format” dropdown, you have the following options:
- All data: this will return all of the PDF data, organized into rows
- Table data: this will return only data from identified tables within the PDF file.
– If your file has multiple tables, each will have a unique ID (which you can use to later filter results, for example), and results will be returned sequentially (e.g. table 1, then table 2, and so on).
– Note: a table that spans multiple pages will be split into a separate table for each page.
- Key-Value pairs: this will return all identifiable key/value pairs – things that are clearly associated or labeled, such as “color: red” or “Customer name- Parabola”
- Raw text: this will return all of the PDF data as plain text, with each page of the file in its own cell. This format is most useful if you plan to apply an AI step, like Extract or Categorize.
For the “Table data” and “Key-Value pairs” formats, you can automatically pivot your results using the checkbox that appears in the step settings.
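To make the “Table data” and “Key-Value pairs” formats more concrete, here is a minimal Python sketch of the two ideas described above: filtering results down to one table by its ID, and pivoting key/value rows into a single wide record. It is only an illustration of the concepts, not Parabola’s implementation. The column names (`table_id`, `sku`, `qty`, `key`, `value`) and both helper functions are hypothetical, and the actual output columns and pivot behavior in Parabola may differ.

```python
# Hypothetical sample rows, loosely modeled on a "Table data" result.
# The column names here are assumptions made for illustration only.
table_rows = [
    {"table_id": 1, "sku": "A-100", "qty": "2"},
    {"table_id": 1, "sku": "B-200", "qty": "5"},
    {"table_id": 2, "sku": "C-300", "qty": "1"},
]

# Hypothetical sample rows, loosely modeled on a "Key-Value pairs" result.
key_value_rows = [
    {"key": "color", "value": "red"},
    {"key": "Customer name", "value": "Parabola"},
]


def filter_table(rows, table_id):
    """Keep only the rows that belong to the table with the given ID."""
    return [row for row in rows if row["table_id"] == table_id]


def pivot_key_values(pairs):
    """Turn a list of key/value rows into one record with a column per key."""
    return {pair["key"]: pair["value"] for pair in pairs}


second_table = filter_table(table_rows, 2)
wide_record = pivot_key_values(key_value_rows)

print(second_table)
print(wide_record)
```

In a flow, the equivalent of `filter_table` would be a filter on the table ID column, and the pivot checkbox stands in for something like `pivot_key_values`.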
- File size: PDF files must be under 500 MB and 3,000 pages
- Languages supported: English, French, German, Italian, Portuguese, and Spanish
- PDFs cannot be password protected
- The maximum height and width are each 40 inches (2,880 points)
- The minimum height for text to be detected is 15 pixels (~8 point font)
- We recommend always auditing the results returned in Parabola to ensure that they’re complete
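If you prepare PDF files with a script before uploading them, the limits above can be checked ahead of time. The following Python sketch is one way to do that, under stated assumptions: `check_pdf_limits` is a hypothetical helper, not part of Parabola; it expects you to supply the file size, page count, page dimensions, and password status yourself (for example, from a PDF library of your choice); it treats 1 MB as 1,000,000 bytes; and it treats 3,000 pages as the largest allowed page count. The points conversion uses the standard 72 points per inch.

```python
# Limits taken from the list above. The byte conversion is an assumption:
# 1 MB is treated as 1,000,000 bytes.
MAX_BYTES = 500 * 1000 * 1000
MAX_PAGES = 3000  # assumes 3,000 pages is the largest allowed count
POINTS_PER_INCH = 72
MAX_SIDE_POINTS = 40 * POINTS_PER_INCH  # 40 inches = 2,880 points


def check_pdf_limits(size_bytes, page_count, width_pt, height_pt,
                     password_protected):
    """Return a list of limit violations (empty if none were found).

    Hypothetical helper: the caller supplies all values, so no PDF
    library is required here.
    """
    problems = []
    if size_bytes >= MAX_BYTES:
        problems.append("file is 500 MB or larger")
    if page_count > MAX_PAGES:
        problems.append("more than 3,000 pages")
    if max(width_pt, height_pt) > MAX_SIDE_POINTS:
        problems.append("page exceeds 40 inches (2,880 points)")
    if password_protected:
        problems.append("file is password protected")
    return problems


# A 1 MB, 10-page US Letter file (612 x 792 points) with no password.
letter_result = check_pdf_limits(1_000_000, 10, 612, 792, False)

# A 600 MB file with one oversized, password-protected page.
oversized_result = check_pdf_limits(600_000_000, 1, 3000, 792, True)

print(letter_result)
print(oversized_result)
```

A check like this only covers the numeric limits. It does not tell you whether text is large enough to be detected or whether the language is supported, so auditing the results in Parabola is still worthwhile.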