The Combine Tables step joins multiple tables into one by matching rows between those tables. It is similar to a VLOOKUP in Excel or Google Sheets. In SQL, it's like a join clause.
The principle is simple: if we have two tables of data that are related to each other, we can use the Combine Tables step to join them into one table by matching rows between those tables.
This step can handle combining two tables at a time. Once we set it up, we can use it repeatedly. Even if the row amount changes, our step will continue working.
Check out the video below to learn more about the Combine Tables step. (Please note that this step was formally referred to as Join.)
The Combine Tables step requires two data inputs to combine. In our example below, we have two tables feeding into it. The first one is a table of Variant IDs, Product Name, Variant Name, and Sale Price. The second one is a table of Variant IDs and Stock Quantity.
After using the Combine Tables step, our output data (shown below) has combined the "Stock Quantity" column from the second table to the first table using "Variant ID" as the matching identifier.
After connecting two datasets into this step, in the left-side toolbar choose whether you will Keep all rows or Keep only matching rows for your data sources using the drop down menus.
Then in the section above the button + add rule, click select to access the dropdown menu of columns and select which ones to join the tables by. The column you choose should be present in both datasets, have identical values, and be used as the unique key identifier between them.
Once you're done, select the button Show Updated Results to save and display the newly-combined tables.
If the datasets you want to combine don't have a shared column of identical values, then you can use the step Insert row numbers to make one in each table. You'd do this by connecting your import datasets to an Insert row numbers step for generating a column with incrementally-increasing numbers.
Other ways to combine tables:
Now we've learned one example using our default settings. Let's explore the other ways that we ca combine tables in Parabola. We can switch our Combine tables step to keep all rows that have matches in all tables. The resulting table will only contain rows that exist in every table we are combining.
The last option is to keep all rows in all tables. This will create a resulting table that has every row from every table present. If no match was found for a specific row, it will still exist. This option has a tendency to create a lot of blank cells, and can be tricky to use properly.
- The default setting for the Combine Tables step is the most common way to combine tables. It keeps the entire primary table and finds matches in the other tables, fitting them in to their matched rows as we go. If a row doesn't have a match in the primary table, it won't show up in the results.
- To combine three or more tables together, chain together multiple Combine Tables steps to merge your data.
- If we need multiple rules to determine a match, by default we'll determine row matches if any rules apply. We can change this so that rows match only when all of the rules apply. Notice that the word "or" changes to "and" when we update this setting.
Here are a few common questions that come up around the Combine tables steps and ways to troubleshoot them:
Issue: I’m seeing my rows multiplied in the results section of the Combine tables step.
Solution: It’s possible that you are matching on data that is duplicated in one or both of your sources. Check your inputs to make sure the selected matching column have unique data in both inputs. If you don’t have a column with unique data, you may need to choose multiple columns to match on to make sure you’re not duplicating your rows.
Here is a quick video walking through this process:
Issue: I’m seeing a Missing columns error in the Combine tables step.
Solution: This can happen when you make changes to one of your inputs after you’ve already selected the setting in your Combine tables step. You’ll need to either reselect your matching columns in the step or pull in a new Combine tables step.