Counting by group
The Count by group step summarizes your data by counting how many rows share each value in the column(s) you choose. It's particularly useful for flagging duplicate values in a dataset.
How to use the step
- Drag a Count by group step onto the canvas.
- Select the column(s) whose unique values you want to count.
- Example: to count how many orders were placed each month, select only the Month column.
- To count how many orders were placed each month in each state, select both Month and State.
Pro tip
- To alert your team to duplicates in your dataset, follow the Count by group step with the Filter rows and Email a CSV attachment steps.
To learn more, check out our support docs.
Using the Count by Group Step in Parabola
The Count by Group step is a powerful tool for summarizing data, helping you quickly analyze trends and detect duplicates. If you haven’t already, we recommend watching the Sum by Group video in Parabola University to get familiar with grouping concepts before diving into this step.
How the Count by Group Step Works
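This step counts the number of times each unique value appears in a selected column. If you select more than one column, it counts each unique combination of values across those columns.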
Example 1: Analyzing Seasonality by Month
Let’s say we want to analyze seasonality by counting the number of orders per month:
- Select the column to group by (e.g., Month).
- Set the count column name (e.g., Order Count).
- Click Show Updated Results, and Parabola will count the occurrences of each month in the dataset.
This instantly provides a month-by-month breakdown of order volume.
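For readers who think in code, the grouping logic in Example 1 can be sketched in plain Python. This is only an illustration of the idea, not how Parabola implements the step: the sample rows, the column names, and the count_by_group helper are all hypothetical.

```python
from collections import Counter

# Hypothetical sample rows; column names are illustrative only.
orders = [
    {"Order ID": 1001, "Month": "January", "State": "CA"},
    {"Order ID": 1002, "Month": "January", "State": "NY"},
    {"Order ID": 1003, "Month": "February", "State": "CA"},
    {"Order ID": 1004, "Month": "January", "State": "CA"},
]


def count_by_group(rows, group_columns, count_column="Count"):
    """Return one row per unique combination of group_columns, with its row count."""
    # Build a tuple key from the grouping columns of each row and tally the keys.
    counts = Counter(tuple(row[col] for col in group_columns) for row in rows)
    # Turn each (key, count) pair back into a row-shaped dictionary.
    return [
        {**dict(zip(group_columns, key)), count_column: n}
        for key, n in counts.items()
    ]


# Group by Month and name the count column "Order Count", as in the example.
monthly = count_by_group(orders, ["Month"], count_column="Order Count")
for row in monthly:
    print(row)
```

Each output row holds one month and the number of orders in it, which mirrors the month-by-month breakdown described above.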
Example 2: Counting Orders by State & Month
To take this further, let’s analyze which states purchase more during specific times of the year:
- Add the State column as an additional grouping field.
- Click Show Updated Results to see a breakdown by both Month and State.
This insight can help determine where to push ads or adjust marketing efforts based on seasonal demand.
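Grouping by two columns amounts to counting each unique pair of values. A minimal Python sketch of that idea follows; the (Month, State) pairs are made-up sample data, and this is an analogy rather than Parabola's own implementation.

```python
from collections import Counter

# Hypothetical (Month, State) pairs, one per order.
orders = [
    ("January", "CA"),
    ("January", "NY"),
    ("January", "CA"),
    ("February", "CA"),
    ("February", "CA"),
]

# Counting whole tuples groups by both columns at once.
month_state_counts = Counter(orders)

for (month, state), order_count in month_state_counts.items():
    print(f"{month:<10} {state:<3} {order_count}")
```

Adding a grouping column can only split existing groups further, so the result has at least as many rows as the single-column count.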
Example 3: Detecting Duplicates in a Dataset
If you suspect duplicate records but don’t want to automatically remove them, you can use Count by Group as an alert system:
- Group by a unique identifier (e.g., Order ID, Email Address).
- If any count is greater than 1, that value appears more than once, so duplicates exist.
- Use a Filter Rows step to isolate rows where Count > 1.
- Optionally, set up an email alert to notify the team when duplicates are detected.
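The duplicate check above boils down to "count, then keep anything counted more than once." Here is a small Python sketch of that logic, assuming a made-up list of order IDs; the print call stands in for whatever alert you would actually send.

```python
from collections import Counter

# Hypothetical Order ID column; in a clean dataset every ID would appear once.
order_ids = ["A-100", "A-101", "A-100", "A-102", "A-101", "A-100"]

# Count by group: how many rows share each Order ID.
counts = Counter(order_ids)

# Filter step: keep only the IDs whose count is greater than 1.
duplicates = {order_id: n for order_id, n in counts.items() if n > 1}

# Alert step (placeholder): report only when duplicates were found.
if duplicates:
    print(f"Found {len(duplicates)} duplicated Order IDs: {duplicates}")
```

Nothing is removed from the original data here; the sketch only reports which values repeat, matching the alert-only approach described above.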
Try It Yourself!
Experiment with the Count by Group step in the building challenge below, and let us know if you have any questions! 🚀