Aggregation Logic
Purpose
The Aggregation Logic in Vantage is designed to group input data by specified columns and apply various aggregation functions, such as count, sum, average, minimum, maximum, and others. This functionality is essential for users who need to summarize datasets and derive meaningful insights from raw data, helping organizations to make data-driven decisions effectively.
Settings
The Aggregation Logic has several configurable settings. Each setting plays a vital role in configuring the logic's behavior, and understanding them is key to leveraging its full potential.
1. groupBy
- Input Type: Array of Strings or Objects
- Description: This setting specifies the columns that the data will be grouped by. Each entry can either be a string, indicating the column name, or an object specifying the column and a datePeriod (for date-based aggregations). Adjusting this setting alters which categories the data is segmented into.
- Default Value:
[](empty array)
2. aggregations
- Input Type: Array of Objects
- Description: This holds the specifications for which aggregations to apply to the grouped data. Each object must contain a column to aggregate, the aggregation function to use, and an alias for the result. Modifying these parameters changes what metrics are calculated and how they are represented in the output.
- Default Value:
[](empty array)
Aggregation Object Structure
- Properties:
column: (String) The name of the column to apply the aggregation function on.function: (String) Specifies the aggregation method to use, such as "sum", "count", "avg", etc.alias: (String, optional) A custom name for the aggregated output field.
3. pivotBy
- Input Type: Object
- Description: This optional setting allows the transformation of date periods into columns, enabling a pivot table-like structure. It includes a column name for pivoting and a date period (e.g., month, week, year). If configured, the output data structure will feature date periods as distinct columns.
- Default Value:
undefined
Pivot Object Structure
- Properties:
column: (String) The column that contains date values for the pivot.datePeriod: (String) Specifies the granularity of the pivot, with options including "day", "week", "month", or "year".
How It Works
The Aggregation Logic processes input data by first checking its format. It can accept input data as an array or as an object that wraps an array in a property named data. If the input data is in an incompatible format or empty, the function returns an empty dataset.
The logic then follows these steps:
- Normalization: The
groupByentries are normalized to objects containing the column name and optional date period. - Aggregation Execution: Based on the presence of
groupBy,aggregations, andpivotBy, the function decides whether to pass the data through unchanged, perform aggregations on the entire dataset, or group the data based on the specified columns. - Group Formation: When grouping data, it organizes it into a
Map, where keys represent composite keys generated from the grouping columns. - Aggregation Application: For each group, it calculates the specified aggregations and compiles the result into the output format.
- Pivoting (if applicable): If a pivot is specified, it transforms the grouping data such that date periods become separate columns.
Expected Data
The Aggregation Logic expects structured data that includes the columns defined in the groupBy, aggregations, and (if used) the pivotBy settings. This data should ideally be an array of objects where each object represents a row, ensuring that each column can be accessed by name.
Example Input Data Structure
[
{"id": 1, "sales": 100, "price": 20, "order_date": "2023-01-15"},
{"id": 2, "sales": 200, "price": 25, "order_date": "2023-01-15"},
{"id": 3, "sales": 150, "price": 30, "order_date": "2023-02-15"}
]Use Cases & Examples
Use Case 1: Sales Performance Reporting
A retail company wants to analyze its sales performance by month and product category. By grouping data by product categories and month, they can aggregate crucial metrics like total sales and average price.
Use Case 2: Website Visitor Analysis
A web analytics team needs to evaluate visitor counts over differing traffic sources for each week. By applying aggregations over a date period, they can derive insights on traffic trends and optimize their marketing strategies.
Example Configuration
For a practical implementation, suppose we want to derive total sales (sum), average price (avg), and count of distinct visitor IDs (count) grouped by order_date by month.
{
"groupBy": ["order_date"],
"aggregations": [
{ "column": "sales", "function": "sum", "alias": "total_sales" },
{ "column": "price", "function": "avg", "alias": "average_price" },
{ "column": "id", "function": "count", "alias": "count_visitors" }
],
"pivotBy": {
"column": "order_date",
"datePeriod": "month"
}
}Expected Output
Given the above configuration and input data, the output will include aggregated results showcasing total sales, average price, and count of visitors for each month, where months are specified as column headers.
{
"output1": {
"data": [
{ "order_date": "2023-01", "total_sales": 300, "average_price": 22.5, "count_visitors": 2 },
{ "order_date": "2023-02", "total_sales": 150, "average_price": 30, "count_visitors": 1 }
]
}
}AI Integrations
The Aggregation Logic can seamlessly integrate with AI tools within the Vantage platform, enabling automated insights and advanced analytical capability. Users can utilize AI-driven forecasts based on past aggregation trends to predict future metrics.
Billing Impact
Using the Aggregation Logic may have implications for billing based on the volume of data processed and the complexity of aggregations performed. Users are encouraged to review their pricing tier and monitor usage to ensure it aligns with their budget and operational goals. Aggregation operations typically consume more processing power, which may lead to increased billing if the volume of data is significant.
The Aggregation Logic in Vantage serves as a powerful tool for data processing and summarization, enabling organizations to extract meaningful insights effortlessly.