WorkflowNode Documentation
Overview
The WorkflowNode is a foundational component of Vantage's analytics and data platform. It acts as a modular building block within workflows, allowing users to define operations (e.g., transformations, aggregations, filtering) through visual programming. Each WorkflowNode represents one operation type and can be connected to other nodes to build complex data processing flows.
Purpose
The primary purpose of the WorkflowNode is to encapsulate functionality related to analytical operations or data modifications that users can configure according to their requirements. By providing a user-friendly interface to manage and manipulate data, it enhances the efficiency of analytical workflows.
Settings
Each WorkflowNode has several settings that dictate its behavior and appearance. Below is a comprehensive breakdown of each setting available.
General Settings
- nodeType:
  - Type: String
  - Description: Specifies the type of the workflow node (e.g., 'queryoperators/filter', 'dbconnectors/dbQuery'). This setting determines what functionality the node provides within the workflow.
  - Default Value: None (must be explicitly set).
- config:
  - Type: Object
  - Description: A flexible configuration object that contains specific parameters relevant to the node type. The structure of this object varies with the defined nodeType and is crucial for properly initializing node parameters.
  - Default Value: {} (empty object).
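As a rough illustration of the two general settings, the sketch below models a node as a typed object. The names WorkflowNodeSettings and makeWorkflowNode are hypothetical and are not part of Vantage; only nodeType, config, and their defaults come from the list above.

```typescript
// Hypothetical shape: only nodeType and config are taken from the settings above.
interface WorkflowNodeSettings {
  nodeType: string;                // e.g. "queryoperators/filter"; has no default
  config: Record<string, unknown>; // node-type-specific parameters
}

// Hypothetical helper: applies the documented default for config ({})
// and rejects an empty nodeType, because nodeType must be explicitly set.
function makeWorkflowNode(
  nodeType: string,
  config: Record<string, unknown> = {},
): WorkflowNodeSettings {
  if (nodeType.trim() === "") {
    throw new Error("nodeType must be explicitly set");
  }
  return { nodeType, config };
}

// A filter node created without a config falls back to the empty object.
const filterNode = makeWorkflowNode("queryoperators/filter");
```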
Component-specific Settings
Depending on the nodeType, certain configurations can be defined. Here are the major components that may affect node behavior.
Query Operations
- Aggregation Type (queryoperators/aggregation):
  - groupBy: Array of strings.
    - Description: Defines the fields by which the data will be grouped during aggregation.
    - Default Value: [] (empty array).
  - aggregations: Array of objects.
    - Description: Lists fields/operations to apply for aggregating data (e.g., sum, average).
    - Default Value: [] (empty array).
- Computed Column (queryoperators/computedColumn):
  - columns: Array of objects.
    - Description: Contains columns to be computed based on data calculations.
    - Default Value: [] (empty array).
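To make the aggregation defaults concrete, here is a minimal sketch of a typed config with the documented empty-array defaults filled in. The helper withAggregationDefaults is hypothetical, and the { field, operation } entry shape is assumed from the example configuration later in this document.

```typescript
// Entry shape assumed from the example configuration in this document.
interface AggregationEntry {
  field: string;
  operation: string;
}

interface AggregationConfig {
  groupBy: string[];
  aggregations: AggregationEntry[];
}

// Hypothetical helper: fills any missing setting with its documented default ([]).
function withAggregationDefaults(
  partial: Partial<AggregationConfig> = {},
): AggregationConfig {
  return {
    groupBy: partial.groupBy === undefined ? [] : partial.groupBy,
    aggregations: partial.aggregations === undefined ? [] : partial.aggregations,
  };
}

// Only groupBy is supplied; aggregations falls back to the empty array.
const regionOnly = withAggregationDefaults({ groupBy: ["region"] });
```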
Deduplication Operations
- Deduplicate Key Columns (queryoperators/deduplicate):
  - keyColumns: Array of strings.
    - Description: Specifies the key columns to identify duplicates in the dataset.
    - Default Value: [] (empty array).
  - keepStrategy: String.
    - Description: Defines which duplicate to keep (FIRST, LAST, etc.).
    - Default Value: FIRST.
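The interaction between keyColumns and keepStrategy can be sketched as follows. This is an illustration under stated assumptions, not Vantage's implementation: the function name deduplicateRows is hypothetical, only the FIRST and LAST strategies are covered, and an empty keyColumns list is assumed to leave the data unchanged.

```typescript
type DedupRow = Record<string, unknown>;
type KeepStrategy = "FIRST" | "LAST";

// Hypothetical sketch of deduplication driven by keyColumns and keepStrategy.
function deduplicateRows(
  rows: DedupRow[],
  keyColumns: string[] = [],
  keepStrategy: KeepStrategy = "FIRST",
): DedupRow[] {
  // Assumption: with no key columns there is nothing to compare, so keep every row.
  if (keyColumns.length === 0) {
    return rows.slice();
  }
  const kept = new Map<string, DedupRow>();
  for (const row of rows) {
    // Rows with equal values in all key columns share the same key.
    const key = JSON.stringify(keyColumns.map((column) => row[column]));
    // FIRST keeps the earliest row per key; LAST overwrites it with later rows.
    if (keepStrategy === "LAST" || !kept.has(key)) {
      kept.set(key, row);
    }
  }
  // A Map preserves the position at which each key was first inserted.
  return Array.from(kept.values());
}
```

With LAST, the surviving row appears at the position where its key first occurred, which is a property of this sketch rather than documented behavior.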
Union Operations
- Union Settings (queryoperators/union):
  - unionMode: String.
    - Description: Determines whether the union keeps only distinct rows or retains duplicates.
    - Default Value: 'distinct'.
  - sortColumn: String.
    - Description: The column to sort the final unioned dataset by.
    - Default Value: undefined.
  - columnMappings: Object.
    - Description: Maps columns from different datasets into a unified schema.
    - Default Value: {}.
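One way to picture how the three union settings combine is the sketch below. It rests on several assumptions that the settings list does not confirm: columnMappings is read as source column name to unified column name, any unionMode other than 'distinct' keeps duplicates, and sorting is ascending. The names unionRows and compareValues are hypothetical.

```typescript
type UnionRow = Record<string, unknown>;

interface UnionConfig {
  unionMode: string;                      // documented default: "distinct"
  sortColumn?: string;                    // documented default: undefined
  columnMappings: Record<string, string>; // documented default: {}
}

// Numbers compare numerically; everything else compares as text.
function compareValues(a: unknown, b: unknown): number {
  if (typeof a === "number" && typeof b === "number") return a - b;
  return String(a).localeCompare(String(b));
}

function unionRows(inputs: UnionRow[][], config: Partial<UnionConfig> = {}): UnionRow[] {
  const unionMode = config.unionMode === undefined ? "distinct" : config.unionMode;
  const columnMappings = config.columnMappings === undefined ? {} : config.columnMappings;
  const sortColumn = config.sortColumn;
  const combined: UnionRow[] = [];
  const seen = new Set<string>();
  for (const dataset of inputs) {
    for (const row of dataset) {
      // Assumption: a mapping renames a source column to its unified name.
      const unified: UnionRow = {};
      for (const column of Object.keys(row)) {
        const target = columnMappings[column];
        unified[target === undefined ? column : target] = row[column];
      }
      // Signature is independent of key order, so equal rows always match.
      const signature = JSON.stringify(
        Object.keys(unified).sort().map((key) => [key, unified[key]]),
      );
      if (unionMode === "distinct" && seen.has(signature)) continue;
      seen.add(signature);
      combined.push(unified);
    }
  }
  if (sortColumn !== undefined) {
    combined.sort((left, right) => compareValues(left[sortColumn], right[sortColumn]));
  }
  return combined;
}
```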
Flow Control Operations
- Multi Conditional (flowcontrol/multiConditional):
  - rules: Array of objects.
    - Description: Contains rules for conditionally branching the workflow.
    - Default Value: [] (empty array).
  - evaluateMode: String.
    - Description: Determines how rules are evaluated (e.g., FIRST_ROW).
    - Default Value: 'firstRow'.
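The branching behavior might look something like the sketch below. The source only states that rules is an array of objects, so the rule shape used here (column, equals, branch) is entirely hypothetical, as are the function name pickBranch and the reading of 'firstRow' as "test the rules against the first row and take the first match".

```typescript
type FlowRow = Record<string, unknown>;

// Hypothetical rule shape; the real structure of a rule is not documented here.
interface BranchRule {
  column: string;  // column to inspect
  equals: unknown; // value that triggers the rule
  branch: string;  // name of the branch to follow
}

// Returns the branch of the first matching rule, or undefined when nothing matches.
function pickBranch(
  rows: FlowRow[],
  rules: BranchRule[] = [],
  evaluateMode: string = "firstRow",
): string | undefined {
  if (evaluateMode !== "firstRow") {
    throw new Error(`evaluateMode "${evaluateMode}" is not covered by this sketch`);
  }
  // Assumption: 'firstRow' means only the first row of the input is tested.
  const firstRow = rows[0];
  if (firstRow === undefined) return undefined;
  const matched = rules.find((rule) => firstRow[rule.column] === rule.equals);
  return matched === undefined ? undefined : matched.branch;
}
```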
AI Integrations
For nodes integrated with AI functions (e.g., enrichment, compliance checks), additional settings are available:
- focusColumn:
  - Type: String
  - Description: Column targeted for AI operations.
  - Default Value: undefined.
- batchSize:
  - Type: Numeric
  - Description: Defines the batch size for processing data through AI models.
  - Default Value: 100.
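As a sketch of what batchSize implies, the code below extracts the focusColumn values and splits them into groups of at most batchSize items. The function names are hypothetical, and how Vantage actually hands batches to an AI model is not described in this document.

```typescript
// Splits items into consecutive batches of at most batchSize elements.
function toBatches<T>(items: T[], batchSize: number = 100): T[][] {
  if (!Number.isInteger(batchSize) || batchSize < 1) {
    throw new Error("batchSize must be a positive integer");
  }
  const batches: T[][] = [];
  for (let start = 0; start < items.length; start += batchSize) {
    batches.push(items.slice(start, start + batchSize));
  }
  return batches;
}

// Hypothetical helper: picks the focusColumn value from each row, then batches.
function focusBatches(
  rows: Record<string, unknown>[],
  focusColumn: string,
  batchSize: number = 100,
): unknown[][] {
  return toBatches(rows.map((row) => row[focusColumn]), batchSize);
}
```

With the default batchSize of 100, an input of 250 rows yields batches of 100, 100, and 50 values.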
Additional Settings and Data Expectations
Each node type can have additional properties. The configuration generally depends on the specific tasks the node is designed to perform.
For example:
- Nodes dealing with databases might have settings for credentialRef, table, operation, and query.
- Nodes like ai/imageAnalysis or ai/textExtraction might also require specific configurations depending on the type of analysis to be performed.
Use Cases & Examples
Use Cases
- Data Aggregation: A company uses Vantage to analyze sales data and needs to aggregate total sales by region and quarter.
- Automated Compliance Checking: A financial institution automates its transaction compliance monitoring to adhere to regulatory requirements using a workflow that incorporates conditional logic and AI.
- AI-Enriched Messaging: A marketing team analyzes customer feedback using AI to optimize their messaging strategy based on sentiment analysis.
Example Configuration
Use Case: Data Aggregation
To configure a WorkflowNode for aggregating sales data by region and quarter, you might set up the following configuration:
{
  "nodeType": "queryoperators/aggregation",
  "config": {
    "groupBy": ["region", "quarter"],
    "aggregations": [
      {
        "field": "totalSales",
        "operation": "SUM"
      },
      {
        "field": "numberOfSales",
        "operation": "COUNT"
      }
    ]
  }
}

In this configuration:
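- The nodeType is set to 'queryoperators/aggregation'.
- The groupBy setting specifies that data should be grouped by both region and quarter, so that aggregate operations are computed for each combination.
- The aggregations setting applies SUM to totalSales and COUNT to numberOfSales within each group.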
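To show what such a configuration would compute, the sketch below applies the same groupBy and aggregations values to a few made-up rows. Everything apart from the config values is an assumption: the sample data is invented, the function aggregateRows is hypothetical, COUNT is taken to count non-null values of the field, and the output column names such as SUM(totalSales) are chosen only for this illustration.

```typescript
type AggRow = Record<string, unknown>;

interface AggregationSpec {
  field: string;
  operation: "SUM" | "COUNT";
}

// Hypothetical sketch: groups rows by the groupBy columns and folds each aggregation.
function aggregateRows(
  rows: AggRow[],
  groupBy: string[],
  aggregations: AggregationSpec[],
): AggRow[] {
  const groups = new Map<string, AggRow>();
  for (const row of rows) {
    const key = JSON.stringify(groupBy.map((column) => row[column]));
    let result = groups.get(key);
    if (result === undefined) {
      // First row of a group: copy the grouping columns and start counters at 0.
      result = {};
      for (const column of groupBy) result[column] = row[column];
      for (const spec of aggregations) result[`${spec.operation}(${spec.field})`] = 0;
      groups.set(key, result);
    }
    for (const spec of aggregations) {
      const outputName = `${spec.operation}(${spec.field})`;
      const current = Number(result[outputName]);
      const value = row[spec.field];
      // Assumption: COUNT counts rows where the field is present and not null.
      result[outputName] = spec.operation === "SUM"
        ? current + Number(value)
        : current + (value === undefined || value === null ? 0 : 1);
    }
  }
  return Array.from(groups.values());
}

// Invented sample rows, for illustration only.
const sampleSales: AggRow[] = [
  { region: "EU", quarter: "Q1", totalSales: 100, numberOfSales: 3 },
  { region: "EU", quarter: "Q1", totalSales: 50, numberOfSales: 1 },
  { region: "US", quarter: "Q1", totalSales: 70, numberOfSales: 2 },
];

const salesSummary = aggregateRows(sampleSales, ["region", "quarter"], [
  { field: "totalSales", operation: "SUM" },
  { field: "numberOfSales", operation: "COUNT" },
]);
```

For these rows, the EU/Q1 group ends up with a sum of 150 over two counted rows, and the US/Q1 group with a sum of 70 over one.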
By combining and configuring nodes in this way, businesses can build analytics workflows in Vantage that are tailored to their data and support their decision-making.