
Documentation for `join` Logic

Purpose

This node in the Vantage analytics & data platform is designed to combine datasets based on common attributes, allowing users to merge data streams from various sources for integrated analysis. Its primary function is to match and merge records from a left dataset with those from a right dataset, using specified keys to find corresponding rows.

How It Works

The node takes parameters that identify the datasets and the keys used for the join operation. It primarily performs inner joins: an output record is produced only when a matching key is found in both datasets, and records without a match are omitted from the result.

Input Parameters:
- rightDataset: an array of records that represents the right dataset.
- leftKey: a string that indicates the key in the left dataset used for matching.
- rightKey: a string that indicates the key in the right dataset used for matching.

Execution:
- The execute method takes the left dataset, iterates over the records of both datasets, and creates a new combined record for each pair whose key values match.
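The matching step described above can be sketched as a nested-loop inner join. This is only an illustration under stated assumptions, not the node's actual implementation: the function name `innerJoin`, the sample data, and the rule that right-hand fields overwrite left-hand fields on a name collision are all hypothetical.

```javascript
// Hypothetical sketch of a nested-loop inner join.
// innerJoin is an illustrative name, not part of the node's documented API.
function innerJoin(leftDataset, rightDataset, leftKey, rightKey) {
  const merged = [];
  for (const leftRecord of leftDataset) {
    for (const rightRecord of rightDataset) {
      // Only pairs with equal key values produce an output record (inner join).
      if (leftRecord[leftKey] === rightRecord[rightKey]) {
        // Assumption: on a field-name collision the right record's value wins.
        merged.push({ ...leftRecord, ...rightRecord });
      }
    }
  }
  return merged;
}

// Illustrative data (not from the documentation)
const orders = [
  { orderId: 1, sku: 'A' },
  { orderId: 2, sku: 'B' },
  { orderId: 3, sku: 'Z' } // no matching product, so it is dropped
];
const products = [
  { sku: 'A', title: 'Widget' },
  { sku: 'B', title: 'Gadget' }
];

const joined = innerJoin(orders, products, 'sku', 'sku');
console.log(joined);
// [ { orderId: 1, sku: 'A', title: 'Widget' },
//   { orderId: 2, sku: 'B', title: 'Gadget' } ]
```

A nested loop compares every left record with every right record, so its cost grows with the product of the two dataset sizes. The sketch also uses strict equality, which means the keys `'1'` and `1` would not match.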

Settings

1. rightDataset

Input Type: Array
Description: This setting holds the dataset that will be joined against the left dataset. It is expected to be an array of objects where each object represents a row of data.
Behavior: If a valid array is not provided, an error is thrown and the join operation does not proceed.
Default Value: None (required).

2. leftKey

Input Type: String
Description: This setting specifies the key from the left dataset which will be used to match with the corresponding key from the right dataset.
Behavior: If the key is invalid or left blank, an error is thrown during execution. The key determines which field of each left record is compared against the right dataset.
Default Value: None (required).

3. rightKey

Input Type: String
Description: This setting denotes the key from the right dataset used for matching against the key from the left dataset.
Behavior: As with leftKey, a blank or invalid key causes an error to be thrown. The join can only succeed when both keys are valid.
Default Value: None (required).
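
The validation behavior described for the three settings could look roughly like the following. This is a minimal sketch under assumptions: the helper names `validateJoinSettings` and `checkSettings`, the error type, and the error messages are hypothetical, and the node's real checks and wording may differ.

```javascript
// Hypothetical validation helper; names and messages are assumptions,
// not the node's actual error text.
function validateJoinSettings({ rightDataset, leftKey, rightKey }) {
  if (!Array.isArray(rightDataset)) {
    throw new TypeError('rightDataset must be an array of records');
  }
  // Both keys are required and must be non-blank strings.
  for (const [name, value] of Object.entries({ leftKey, rightKey })) {
    if (typeof value !== 'string' || value.trim() === '') {
      throw new TypeError(`${name} must be a non-empty string`);
    }
  }
}

// Small wrapper that reports the outcome instead of throwing.
function checkSettings(settings) {
  try {
    validateJoinSettings(settings);
    return 'ok';
  } catch (error) {
    return error.message;
  }
}

console.log(checkSettings({ rightDataset: [], leftKey: 'customerId', rightKey: 'customerId' }));
// ok
console.log(checkSettings({ rightDataset: null, leftKey: 'customerId', rightKey: 'customerId' }));
// rightDataset must be an array of records
console.log(checkSettings({ rightDataset: [], leftKey: '', rightKey: 'customerId' }));
// leftKey must be a non-empty string
```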

Expected Data

This node requires two datasets:

Left Dataset: An array of objects representing records to be joined, where each object should contain the leftKey.
Right Dataset: An array of objects representing the records to join with the left dataset, containing the rightKey.

Both datasets must be arrays of objects, and each record should contain its respective key, for the join to proceed without errors.
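
One way to check a dataset against these expectations before running the join is to look for records that are not objects or that lack the join key. The helper below is a hypothetical pre-check written for illustration, not something the node is documented to provide.

```javascript
// Hypothetical pre-check: returns the indexes of records that are not
// objects or that do not contain the given key.
function findRecordsMissingKey(dataset, key) {
  const missing = [];
  dataset.forEach((record, index) => {
    const isObject = record !== null && typeof record === 'object';
    if (!isObject || !(key in record)) {
      missing.push(index);
    }
  });
  return missing;
}

// Illustrative left dataset with two problem rows
const leftRows = [
  { customerId: 'C001', amount: 250 },
  { amount: 90 }, // missing customerId
  null            // not a record at all
];

const missingIndexes = findRecordsMissingKey(leftRows, 'customerId');
console.log(missingIndexes);
// [ 1, 2 ]
```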

Use Cases & Examples

Use Case 1: Customer Data Integration

A company may wish to analyze its sales data (left dataset) alongside customer information (right dataset) to assess the performance of sales across different customer demographics. By using a join on customer ID, the company can enrich sales data with relevant customer attributes.

Use Case 2: Historical Data Analysis

Organization A needs to merge its historical sales records (left dataset) with the corresponding product information (right dataset) to generate comprehensive reports that analyze product sales performance over time.

Use Case 3: Survey and Response Matching

An analytics team may receive survey responses (the left dataset) and wish to combine these with demographic data (the right dataset) to analyze feedback in contextual categories, utilizing respondent IDs as keys.

Configuration Example

Business Use Case: Customer Sales Analysis

To analyze customer sales data effectively, we want to join the sales dataset with the customers dataset.

Sample Datasets:

javascript

// Sales Data (Left Dataset)
const salesData = [
  { saleId: 1, customerId: 'C001', amount: 250 },
  { saleId: 2, customerId: 'C002', amount: 150 },
  { saleId: 3, customerId: 'C001', amount: 200 }
];

// Customers Data (Right Dataset)
const customersData = [
  { customerId: 'C001', name: 'Alice' },
  { customerId: 'C002', name: 'Bob' },
  { customerId: 'C003', name: 'Charlie' }
];

Configuration:

javascript

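// The left dataset is not one of the node's settings; it is passed to execute() below.
const joinNode = createJoinNode({
  rightDataset: customersData,
  leftKey: 'customerId',
  rightKey: 'customerId'
});

// Execute the join, passing the sales data as the left dataset
const mergedData = joinNode.execute(salesData);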
console.log(mergedData);

Result:

The output will be: