Projection

Overview

The projection logic in the Vantage analytics and data platform transforms datasets by selecting specific fields from the input data and, optionally, renaming them. This lets users shape data so that the output contains only the relevant fields, under the names they need.

Settings

The projection logic is configured through an object with two settings. Each one is described below:

1. fields

Input Type: Array of strings
Description: The fields setting lists the fields from the incoming data to include in the output. If this array is empty, every field is kept and the data passes through unaltered.
Default Value: [] (empty array)

2. rename

Input Type: Object (key-value pairs)
Description: The rename setting maps original field names to new field names. If a field listed in the fields array has an entry in this object, the output uses the new name in place of the original; fields without an entry keep their original names. Because an empty fields array returns the data unaltered, rename takes effect only for fields listed in fields.
Default Value: {} (empty object)
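
As a rough illustration of the two defaults above, the sketch below fills in any setting that is missing from a partial configuration. The helper name normalizeProjectionSettings is hypothetical and is not part of the Vantage API; it only mirrors the documented default values.

```javascript
// Hypothetical helper (an assumption for illustration, not Vantage platform code):
// applies the documented defaults to a partial settings object.
function normalizeProjectionSettings(settings = {}) {
  const { fields, rename } = settings;
  const isPlainMapping =
    rename !== null && typeof rename === 'object' && !Array.isArray(rename);
  return {
    // Documented default: [] (empty array), meaning every field is kept.
    fields: Array.isArray(fields) ? [...fields] : [],
    // Documented default: {} (empty object), meaning nothing is renamed.
    rename: isPlainMapping ? { ...rename } : {},
  };
}

// With no arguments, both documented defaults are returned.
console.log(normalizeProjectionSettings());

// A partial configuration keeps its own fields and gets the default rename.
console.log(normalizeProjectionSettings({ fields: ['email'] }));
```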

How It Works

Creating the node with these settings returns a projection node that processes incoming data arrays. Here’s an overview of its operation:

Validation: The node first checks whether the input data is an array. If it is not, the node returns an empty array.
Field Mapping: If the fields array is empty, the node returns the original data unaltered. If fields are specified, it builds a new array by mapping over each row of the input data.
Field Extraction & Renaming: For each field listed in fields, the node checks the rename object for a corresponding entry. If one exists, the output uses the new name; otherwise it uses the original name. Each output object contains only the selected fields and their values.

The execute method encapsulates the core logic for transforming the input data according to the settings defined during node creation.
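
The steps above can be sketched in a few lines of JavaScript. This is a minimal sketch written from the description on this page, not the Vantage source code: the function name createProjectionNodeSketch is hypothetical, and two behaviors are assumptions because the page does not specify them, namely that every row is an object and that a listed field missing from a row is emitted with the value undefined.

```javascript
// Illustrative sketch only (hypothetical name, not the Vantage implementation).
function createProjectionNodeSketch({ fields = [], rename = {} } = {}) {
  return {
    execute(data) {
      // Validation: anything other than an array yields an empty array.
      if (!Array.isArray(data)) return [];

      // Field Mapping: an empty fields array returns the data unaltered.
      if (fields.length === 0) return data;

      // Field Extraction & Renaming: build one output object per row.
      return data.map((row) => {
        const output = {};
        for (const field of fields) {
          // Use the new name when rename has an entry for this field.
          const hasRename = Object.prototype.hasOwnProperty.call(rename, field);
          const outputKey = hasRename ? rename[field] : field;
          // Assumption: a field missing from the row is copied as undefined.
          output[outputKey] = row[field];
        }
        return output;
      });
    },
  };
}

const sketchNode = createProjectionNodeSketch({
  fields: ['email', 'purchaseAmount'],
  rename: { purchaseAmount: 'totalSpent' },
});

// Logs a single row containing only email and totalSpent.
console.log(
  sketchNode.execute([
    { email: 'user1@example.com', purchaseAmount: 100, otherField: 'abc' },
  ])
);
```

Returning the input array itself when fields is empty matches the "unaltered" wording above, but it also means the caller receives the same array reference rather than a copy.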

Expected Data

Input Data Type: An array of objects, where each object represents a row of data containing key-value pairs.
Field Names: The keys within the objects must match those specified in the fields array to properly extract values.
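
Because extraction depends on the keys matching, it can help to check the input before running a projection. The sketch below is one possible pre-flight check; findMissingFields is a hypothetical helper, not part of the Vantage API, and treating a non-array input as missing every field is an assumption made for this example.

```javascript
// Hypothetical pre-flight check (not part of the Vantage API).
// Returns the configured field names that are absent from at least one row.
function findMissingFields(rows, fields) {
  // Assumption: input that is not an array is reported as missing every field.
  if (!Array.isArray(rows)) return [...fields];

  const missing = new Set();
  for (const row of rows) {
    const isObject = row !== null && typeof row === 'object';
    for (const field of fields) {
      const hasField =
        isObject && Object.prototype.hasOwnProperty.call(row, field);
      if (!hasField) missing.add(field);
    }
  }
  return [...missing];
}

const sampleRows = [
  { email: 'user1@example.com', purchaseAmount: 100 },
  { email: 'user2@example.com' }, // purchaseAmount is absent here
];

// Reports purchaseAmount, because the second row does not contain it.
console.log(findMissingFields(sampleRows, ['email', 'purchaseAmount']));
```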

AI Integrations

The current implementation of the projection logic does not integrate directly with AI features. It can, however, serve as a preprocessing step for AI models: projecting and renaming fields reduces a dataset to the columns relevant to a machine learning training or inference task, which can improve the quality of the input those models receive.

Billing Impacts

The projection logic does not incur additional costs by itself. However, all data processing, including projections, counts toward resource utilization on the Vantage platform. Extensive data transformations may increase usage metrics and, depending on the subscription plan, lead to higher billing.

Use Cases & Examples

Use Cases

Data Standardization: A marketing analytics team needs to prepare a clean dataset from various sources for analysis. They can use the projection logic to extract relevant fields and rename them for consistency across reports.
Dashboard Creation: A data visualization team needs to create an executive dashboard that only showcases specific KPIs. The projection component can help by allowing them to select only these KPIs and provide clear labels.
Data Pipeline Preparation: A data engineering team is collecting data from diverse sources and needs to preprocess it for further transformations. A projection can reduce each row to the fields that later steps need, so downstream processing handles less data.

Example Configuration

Use Case: Data Standardization for Marketing Reports

Suppose a marketing analytics team needs to prepare a dataset that includes only the email and purchase amount fields from an input dataset, while renaming "purchaseAmount" to "totalSpent" for clarity.

Sample Configuration:

javascript

const projectionNode = createProjectionNode({
  fields: ['email', 'purchaseAmount'],
  rename: {
    purchaseAmount: 'totalSpent',
  },
});

When executed with the following input data:

javascript

const inputData = [
  { email: 'user1@example.com', purchaseAmount: 100, otherField: 'abc' },
  { email: 'user2@example.com', purchaseAmount: 150, otherField: 'def' },
];

The output will be:

javascript

[
  { email: 'user1@example.com', totalSpent: 100 },
  { email: 'user2@example.com', totalSpent: 150 },
]

This output format is immediately ready for analysis or reporting, showcasing only the relevant fields with clear naming.
