Updated Mar 2, 2026

dataClassify Documentation

Purpose

The dataClassify logic assigns a category label to each row of data based on conditional rules. It evaluates an ordered list of predefined conditions against columns of the input data, and the first rule whose condition matches assigns its label to the output column. If no rule matches a row, the logic assigns a default label.

Expected Data

The dataClassify logic expects an array of objects as input, with each object representing one data row. Rows may contain any fields, but the column names referenced by the rules in the configuration must correspond to fields in those rows.
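As a minimal sketch of the input shape described above, the snippet below models a row as a plain object and checks which rule columns never appear in the data. The `Row` type and the `missingColumns` helper are hypothetical names introduced for illustration; they are not part of dataClassify.

```typescript
// A data row is a plain object; field names and value types vary by dataset.
type Row = Record<string, unknown>;

// Hypothetical helper: returns the rule columns that appear in none of the rows.
function missingColumns(rows: Row[], ruleColumns: string[]): string[] {
  const present = new Set<string>();
  for (const row of rows) {
    for (const key of Object.keys(row)) present.add(key);
  }
  return ruleColumns.filter((col) => !present.has(col));
}

const rows: Row[] = [
  { name: "Acme", revenue: 12000 },
  { name: "Globex", revenue: 450, region: "EU" },
];

// "revenue" exists in the rows, "industry" does not.
console.log(missingColumns(rows, ["revenue", "industry"])); // ["industry"]
```

Running a check like this before classification can help explain why every row ends up with the default label when a rule refers to a misspelled or absent column.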

Settings

The dataClassify logic accepts the following configuration settings:

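1. outputColumn: the name of the column in which the assigned label is written.

2. defaultLabel: the label assigned to a row when none of the rules match.

3. rules: an ordered list of rule objects, each specifying a column, an operator, a value, and a label. Rules are evaluated in order, and the first match determines the label.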

4. Conditions within Rules

Each rule's operator property selects the comparison applied to the value of the rule's column. The example configuration in this document uses greaterThan.

Operators provide a range of comparisons for categorizing data, and their effects depend on the data type and value of the column being evaluated.
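To illustrate why data type matters, here is a hedged sketch of how a greaterThan comparison might behave when the rule value is a string such as "10000". The coercion strategy shown (numeric comparison when both sides parse as numbers, string comparison otherwise) is an assumption for illustration, not documented dataClassify behavior.

```typescript
// Assumed behavior, not confirmed by the documentation: compare numerically
// when both sides parse as finite numbers, otherwise compare as strings.
function greaterThan(cell: unknown, ruleValue: string): boolean {
  if (cell === undefined || cell === null || cell === "") return false;
  const a = Number(cell);
  const b = Number(ruleValue);
  if (ruleValue !== "" && Number.isFinite(a) && Number.isFinite(b)) {
    return a > b;
  }
  return String(cell) > ruleValue;
}

console.log(greaterThan(9, "10000"));      // false: 9 is not above 10000
console.log(greaterThan("9", "10000"));    // false: both sides parse as numbers

// Without numeric coercion, strings compare character by character.
const left: string = "9";
const right: string = "10000";
console.log(left > right);                 // true: "9" sorts after "1"

console.log(greaterThan("beta", "alpha")); // true: falls back to string comparison
```

The third line shows the pitfall: a purely lexicographic comparison would treat "9" as greater than "10000", so it is worth confirming how the platform coerces values before relying on numeric thresholds stored as strings.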

How It Works

  1. The dataClassify function first unwraps the incoming data to ensure it is in the correct format.
  2. It then iterates over each data row to build the results.
  3. For each row, it evaluates the rules in order:
    • If a rule matches, its label is assigned to the outputColumn and no further rules are evaluated for that row.
    • If no rule matches, the defaultLabel is assigned to the outputColumn.
  4. The results are compiled into a new data structure and returned.

The evaluation of conditions uses a helper function, evaluateCondition, which checks the value of the specified column against the operator and rule value, executing the appropriate comparison logic.
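The steps above can be sketched roughly as follows. This is an illustrative reconstruction under stated assumptions, not the actual implementation: the `unwrap` helper and its handling of a `data` wrapper are hypothetical, only the greaterThan operator is sketched, and numeric coercion is assumed.

```typescript
type Row = Record<string, unknown>;

interface Rule {
  column: string;
  operator: string;
  value: string;
  label: string;
}

interface ClassifyConfig {
  outputColumn: string;
  defaultLabel: string;
  rules: Rule[];
}

// Hypothetical unwrap step: accept a bare array or an object with a `data` array.
function unwrap(input: unknown): Row[] {
  if (Array.isArray(input)) return input as Row[];
  if (typeof input === "object" && input !== null) {
    const data = (input as { data?: unknown }).data;
    if (Array.isArray(data)) return data as Row[];
  }
  return [];
}

// Only greaterThan is sketched here; numeric coercion is an assumption.
function evaluateCondition(row: Row, rule: Rule): boolean {
  const cell = row[rule.column];
  switch (rule.operator) {
    case "greaterThan":
      // A missing column yields NaN, which never compares as greater.
      return Number(cell) > Number(rule.value);
    default:
      return false; // unrecognized operators never match in this sketch
  }
}

function dataClassify(input: unknown, config: ClassifyConfig): Row[] {
  return unwrap(input).map((row) => {
    // find() stops at the first matching rule, which gives first-match-wins.
    const hit = config.rules.find((rule) => evaluateCondition(row, rule));
    return { ...row, [config.outputColumn]: hit ? hit.label : config.defaultLabel };
  });
}

const result = dataClassify(
  { data: [{ id: 1, score: 90 }, { id: 2, score: 40 }] },
  {
    outputColumn: "grade",
    defaultLabel: "fail",
    rules: [{ column: "score", operator: "greaterThan", value: "50", label: "pass" }],
  },
);
console.log(result);
```

In this sketch each input row is copied rather than mutated, which matches the description of results being compiled into a new data structure.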

AI Integrations

Currently, dataClassify does not include any direct AI integrations. It relies solely on the specified rules and conditions provided by the user to classify data based on set logic.

Billing Impacts

Using the dataClassify logic may incur costs based on the volume of data processed and the complexity of the configured rules. More numerous or complex rules, combined with large datasets, may increase processing time and resource use, potentially leading to higher billing if the platform uses a usage-based pricing model.

Use Cases & Examples

Use Case 1: Sales Categorization

A company wants to categorize its customer sales data into segments: "Enterprise", "Mid-Market", and "Other". By applying the dataClassify logic, sales representatives can better target their marketing efforts.

Use Case 2: Customer Segmentation

A marketing department uses the logic to classify customers based on their engagement levels with various products, allowing tailored advertising for different demographics.

Example Configuration

Use Case: A company wants to segment its customers based on their revenue.

Configuration Data:

json
{
  "outputColumn": "customerSegment",
  "defaultLabel": "Low Engagement",
  "rules": [
    { "column": "revenue", "operator": "greaterThan", "value": "10000", "label": "High Value" },
    { "column": "revenue", "operator": "greaterThan", "value": "1000", "label": "Medium Value" },
    { "column": "revenue", "operator": "greaterThan", "value": "100", "label": "Low Value" }
  ]
}

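This configuration segments customers by revenue into three value tiers: "High Value", "Medium Value", and "Low Value". Because rules are evaluated in order and the first match wins, the highest threshold is listed first. Assuming greaterThan is a strict comparison, any row with a revenue of 100 or less matches no rule and receives the default label "Low Engagement".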
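To make the outcome concrete, the sketch below applies the three rules from the configuration above to a few sample revenues. The `segmentFor` helper is hypothetical, and it assumes that greaterThan compares numerically and strictly.

```typescript
interface Rule {
  column: string;
  operator: string;
  value: string;
  label: string;
}

const rules: Rule[] = [
  { column: "revenue", operator: "greaterThan", value: "10000", label: "High Value" },
  { column: "revenue", operator: "greaterThan", value: "1000", label: "Medium Value" },
  { column: "revenue", operator: "greaterThan", value: "100", label: "Low Value" },
];

// First-match lookup, assuming greaterThan compares numerically and strictly.
function segmentFor(revenue: number, ordered: Rule[]): string {
  const hit = ordered.find((r) => revenue > Number(r.value));
  return hit ? hit.label : "Low Engagement";
}

for (const revenue of [15000, 10000, 500, 100, 50]) {
  console.log(revenue, segmentFor(revenue, rules));
}
// 15000 High Value
// 10000 Medium Value
// 500 Low Value
// 100 Low Engagement
// 50 Low Engagement

// Reversing the rules makes the lowest threshold match first.
console.log(segmentFor(15000, [...rules].reverse())); // Low Value
```

The last line shows why rule order matters: with the thresholds reversed, every revenue above 100 would be labeled "Low Value" because the broadest rule matches before the narrower ones are considered.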