imageAnalysis Logic Documentation
Overview
The imageAnalysis logic node is part of Vantage's analytics framework. It uses multimodal large language models (LLMs) such as GPT-4o, Claude 3+, and Gemini to analyze images. The node reads the image URLs in the input dataset and returns a result for each image according to the selected analysis type, which automates repetitive image analysis work across business applications.
Purpose
The imageAnalysis logic is designed to automate the extraction of meaningful information from images, providing users with capabilities such as:
- Describing the content of an image.
- Generating captions.
- Extracting visible text.
- Classifying images into their respective categories.
- Supporting custom prompts for specialized analysis.
Expected Data
The imageAnalysis logic expects data in the following format:
- An array of objects, where each object contains at least one property holding the image URL (the default property name is image_url).
- The images must be accessible via a public URL for analysis.
Example Input Data Shape:
[
{"image_url": "http://example.com/image1.jpg"},
{"image_url": "http://example.com/image2.jpg"}
]
Settings
The configuration of the imageAnalysis logic comprises several settings that dictate its behavior and performance. Below is a detailed explanation of each setting:
- imageColumn
  - Input Type: String
  - Description: The name of the column in the input dataset that contains the image URLs. Change this value to match the key that holds the image URLs in your data.
  - Default Value: "image_url"
- analysisType
  - Input Type: Dropdown (string)
  - Description: Determines the type of analysis performed on the image. Users can select from the following options:
    - describe: Provides a detailed description of the image.
    - caption: Generates a concise caption.
    - extract_text: Extracts visible text from the image.
    - classify: Classifies the image into a category.
    - custom: Allows users to specify a custom prompt for analysis.
  - Default Value: "describe"
- outputColumn
  - Input Type: String
  - Description: The name of the column in the output dataset that holds the results of the AI analysis. Change this value to control where the results are stored in the output.
  - Default Value: "ai_vision"
- customPrompt
  - Input Type: String
  - Description: A custom prompt that guides the AI's analysis of the image when analysisType is set to custom. This enables analysis tailored to specific requirements.
  - Default Value: "" (empty string)
- batchSize
  - Input Type: Numeric
  - Description: The number of images processed concurrently in a single batch. Larger values can speed up processing but may increase the load on the AI integration.
  - Default Value: 5
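As an illustration of the settings above, here is a minimal Python sketch of a settings object that carries the documented defaults and splits rows into batches of batchSize. The names ImageAnalysisSettings, ANALYSIS_TYPES, and batches are hypothetical and are not part of Vantage. The validation rules (a non-empty customPrompt when analysisType is custom, and batchSize of at least 1) are assumptions for the example, not documented behavior.

```python
from dataclasses import dataclass

# Analysis types listed in the settings documentation.
ANALYSIS_TYPES = {"describe", "caption", "extract_text", "classify", "custom"}


@dataclass
class ImageAnalysisSettings:
    """Hypothetical container; field names mirror the documented settings."""
    imageColumn: str = "image_url"
    analysisType: str = "describe"
    outputColumn: str = "ai_vision"
    customPrompt: str = ""
    batchSize: int = 5

    def validate(self) -> None:
        # Assumed rules for illustration only.
        if self.analysisType not in ANALYSIS_TYPES:
            raise ValueError(f"unknown analysisType: {self.analysisType!r}")
        if self.analysisType == "custom" and not self.customPrompt.strip():
            raise ValueError("customPrompt is required when analysisType is 'custom'")
        if self.batchSize < 1:
            raise ValueError("batchSize must be at least 1")


def batches(rows: list, size: int):
    """Yield consecutive slices of rows, each holding at most `size` items."""
    for start in range(0, len(rows), size):
        yield rows[start:start + size]


settings = ImageAnalysisSettings()
settings.validate()
batch_lengths = [len(b) for b in batches(list(range(12)), settings.batchSize)]
```

With the default batchSize of 5, twelve rows would be split into batches of 5, 5, and 2.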
Use Cases & Examples
Use Cases
- E-commerce Product Analysis: An e-commerce platform can use the imageAnalysis logic to automatically generate product descriptions and captions for images uploaded by vendors. This helps maintain a consistent format and improves the user experience.
- Social Media Content Management: A social media management tool can analyze images before posting, extracting text or generating captions based on the image content to improve post engagement.
- Document Scanning and Archiving: Companies that handle document images can extract text from scanned files to populate databases or archive contents efficiently.
Example Configuration
Use Case: E-commerce Product Analysis
In this scenario, an e-commerce platform wants to automate the generation of detailed product descriptions for its image assets.
Sample Configuration:
{
"imageColumn": "product_image_url",
"analysisType": "describe",
"outputColumn": "product_analysis",
"customPrompt": "",
"batchSize": 5
}
Sample Input Data:
[
{"product_image_url": "http://example.com/product1.jpg"},
{"product_image_url": "http://example.com/product2.jpg"}
]
Expected Output:
{
"output1": {
"data": [
{"product_image_url": "http://example.com/product1.jpg", "product_analysis": "A beautiful red sweater made of wool, displayed on a wooden table."},
{"product_image_url": "http://example.com/product2.jpg", "product_analysis": "A modern stainless steel toaster on a kitchen countertop."}
]
}
}
In this configuration, imageColumn is set to "product_image_url" to match the key used in the input data. analysisType is set to "describe" to generate a detailed description of each product from its image. The results are stored in the product_analysis column.
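The data flow in this example can be sketched in Python as follows. This is a rough illustration under stated assumptions, not Vantage's implementation: run_image_analysis and stub_analyze are hypothetical names, the stub stands in for the call to the multimodal model, and using a thread pool sized to batchSize is only one possible reading of "processed concurrently in a single batch".

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


def stub_analyze(url: str, analysis_type: str, prompt: str) -> str:
    """Hypothetical stand-in for the model call; returns a placeholder string."""
    return f"[{analysis_type}] placeholder result for {url}"


def run_image_analysis(rows: list, settings: dict, analyze: Callable) -> dict:
    # Read settings, falling back to the documented defaults.
    image_col = settings.get("imageColumn", "image_url")
    output_col = settings.get("outputColumn", "ai_vision")
    analysis_type = settings.get("analysisType", "describe")
    prompt = settings.get("customPrompt", "")
    batch_size = max(1, int(settings.get("batchSize", 5)))

    data = []
    for start in range(0, len(rows), batch_size):
        batch = rows[start:start + batch_size]
        # Analyze the images of one batch concurrently; map() keeps input order.
        with ThreadPoolExecutor(max_workers=batch_size) as pool:
            answers = list(
                pool.map(lambda row: analyze(row[image_col], analysis_type, prompt), batch)
            )
        # Copy each input row and attach its result under the output column.
        for row, answer in zip(batch, answers):
            data.append({**row, output_col: answer})

    # Same envelope as the expected output shown above.
    return {"output1": {"data": data}}


config = {
    "imageColumn": "product_image_url",
    "analysisType": "describe",
    "outputColumn": "product_analysis",
    "customPrompt": "",
    "batchSize": 5,
}
rows = [
    {"product_image_url": "http://example.com/product1.jpg"},
    {"product_image_url": "http://example.com/product2.jpg"},
]
result = run_image_analysis(rows, config, stub_analyze)
```

Each output row keeps its original product_image_url and gains a product_analysis value. Swapping stub_analyze for a real model call would leave the surrounding structure unchanged.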