Usage & Tokens
All AI operations in Vantage consume tokens. This page explains how token usage works, how to monitor costs, and strategies for managing consumption.
What Are Tokens?
Tokens are the fundamental unit of AI processing. Every request to an AI provider — whether it's a question to the assistant, a tile summary, or a workflow enrichment — consumes tokens based on:
- Input tokens — the data and prompt sent to the provider
- Output tokens — the response generated by the provider
As a rough guideline, 1 token ≈ 4 characters of English text, or approximately ¾ of a word.
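The 4-characters-per-token guideline can be turned into a quick back-of-the-envelope estimator. This is a rough heuristic only; actual token counts depend on the provider's tokenizer.

```python
def estimate_tokens(text: str) -> int:
    """Rough token estimate using the ~4 characters-per-token heuristic."""
    return max(1, round(len(text) / 4))

# A 100-character prompt works out to roughly 25 tokens.
print(estimate_tokens("x" * 100))  # → 25
```

For billing-accurate counts, use the tokenizer published by your provider rather than this approximation.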
What Consumes Tokens
| Feature | Token Impact | Typical Range |
|---|---|---|
| AI Assistant (single question) | Low–Medium | 500–2,000 tokens |
| Tile Summary | Low–Medium | 300–1,500 tokens |
| Popup AI Chat | Medium | 500–3,000 tokens per message |
| AI Enrichment (per row) | Low | 100–500 tokens per row |
| AI Summary (batch) | Medium–High | 1,000–5,000 tokens |
| AI Compliance Check (per row) | Low | 100–300 tokens per row |
| AI Formatter (per row) | Low | 50–200 tokens per row |
| AI Conditional (per row) | Low | 100–300 tokens per row |
| AI Transcriber | High | Depends on audio/video length |
Note: Actual token usage depends on the size of the data, the length of context snippets, conversation history, and the model's response.
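The typical ranges above let you sketch a token budget before running a per-row operation. The midpoint values below are illustrative, taken from the table, and the function name is hypothetical, not a Vantage API.

```python
# Illustrative per-row token midpoints from the table above.
PER_ROW_TOKENS = {
    "enrichment": 300,        # midpoint of 100–500
    "compliance_check": 200,  # midpoint of 100–300
    "formatter": 125,         # midpoint of 50–200
    "conditional": 200,       # midpoint of 100–300
}

def estimate_run_tokens(operation: str, rows: int) -> int:
    """Estimate total tokens for a per-row AI operation over `rows` rows."""
    return PER_ROW_TOKENS[operation] * rows

# 10,000 rows through an AI Enrichment node ≈ 3,000,000 tokens.
print(estimate_run_tokens("enrichment", 10_000))  # → 3000000
```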
Cost Factors
Token costs vary based on several factors:
Provider & Model
More capable models cost more per token:
| Tier | Examples | Relative Cost |
|---|---|---|
| Budget | GPT-4o-mini, Mistral Small, DeepSeek Chat | $ |
| Standard | GPT-4o, Claude 3.5 Sonnet, Gemini Pro | $$ |
| Premium | Claude 3 Opus, Gemini Ultra, Mistral Large | $$$ |
Data Volume
- More rows in a tile or workflow node = more input tokens
- Enabling Process Large Datasets sends all rows (more tokens, more accuracy)
- Disabling it samples data (fewer tokens, faster but potentially less accurate)
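The sampling trade-off can be sketched as follows. The 500-row cap and the field names are assumptions for illustration; Vantage's actual sample size and method are not documented here.

```python
import random

def sample_rows(rows: list, max_rows: int = 500, seed: int = 42) -> list:
    """With sampling on, send only a representative subset;
    Process Large Datasets would send every row instead."""
    if len(rows) <= max_rows:
        return rows
    rng = random.Random(seed)  # fixed seed keeps the sample repeatable
    return rng.sample(rows, max_rows)

data = list(range(10_000))
print(len(sample_rows(data)))  # 500 rows sent instead of 10,000
```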
Context Length
- Longer context snippets = more input tokens per request
- Multi-turn conversations resend accumulated history = tokens grow with every message
- Detailed custom instructions add tokens to every request
Monitoring Usage
Usage Dashboard
Access the Usage Overview at Settings → Account → Usage & Tokens.
Scope Options
View usage at different levels:
| Scope | Shows |
|---|---|
| User | Your individual token consumption |
| Client | Usage across your client group |
| Organization | Total consumption for the entire organization |
Time Period Options
- Preset ranges: This month, Last 7 days, Last 30 days, Last 90 days
- Custom date range: Select any start and end date
What You'll See
- Total credits consumed in the selected period
- Comparison to the previous period (% change)
- Breakdown by operation type (assistant, summaries, workflow, etc.)
- Breakdown by category (which feature areas use the most tokens)
- Usage chart with hourly or daily granularity
- Individual events with details on each AI operation
Filtering & Analysis
The Usage Dashboard supports filtering to zero in on specific patterns:
| Filter | Options |
|---|---|
| Operation type | Assistant, Summary, Workflow, Chat, etc. |
| Category | Dashboard AI, Workflow AI, System |
| Time granularity | Hourly, Daily |
Exporting Usage Data
From the Usage Dashboard, you can download your usage data for external analysis or billing reconciliation.
Cost Management Strategies
1. Choose the Right Model
Use budget-tier models (GPT-4o-mini, Mistral Small) for routine tasks and reserve premium models for complex analysis.
2. Enable Data Sampling
Unless you need 100% data accuracy, disable Process Large Datasets in Settings → AI Features → Query Settings. Sampling processes a representative subset, significantly reducing token usage.
3. Optimize Context Snippets
Keep context snippets concise. A focused 2-sentence company overview is more cost-effective (and often more useful) than a 5-paragraph description.
4. Limit Conversation Length
Long multi-turn AI conversations accumulate token costs as the entire conversation history is resent with each message. Start new conversations for unrelated questions.
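Because the full history is resent each turn, input tokens grow quadratically with conversation length. A minimal sketch of the arithmetic:

```python
def conversation_input_tokens(message_tokens: list[int]) -> int:
    """Total input tokens when the entire history is resent with each message."""
    total = 0
    history = 0
    for t in message_tokens:
        history += t     # this message joins the history
        total += history  # the whole history is sent as input
    return total

# Five 500-token turns cost 500 + 1000 + 1500 + 2000 + 2500 = 7,500
# input tokens — three times the 2,500 tokens of actual content.
print(conversation_input_tokens([500] * 5))  # → 7500
```

Splitting unrelated questions into fresh conversations resets `history` to zero, which is why it saves tokens.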
5. Review Workflow Volume
AI workflow nodes process data rows individually. A workflow processing 10,000 rows through an AI Enrichment node will make 10,000 AI calls. Consider:
- Adding filter nodes before AI nodes to reduce row count
- Using conditional branching to only process rows that need AI
- Batching where possible
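The effect of filtering before an AI node can be sketched with plain list filtering. The row shape and `needs_ai` flag are hypothetical, not Vantage's schema:

```python
# Hypothetical rows: only every tenth row actually needs AI processing.
rows = [{"id": i, "needs_ai": i % 10 == 0} for i in range(10_000)]

# A filter node before the AI node cuts 10,000 AI calls down to 1,000.
ai_input = [r for r in rows if r["needs_ai"]]
print(len(ai_input))  # → 1000
```

At ~300 tokens per enrichment row, that single filter would save roughly 2.7 million tokens per execution.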
6. Monitor Regularly
Check the Usage Dashboard weekly or monthly to catch unexpected consumption patterns early.
Usage Alerts
Set internal guidelines for token consumption and review the Usage Dashboard regularly. Consider establishing:
- Monthly token budgets per team or department
- Review thresholds that trigger a usage audit
- Workflow limits on the number of rows processed by AI nodes in a single execution
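Such guidelines can be automated with a simple threshold check against exported usage data. The 80% audit threshold and the function are illustrative policy, not a product feature:

```python
def check_budget(consumed: int, monthly_budget: int,
                 audit_threshold: float = 0.8) -> str:
    """Map consumed tokens to an action. The 80% audit threshold is an
    example internal policy, not a Vantage default."""
    used = consumed / monthly_budget
    if used >= 1.0:
        return "over budget: pause non-essential AI workflows"
    if used >= audit_threshold:
        return "trigger usage audit"
    return "ok"

print(check_budget(850_000, 1_000_000))  # → "trigger usage audit"
```

Running a check like this against the dashboard's exported data on a weekly schedule catches runaway consumption before the end of the billing period.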