Multi-Agent Cost Tracking: How to Track Costs Across an AI Agent Fleet

Multi-agent cost tracking attributes AI spend to individual agents, tasks, and clients when multiple agents operate simultaneously — giving firms per-agent and fleet-level cost visibility instead of a single opaque line item.

One coding agent, one research agent, one campaign agent, one document agent. Each burns tokens on a different model, calls different tools, and serves different clients. When the monthly invoice arrives from your model provider, it shows total spend. It does not tell you which agent cost what, which client drove the spend, or which tasks were expensive. That gap between total spend and attributed cost is where budget overruns hide. According to a 2025 survey by an AI infrastructure provider, organisations running five or more agents reported 35-50% higher costs than projected, primarily because they lacked per-agent attribution.

Key Takeaway: Multi-agent cost tracking needs three views: per-agent (which agents are expensive), per-task (which work is expensive), and fleet-level (total spend trends).

Why Is Multi-Agent Cost Tracking Different from Single-Agent Tracking?

Single-agent cost tracking is straightforward. One agent, one model, one cost stream. Every token consumed, every tool called, every task completed — all attributed to that one agent. The maths is simple.

Multi-agent tracking introduces three complications that single-agent systems do not face.

Multiple billing models. Different agents use different models with different pricing. A coding agent on a large reasoning model costs £0.015 per 1,000 input tokens. A summarisation agent on a smaller model costs £0.0003 per 1,000 input tokens. A tool-calling agent pays per API call rather than per token. Aggregating costs across these different units requires normalising them to a common measure, typically a currency amount per invocation.
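As a rough illustration of that normalisation, the sketch below converts token-billed and call-billed usage into a single GBP amount. The token prices are the two quoted above; the model names, the tool name, and the per-call price are hypothetical placeholders, not real provider rates.

```python
from decimal import Decimal

# Prices per 1,000 input tokens, taken from the examples in the text.
TOKEN_PRICE_PER_1K = {
    "large-reasoning": Decimal("0.015"),
    "small-fast": Decimal("0.0003"),
}

# Hypothetical flat price per tool API call (illustrative placeholder only).
CALL_PRICE = {"search-tool": Decimal("0.002")}


def token_cost(model: str, input_tokens: int) -> Decimal:
    """Cost in GBP of a token-billed invocation (input tokens only)."""
    return TOKEN_PRICE_PER_1K[model] * Decimal(input_tokens) / Decimal(1000)


def call_cost(tool: str, calls: int) -> Decimal:
    """Cost in GBP of a call-billed tool."""
    return CALL_PRICE[tool] * calls


# Three differently billed usages, all expressed in the same unit (GBP).
coding = token_cost("large-reasoning", 20_000)
summary = token_cost("small-fast", 20_000)
tool = call_cost("search-tool", 50)
total = coding + summary + tool
```

Using `Decimal` rather than floats keeps very small per-token prices exact when they are summed over many invocations.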

Shared infrastructure. Agents share resources — vector databases, embedding pipelines, orchestration platforms, monitoring tools. These costs belong to the fleet, not to any single agent. Allocating them fairly is an accounting problem with no single correct answer.

Agent-to-agent handoffs. One agent triggers another. A research agent feeds data to an analysis agent, which feeds results to a drafting agent. The client task spans three agents. The cost must roll up to the task, but each agent also needs its own cost entry for fleet management. Double-counting is easy. Accurate attribution takes effort.

For firms already tracking individual agent costs, the jump to multi-agent tracking is not about new metrics. It is about aggregation, attribution, and the relationships between agents.

How Does Per-Agent vs Per-Task Cost Attribution Work?

Both views are necessary. They answer different questions.

Per-Agent Attribution

Per-agent attribution totals the cost incurred by each agent over a period. It answers: which agents are expensive? Which are cost-effective? Where should I optimise?

A per-agent cost report might show:

Agent	Model	Monthly Cost	Tasks Completed	Cost Per Task
Research Agent	Large reasoning model	£1,240	620	£2.00
Coding Agent	Large reasoning model	£2,180	310	£7.03
Summarisation Agent	Small fast model	£85	4,200	£0.02
Campaign Agent	Mid-tier model	£430	180	£2.39

This view immediately shows that the coding agent is the most expensive per task. If cost reduction is the goal, that is where to focus — perhaps by routing simpler coding tasks to a cheaper model.
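The cost-per-task column is simply monthly cost divided by tasks completed. A minimal sketch that reproduces the figures in the table above, assuming rounding to the nearest penny (the variable and function names are illustrative):

```python
from decimal import Decimal, ROUND_HALF_UP

# (monthly cost in GBP, tasks completed), copied from the table above.
monthly = {
    "Research Agent": (Decimal("1240"), 620),
    "Coding Agent": (Decimal("2180"), 310),
    "Summarisation Agent": (Decimal("85"), 4200),
    "Campaign Agent": (Decimal("430"), 180),
}


def cost_per_task(cost: Decimal, tasks: int) -> Decimal:
    """Monthly cost divided by task count, rounded to the nearest penny."""
    return (cost / tasks).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


report = {agent: cost_per_task(cost, tasks) for agent, (cost, tasks) in monthly.items()}

# The agent with the highest cost per task is the first optimisation candidate.
most_expensive = max(report, key=report.get)
```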

Per-Task Attribution

Per-task attribution totals the cost of completing a specific task, regardless of how many agents were involved. It answers: how much did this piece of work cost? Can I bill the client enough to cover it?

A single client task, “produce a market entry analysis”, might involve the research agent (£2.00), the summarisation agent (£0.02), and the campaign agent (£2.39). The total task cost is £4.41. If the firm bills £50 for this deliverable, the margin is healthy at £45.59. If it bills £5, the margin is only £0.59.

Per-task attribution is essential for cost-per-task analysis and client billing. Without it, firms cannot determine whether individual engagements are profitable.
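The profitability check for the market entry example can be sketched as follows. The costs and the two billing prices come from the example above; the function name and the one-decimal rounding are assumptions made for illustration.

```python
from decimal import Decimal, ROUND_HALF_UP

# Per-agent costs for the market entry analysis task, from the example above.
agent_costs = [Decimal("2.00"), Decimal("0.02"), Decimal("2.39")]
task_cost = sum(agent_costs, Decimal("0"))


def margin_pct(price: Decimal, cost: Decimal) -> Decimal:
    """Gross margin as a percentage of the billed price, to one decimal place."""
    return ((price - cost) / price * 100).quantize(
        Decimal("0.1"), rounding=ROUND_HALF_UP
    )


healthy = margin_pct(Decimal("50"), task_cost)  # billed at £50
thin = margin_pct(Decimal("5"), task_cost)      # billed at £5
```

This only covers direct agent costs; any shared infrastructure allocated to the task would reduce both margins further.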

Linking the Two Views

Every agent invocation gets tagged with both an agent ID and a task ID. The same data point appears in both views — contributing to the agent’s total cost and to the task’s total cost. The key is tagging at invocation time, not reconstructing attribution after the fact.
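A minimal sketch of that dual tagging, assuming a simple in-memory record (the `CostRecord` type, the IDs, and the second task are hypothetical): each record carries both IDs, and the two views are just different groupings of the same list.

```python
from collections import defaultdict
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class CostRecord:
    agent_id: str
    task_id: str
    cost: Decimal  # GBP, recorded at invocation time


def total_by(records: list[CostRecord], key: str) -> dict[str, Decimal]:
    """Sum cost grouped by the named field ('agent_id' or 'task_id')."""
    totals: dict[str, Decimal] = defaultdict(Decimal)
    for record in records:
        totals[getattr(record, key)] += record.cost
    return dict(totals)


records = [
    CostRecord("research", "task-1", Decimal("2.00")),
    CostRecord("summarisation", "task-1", Decimal("0.02")),
    CostRecord("campaign", "task-1", Decimal("2.39")),
    CostRecord("research", "task-2", Decimal("2.00")),
]

by_agent = total_by(records, "agent_id")  # fleet management view
by_task = total_by(records, "task_id")    # billing view
```

Because both views are derived from the same records, their grand totals always agree, which is a useful sanity check on any reporting pipeline.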

How Do You Handle Shared and Infrastructure Costs?

Not all costs belong to a single agent or task. Shared costs require a deliberate allocation method.

What Counts as Shared?

Shared costs include: vector database hosting (used by all agents for retrieval), embedding generation (shared embedding pipeline), tool API subscriptions (third-party services used by multiple agents), orchestration platform fees (the system that routes and manages agents), and monitoring infrastructure (dashboards, logging, alerting).

In a typical multi-agent deployment, 70-85% of costs are directly attributable to specific agents (model inference, tool calls). The remaining 15-30% is shared infrastructure.

Allocation Methods

Four methods exist. Each has trade-offs.

Equal split: Divide shared costs equally across all agents. Simple but inaccurate — a high-volume agent pays the same as a rarely used one.

Usage-proportional: Allocate based on each agent’s share of total usage (by invocation count, token volume, or compute time). This is the most common method and usually the easiest to defend as fair, provided the same usage metric is applied consistently. If an agent accounts for 40% of total invocations, it absorbs 40% of shared costs.

Revenue-proportional: Allocate based on the revenue each agent’s work generates. Useful for client-facing reporting but requires revenue tracking per agent.

Fixed overhead percentage: Add a flat percentage (e.g., 20%) to each agent’s direct costs to cover shared infrastructure. Simple to implement and predictable, but accuracy depends on the percentage being well calibrated.

The approach that works for most firms: usage-proportional allocation for shared costs, reviewed quarterly. It is fair, defensible, and straightforward to calculate.
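To make the difference between the first two methods concrete, here is a small sketch of usage-proportional allocation next to an equal split. The £1,000 shared cost and the invocation counts are hypothetical figures chosen so that one agent holds 40% of invocations.

```python
from decimal import Decimal

PENNY = Decimal("0.01")


def allocate_by_usage(shared: Decimal, usage: dict[str, int]) -> dict[str, Decimal]:
    """Split a shared cost in proportion to each agent's usage count."""
    total = sum(usage.values())
    if total == 0:
        raise ValueError("cannot allocate by usage when total usage is zero")
    # Rounding each share to the penny can leave a small remainder in general.
    return {agent: (shared * n / total).quantize(PENNY) for agent, n in usage.items()}


def allocate_equally(shared: Decimal, agents: list[str]) -> dict[str, Decimal]:
    """Split a shared cost evenly, ignoring usage."""
    share = (shared / len(agents)).quantize(PENNY)
    return {agent: share for agent in agents}


# Hypothetical month: £1,000 of shared infrastructure, three agents.
invocations = {"research": 400, "summarisation": 500, "campaign": 100}
by_usage = allocate_by_usage(Decimal("1000"), invocations)
equal = allocate_equally(Decimal("1000"), list(invocations))
```

Under the equal split the low-volume campaign agent carries £333.33 of overhead, against £100.00 under usage-proportional allocation.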

How Do Agent-to-Agent Cost Chains Work?

Modern agent architectures involve agents calling other agents. This creates cost chains that must be tracked without double-counting or losing attribution.

The Chain Problem

An orchestrator agent receives a client task. It delegates to a research agent. The research agent calls a data extraction agent for structured data. The data extraction agent returns results to the research agent, which synthesises them and returns to the orchestrator. The orchestrator passes the output to a formatting agent for final delivery.

One client task. Four agents. Four separate cost entries. The total task cost is the sum of all four. But each agent’s cost is also part of its own monthly total for per-agent reporting.

Trace IDs

The mechanism for linking these costs is a trace ID — a unique identifier assigned when the client task begins. Every agent invocation within that task carries the same trace ID. Cost queries can then aggregate by trace ID (task view) or by agent ID (agent view) without modifying the underlying data.
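The sketch below shows the idea with an in-memory ledger: the trace ID is generated once when the client task starts and is attached to every cost entry in the chain. The agent names follow the chain described above; the costs, the `ledger` structure, and the function names are assumptions for illustration, not any particular platform's API.

```python
import uuid
from collections import defaultdict
from decimal import Decimal

ledger: list[dict] = []


def record(trace_id: str, agent_id: str, cost: str) -> None:
    """Append one cost entry tagged with both the trace ID and the agent ID."""
    ledger.append({"trace_id": trace_id, "agent_id": agent_id, "cost": Decimal(cost)})


def run_client_task() -> str:
    """Simulate the four-agent chain; every entry shares one trace ID."""
    trace_id = str(uuid.uuid4())  # assigned once, when the client task begins
    record(trace_id, "orchestrator", "0.10")
    record(trace_id, "research", "2.00")
    record(trace_id, "data-extraction", "0.40")
    record(trace_id, "formatting", "0.05")
    return trace_id


def total_for_trace(trace_id: str) -> Decimal:
    """Task view: everything spent under one trace ID."""
    return sum((e["cost"] for e in ledger if e["trace_id"] == trace_id), Decimal("0"))


def totals_by_agent() -> dict[str, Decimal]:
    """Agent view: the same entries grouped by agent ID."""
    totals: dict[str, Decimal] = defaultdict(Decimal)
    for entry in ledger:
        totals[entry["agent_id"]] += entry["cost"]
    return dict(totals)


first = run_client_task()
second = run_client_task()
```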

Keito’s agent integration implements this as session correlation: each agent records its time with source=agent and logs token spend as LLM usage expenses via the API v2 and Node/Python SDKs, with the session linking every cost in the chain back to the originating client task.

Avoiding Double-Counting

Double-counting happens when shared resources used by the chain are allocated to each agent in the chain individually. If the research agent and the data extraction agent both use the same vector database query, the query cost should be counted once and attributed to the task — not once to each agent.

The rule: direct costs are per-agent. Shared resources triggered within a chain are per-task, allocated once. For firms using monitoring dashboards, trace-level cost views show exactly how costs flow through agent chains.
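One way to apply that rule is to give every shared resource event its own ID and count each ID once per trace, however many agents report it. This is a sketch under that assumption; the event ID scheme and all figures are hypothetical.

```python
from decimal import Decimal

# (trace_id, agent_id, cost): direct costs, one entry per agent invocation.
direct_costs = [
    ("trace-1", "research", Decimal("2.00")),
    ("trace-1", "data-extraction", Decimal("0.40")),
]

# (trace_id, resource_event_id, cost): both agents reported the same
# vector database query, so the same event ID appears twice.
shared_reports = [
    ("trace-1", "vdb-query-001", Decimal("0.03")),
    ("trace-1", "vdb-query-001", Decimal("0.03")),
]


def naive_task_total(trace_id: str) -> Decimal:
    """Sums every report, so a shared query reported twice is counted twice."""
    rows = [c for t, _, c in direct_costs + shared_reports if t == trace_id]
    return sum(rows, Decimal("0"))


def task_total(trace_id: str) -> Decimal:
    """Counts each shared resource event once per trace."""
    direct = sum((c for t, _, c in direct_costs if t == trace_id), Decimal("0"))
    # Keying by event ID collapses duplicate reports of the same query.
    unique_shared = {event: c for t, event, c in shared_reports if t == trace_id}
    return direct + sum(unique_shared.values(), Decimal("0"))
```

Here the naive total overstates the task by the price of one query; across thousands of chained tasks that error compounds.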

What Should a Fleet-Level Cost Dashboard Show?

A fleet dashboard gives leadership and finance teams the visibility they need to manage AI spend at scale.

Essential Views

Total fleet spend. Daily, weekly, and monthly totals. Trend line showing whether spend is growing, stable, or declining. Comparison against budget.

Per-agent breakdown. Bar chart or table showing each agent’s contribution to total spend. Identifies the most and least expensive agents at a glance.

Per-client breakdown. Cost attributed to each client. Essential for profitability analysis and billing. Flags clients whose AI costs are disproportionate to their revenue.

Cost trends. Line charts showing per-agent and per-client cost trends over 30, 60, and 90 days. Rising costs need investigation: the cause may be a higher workload, a model price change, or an inefficient configuration.

Anomaly detection. Automatic flagging of unusual cost spikes. An agent that normally costs £50/day suddenly costs £300/day. A client project that normally uses one agent suddenly uses four. These anomalies need attention before they become budget problems.
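A very simple form of spike detection compares today's cost with a trailing average. The sketch below assumes a threshold of three times the trailing mean, which is an arbitrary illustrative choice; production systems usually need something more robust to seasonality and growth.

```python
from statistics import mean


def is_spike(history: list[float], today: float, factor: float = 3.0) -> bool:
    """Flag today's cost if it exceeds `factor` times the trailing mean."""
    if not history:
        return False  # no baseline yet, so nothing to compare against
    return today > factor * mean(history)


# An agent that normally costs about £50/day.
recent_daily_costs = [48.0, 52.0, 50.0, 49.0, 51.0]

jump_flagged = is_spike(recent_daily_costs, 300.0)   # the £300/day case
normal_flagged = is_spike(recent_daily_costs, 60.0)  # ordinary variation
```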

Alert Thresholds

Set alerts at three levels. Per-agent daily limits — flag when any single agent exceeds its expected daily cost by more than 50%. Per-client budget caps — alert when a client’s cumulative AI spend approaches the budgeted amount. Fleet-wide monthly ceiling — escalate when total fleet spend reaches 80% of the monthly budget.
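The three alert levels reduce to three comparisons. In this sketch the 50% and 80% thresholds come from the text above, while the 90% warning point for client budgets is an assumption, since "approaches" is not quantified; the function names are illustrative.

```python
from decimal import Decimal


def agent_daily_alert(expected: Decimal, actual: Decimal) -> bool:
    """True when an agent exceeds its expected daily cost by more than 50%."""
    return actual > expected * Decimal("1.5")


def client_budget_alert(
    spent: Decimal, budget: Decimal, warn_at: Decimal = Decimal("0.9")
) -> bool:
    """True when cumulative client spend reaches the warning share of budget.

    The 90% default is an assumed value, not one given in the text.
    """
    return spent >= budget * warn_at


def fleet_monthly_alert(spent: Decimal, budget: Decimal) -> bool:
    """True when total fleet spend reaches 80% of the monthly budget."""
    return spent >= budget * Decimal("0.8")
```

For example, `agent_daily_alert(Decimal("50"), Decimal("76"))` fires, while a day at exactly £75 does not, because the rule is "more than 50%" over the expected cost.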

Reporting

Automated reports for three audiences. Finance: monthly fleet cost summary with trend analysis and budget variance. Project managers: per-client and per-project cost breakdowns for profitability tracking. Clients: cost transparency reports showing AI usage and associated charges, supporting the billing model agreed in the engagement terms.

How Do You Get Started with Multi-Agent Cost Tracking?

Start with direct cost attribution. Tag every agent invocation with an agent ID, task ID, and client ID from day one. This single step makes both per-agent and per-task reporting possible without retroactive data cleaning.

Week 1-2: Instrument your agents. Add cost logging to each agent — model inference costs per invocation, tool costs per call, and execution timestamps. Most orchestration frameworks support this natively or through middleware hooks. If you are building custom agents, add a cost logger that records token consumption and API call charges per invocation.
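For custom agents, a cost logger can be as small as one record type serialised to a JSON line per invocation. The field names and the example values below are hypothetical; the point is that the agent, task, and client tags are written at the same moment as the token counts and charges.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone


@dataclass
class InvocationLog:
    agent_id: str
    task_id: str
    client_id: str
    input_tokens: int
    output_tokens: int
    model_cost_gbp: str  # decimal string, to avoid float rounding drift
    tool_cost_gbp: str
    # Execution timestamp in UTC, filled in when the entry is created.
    timestamp: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )


def to_json_line(entry: InvocationLog) -> str:
    """Serialise one invocation as a single JSON line for an append-only log."""
    return json.dumps(asdict(entry), sort_keys=True)


entry = InvocationLog(
    agent_id="research",
    task_id="task-1",
    client_id="client-a",
    input_tokens=12_000,
    output_tokens=1_500,
    model_cost_gbp="0.18",
    tool_cost_gbp="0.02",
)
line = to_json_line(entry)
```

Appending lines like this to a file or log stream gives the aggregation step in weeks 3-4 a complete, already-tagged data set to work from.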

Week 3-4: Build aggregation views. Create two simple reports: total cost per agent per day, and total cost per task. These two views immediately reveal which agents and which work types are expensive. Compare against your expectations and billing rates.

Month 2: Add shared cost allocation. Calculate your monthly shared infrastructure costs. Choose an allocation method — usage-proportional works for most firms. Distribute shared costs across agents based on their share of total invocations. Add the allocated overhead to each agent’s direct costs.

Month 3: Add trace-level tracking. If you run multi-agent workflows, implement trace IDs so that costs flow correctly through agent chains. This is essential before you scale to more complex orchestrations.

Ongoing: Review and refine. Monthly cost reviews with finance and operations teams. Quarterly rate card updates based on actual cost data. Annual budget planning using 12 months of fleet cost trends.

The firms that struggle with multi-agent costs are the ones that deployed agents without cost instrumentation and tried to reconstruct attribution after the fact. Starting with cost tagging from the first invocation saves months of forensic accounting later.

Keito tracks costs across your entire AI agent fleet — per agent, per task, per client — with dashboards that show exactly where your spend is going.

Frequently Asked Questions

What is multi-agent cost tracking?

Multi-agent cost tracking is the process of attributing AI costs to individual agents, tasks, and clients when multiple agents operate simultaneously. It goes beyond total spend visibility to show which agents are expensive, which tasks cost the most, and which clients drive the highest AI spend.

How do you track costs across multiple AI agents?

Tag every agent invocation with an agent ID, task ID, and client ID at the point of invocation. Record model inference costs, tool costs, and execution metadata per invocation. Aggregate by agent for fleet management, by task for billing, and by client for profitability analysis.

What is per-agent vs per-task cost attribution?

Per-agent attribution totals the cost incurred by each agent over a period — useful for fleet management and optimisation. Per-task attribution totals the cost of completing a specific piece of work, potentially across multiple agents — useful for billing and pricing decisions. Both views use the same underlying data, segmented differently.

How do you handle shared costs in a multi-agent system?

Shared costs (vector databases, orchestration platforms, monitoring) are allocated across agents using a chosen method: equal split, usage-proportional, revenue-proportional, or fixed overhead percentage. Usage-proportional allocation is the most common approach — each agent absorbs shared costs in proportion to its share of total usage.

What should an AI agent fleet cost dashboard show?

A fleet dashboard should display total fleet spend with trends, per-agent cost breakdowns, per-client cost attribution, cost anomaly detection, and budget compliance. It should support daily, weekly, and monthly views with alert thresholds for cost spikes.

How do agent-to-agent cost chains work?

When agents call other agents within a workflow, each invocation carries a trace ID linking it to the originating task. Costs roll up by trace ID for task-level totals and by agent ID for fleet-level totals. Shared resources used within the chain are allocated once per trace ID rather than once per agent, which prevents double-counting.