AI Agent Cost Tracking: The Complete Guide for Professional Services Firms

AI agent cost tracking is the practice of monitoring, attributing, and reporting every pound a firm spends on AI agent operations — tokens, inference, tool calls, and compute — per client, project, and task.

Professional services firms are adopting AI agents at pace. According to Thomson Reuters’ 2026 AI in Professional Services Report, 53% of firms plan to deploy agentic AI by 2027. Yet most have no visibility into what those agents actually cost. The result: blown budgets, mispriced client work, and AI projects that get killed before they deliver value. Gartner predicts that more than 40% of agentic AI projects will be cancelled by the end of 2027, citing escalating costs, unclear business value, and inadequate risk controls.

This guide covers every aspect of AI agent cost tracking for firms that bill clients for work done by both people and autonomous agents.

What Is AI Agent Cost Tracking?

AI agent cost tracking means capturing every cost generated when an AI agent executes a task — and attributing that cost to a specific client, project, or billing code. It is the financial layer that sits between your AI infrastructure and your invoicing system.

For professional services firms, this is different from general AI cost management. A technology company tracks AI costs as a line item on its infrastructure budget. A law firm, consultancy, or agency tracks AI costs because those costs need to flow through to client bills, project profitability reports, and matter-level accounting.

The distinction matters because professional services cost tracking requires attribution — not just “how much did we spend?” but “how much did we spend on Client A’s restructuring project versus Client B’s compliance review?”

According to Deloitte’s 2026 State of AI in the Enterprise report, 84% of firms have not redesigned their jobs or workflows around AI. Agents are running alongside human teams, but the financial infrastructure to track their costs has not caught up.

Why General Cost Monitoring Is Not Enough

Standard cloud monitoring tracks CPU usage, memory, and API call volumes. AI agent cost tracking goes further. It connects each API call to a business outcome: a client matter, a project deliverable, a billing code.

Without this connection, AI costs become unattributable overhead. The firm absorbs them. Margins shrink. And nobody can answer the question every partner eventually asks: “How much is this AI costing us per client?”

What Do AI Agents Actually Cost to Run?

The cost of running AI agents breaks down into six components. Most firms only track the first one.

Component	Typical % of Total	Monthly Range (Mid-Tier)
Token/API costs (input, output, reasoning)	30–40%	£800–£4,000
Compute and inference	15–20%	£160–£2,000
Embeddings and vector database	8–18%	£100–£2,200
Tool and API call fees	10–15%	£400–£1,600
Fine-tuning and model adaptation	10–15%	£400–£2,000
Monitoring and observability	3–5%	£160–£800
Total (5,000 tasks/month)	100%	£2,600–£10,400

Token costs are the most visible. Every request to a large language model consumes tokens — sub-word units priced per million. Output tokens typically cost three to five times more than input tokens. Reasoning models that “think” before responding can consume five to twenty times more tokens per request than standard models.
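To make the token arithmetic concrete, here is a minimal sketch of a per-request cost calculation. The prices are hypothetical placeholders (real prices vary by provider and model), and billing reasoning tokens at the output rate is an assumption, not a rule.

```python
from decimal import Decimal

# Hypothetical prices in GBP per million tokens; substitute your provider's rates.
PRICE_PER_MILLION = {
    "input": Decimal("2.00"),
    "output": Decimal("8.00"),     # four times the input price, inside the 3-5x range
    "reasoning": Decimal("8.00"),  # assumption: reasoning tokens billed at the output rate
}


def request_cost(input_tokens: int, output_tokens: int, reasoning_tokens: int = 0) -> Decimal:
    """Return the token cost of a single request in GBP."""
    counts = {"input": input_tokens, "output": output_tokens, "reasoning": reasoning_tokens}
    return sum(
        (Decimal(n) * PRICE_PER_MILLION[kind] / Decimal(1_000_000) for kind, n in counts.items()),
        Decimal("0"),
    )


# Same prompt and answer, with and without a reasoning phase.
standard = request_cost(input_tokens=3_000, output_tokens=500)
with_reasoning = request_cost(input_tokens=3_000, output_tokens=500, reasoning_tokens=5_000)
print(f"standard: £{standard:.3f}, with reasoning: £{with_reasoning:.3f}")
```

With these illustrative numbers, the reasoning phase alone accounts for most of the request's cost, which is why reasoning tokens are worth logging separately.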

Inference and compute costs cover the GPU time needed to process requests. These vary by model size, provider, and whether you use real-time or batch processing.

Embedding and vector database costs apply when agents use retrieval-augmented generation (RAG) to search firm documents. Storing and querying vector embeddings incurs storage fees, query fees, and scaling costs.

Tool and API call fees accumulate when agents invoke external services — web searches, code execution sandboxes, database queries, or third-party data sources. A single agent task can trigger five to twenty tool calls.

The per-task cost benchmark for professional services sits at £0.08–£0.25 for routine tasks, rising to £1–£4 for complex multi-agent workflows involving deep research or code generation.
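As a rough cross-check, the monthly total in the cost table can be converted into a blended per-task figure. The sketch below uses only the table's total range and its 5,000-task volume; the variable names are illustrative.

```python
from decimal import Decimal

TASKS_PER_MONTH = 5_000
MONTHLY_TOTAL_GBP = (Decimal("2600"), Decimal("10400"))  # total range from the cost table

# Blended average cost per task at the low and high ends of the monthly range.
blended_low, blended_high = (total / TASKS_PER_MONTH for total in MONTHLY_TOTAL_GBP)

print(f"Blended cost per task: £{blended_low:.2f} to £{blended_high:.2f}")
```

The blended figure comes out above the routine benchmark because the monthly total also carries complex tasks and costs that are not incurred per task, such as fine-tuning and monitoring.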

How to Set Up AI Agent Cost Tracking in Your Firm

Setting up cost tracking follows five steps. Each builds on the previous.

Step 1: Inventory Your AI Agents

Before you can track costs, you need to know what is running. Catalogue every AI agent and tool in use across the firm:

Coding agents (IDE assistants, terminal-based agents, code review agents)
Research agents (deep research workflows, document analysis)
Document processing agents (contract review, invoice processing, reconciliation)
Campaign and marketing agents (content generation, ad copy, social media)
Recruitment agents (candidate screening, outreach automation)
Workflow automation agents (data routing, scheduling, notifications)

Most firms discover agents they did not know existed. Teams adopt tools independently. A full inventory is the starting point.

Step 2: Establish a Cost Attribution Taxonomy

Define how costs will be categorised. The standard hierarchy for professional services is:

Client → Project → Task → Agent

Every agent action must carry these tags so that costs can be aggregated at any level. This taxonomy should mirror your existing billing code structure.
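One way to picture the taxonomy is as a tag attached to every cost event, which can then be rolled up at any level of the hierarchy. This is a minimal sketch; the class, field, and client names are hypothetical and would mirror your own billing codes in practice.

```python
from collections import defaultdict
from dataclasses import dataclass
from decimal import Decimal

LEVELS = ("client", "project", "task", "agent")  # Client -> Project -> Task -> Agent


@dataclass(frozen=True)
class CostTag:
    client: str
    project: str
    task: str
    agent: str


def roll_up(events, level):
    """Sum costs by the hierarchy path that ends at `level`."""
    depth = LEVELS.index(level) + 1
    totals = defaultdict(Decimal)  # Decimal() is zero, so missing keys start at 0
    for tag, cost in events:
        key = tuple(getattr(tag, name) for name in LEVELS[:depth])
        totals[key] += cost
    return dict(totals)


events = [
    (CostTag("client-a", "restructuring", "doc-review", "research-agent"), Decimal("0.12")),
    (CostTag("client-a", "restructuring", "drafting", "drafting-agent"), Decimal("0.30")),
    (CostTag("client-b", "compliance-review", "doc-review", "research-agent"), Decimal("0.08")),
]

by_client = roll_up(events, "client")
by_task = roll_up(events, "task")
print(by_client)
```

Because each key is a path rather than a bare name, two clients can reuse the same task label without their costs being merged.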

Step 3: Implement Real-Time Cost Monitoring

Instrument your agent frameworks to capture cost data at the API call level. For each request, log:

Tokens consumed (input, output, reasoning)
Model used and cost per token
Tool calls triggered and their individual costs
Total cost for the request
Client/project/task attribution tags

This data should stream in real time, not arrive as a monthly reconciliation.
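The fields listed above could be captured as one structured record per request and emitted as a JSON line. The sketch below is illustrative only: the schema and field names are assumptions rather than a standard, and floats are used for brevity where a real ledger would more likely store decimal strings or integer pence.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone


@dataclass
class CostEvent:
    """One logged agent request. Field names are illustrative, not a standard schema."""
    client: str
    project: str
    task: str
    model: str
    price_per_million_gbp: dict  # token kind -> price, e.g. {"input": 2.0}
    input_tokens: int
    output_tokens: int
    reasoning_tokens: int
    token_cost_gbp: float
    tool_costs_gbp: dict = field(default_factory=dict)  # tool name -> cost of that call
    timestamp: str = field(default_factory=lambda: datetime.now(timezone.utc).isoformat())

    def total_cost_gbp(self) -> float:
        """Token cost plus all tool call costs for this request."""
        return round(self.token_cost_gbp + sum(self.tool_costs_gbp.values()), 6)


def to_log_line(event: CostEvent) -> str:
    """Serialise the event as one JSON line, ready to stream to a log or queue."""
    record = asdict(event)
    record["total_cost_gbp"] = event.total_cost_gbp()
    return json.dumps(record, sort_keys=True)


event = CostEvent(
    client="client-a", project="restructuring", task="doc-review",
    model="example-model", price_per_million_gbp={"input": 2.0, "output": 8.0},
    input_tokens=3_000, output_tokens=500, reasoning_tokens=0,
    token_cost_gbp=0.05, tool_costs_gbp={"web_search": 0.01, "code_sandbox": 0.02},
)
line = to_log_line(event)
print(line)
```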

Step 4: Set Budgets and Alerts

Assign spending limits per project, client, or department. Configure alerts at 50%, 75%, and 90% of budget. Decide whether to use hard caps (agents stop at the limit) or soft caps (notifications only).
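The threshold and cap logic can be sketched in a few lines. The function name and return shape below are hypothetical; the thresholds are the 50%, 75%, and 90% levels mentioned above.

```python
from decimal import Decimal

ALERT_THRESHOLDS = (Decimal("0.50"), Decimal("0.75"), Decimal("0.90"))


def check_budget(spent: Decimal, budget: Decimal, hard_cap: bool = False):
    """Return (alert levels crossed so far, whether the agent may keep running)."""
    ratio = spent / budget
    crossed = [f"{int(t * 100)}%" for t in ALERT_THRESHOLDS if ratio >= t]
    # A hard cap stops the agent at the limit; a soft cap only notifies.
    allowed = not (hard_cap and spent >= budget)
    return crossed, allowed


print(check_budget(Decimal("380"), Decimal("500")))
print(check_budget(Decimal("500"), Decimal("500"), hard_cap=True))
```

In a live system this check would run before each agent request, so that a hard cap blocks the request rather than reporting the overrun afterwards.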

Step 5: Build Reporting Workflows

Create reports that serve three audiences:

Finance teams need cost-per-client summaries for billing and profitability analysis
Project managers need real-time budget-versus-actual dashboards
Partners need monthly summaries showing AI cost trends and ROI indicators

How to Attribute AI Agent Costs to Clients and Projects

Attribution is the hardest part of AI agent cost tracking. When a research agent analyses documents for three different client matters in the same hour, or when Agent A triggers Agent B, which in turn triggers Agent C, untangling who pays for what is non-trivial.

Direct Attribution

The gold standard. Every agent invocation is tagged with a client and project code at the point of invocation. The human who triggers the agent selects the client. The cost is attributed one-to-one.

This works well for supervised agents where a human initiates each task. It breaks down for autonomous agents that run without direct human invocation.

Proportional Allocation

Total AI costs for a period are distributed across clients based on usage volume. If Client A generated 60% of agent tasks and Client B generated 40%, costs split accordingly.

Simpler to implement than direct attribution, but less precise. Suitable for firms early in their AI adoption where costs are small relative to revenue.
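A minimal sketch of proportional allocation, using task counts as the usage measure. The function name is hypothetical, and assigning the rounding remainder to the highest-volume client is one possible convention, not a rule.

```python
from decimal import Decimal, ROUND_HALF_UP


def allocate_proportionally(total_cost: Decimal, task_counts: dict) -> dict:
    """Split total_cost across clients in proportion to their task counts."""
    total_tasks = sum(task_counts.values())
    shares = {
        client: (total_cost * n / total_tasks).quantize(Decimal("0.01"), ROUND_HALF_UP)
        for client, n in task_counts.items()
    }
    # Rounding can leave a penny unallocated; give it to the highest-volume client
    # so that the shares always add up to the total.
    remainder = total_cost - sum(shares.values())
    largest = max(task_counts, key=task_counts.get)
    shares[largest] += remainder
    return shares


shares = allocate_proportionally(Decimal("1000.00"), {"Client A": 600, "Client B": 400})
print(shares)
```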

Activity-Based Costing

Costs are allocated based on the type and complexity of work performed. A deep research task carries a higher cost allocation than a simple document summary, even if both take the same elapsed time.

This method requires well-defined task categories and cost benchmarks per category. Where agent invocations cannot all be tagged directly, it is the most accurate option for firms running diverse agent workloads across clients.
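Activity-based costing can be sketched by giving each task category a complexity weight and splitting the period's cost by weighted units instead of raw task counts. The categories and weights below are hypothetical; a firm would derive its own from cost benchmarks.

```python
from decimal import Decimal

# Hypothetical complexity weights per task category.
TASK_WEIGHTS = {
    "document_summary": Decimal("1"),
    "contract_review": Decimal("3"),
    "deep_research": Decimal("8"),
}


def allocate_by_activity(total_cost: Decimal, tasks: list) -> dict:
    """Split total_cost by weighted task units. `tasks` is a list of (client, task_type)."""
    weighted = {}
    for client, task_type in tasks:
        weighted[client] = weighted.get(client, Decimal("0")) + TASK_WEIGHTS[task_type]
    total_weight = sum(weighted.values())
    return {client: total_cost * w / total_weight for client, w in weighted.items()}


tasks = [
    ("Client A", "deep_research"),
    ("Client A", "document_summary"),
    ("Client B", "contract_review"),
]
allocation = allocate_by_activity(Decimal("120"), tasks)
print(allocation)
```

A split by task count alone would charge Client A two thirds of the £120 (£80). Weighting by complexity raises that to £90, reflecting the heavier deep research task.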

Handling Shared Agent Costs

Some agents serve the entire firm — internal knowledge bases, firm-wide research tools, training environments. These costs should sit in a separate overhead category, not allocated to any single client. Treat them as you would shared office infrastructure.

How to Control AI Agent Costs with Budgets and Guardrails

AI agents operate differently from human workers in one critical respect: they have no natural spending limit. A human consultant might spend thirty minutes on a research task and move on. An agent will keep researching until it exhausts its context window or hits a token limit — whichever comes first.

Budget Structures

Per-project budgets: Fixed AI spend allocation within a project scope
Per-client budgets: Monthly or quarterly caps on total AI spend per client account
Per-agent budgets: Limits on individual agent types (e.g., research agents capped at £500/month)

Guardrails Against Cost Spirals

Consumption-based pricing makes AI costs inherently unpredictable. According to 0g.ai’s market infrastructure report, inference costs consume 60–80% of total AI operating expenses. Agentic reasoning, which requires five to twenty inferences per request, amplifies this further.

Three guardrails help:

Loop detection — Automatically halt agents caught in recursive cycles where Agent A keeps calling Agent B
Token limits per request — Cap the maximum tokens any single agent invocation can consume
Model routing — Use less expensive models for routine tasks and reserve expensive reasoning models for complex work
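The three guardrails can be sketched as small, independent checks. Everything here is illustrative: the limits, model names, and task categories are assumptions, and the loop detector is a simple repeat counter, not full cycle detection across long agent chains.

```python
from collections import Counter

MAX_REPEAT_CALLS = 3             # hypothetical limit on identical agent-to-agent calls
MAX_TOKENS_PER_REQUEST = 20_000  # hypothetical per-request token cap
COMPLEX_TASKS = {"deep_research", "code_generation"}


class LoopDetected(RuntimeError):
    """Raised when one agent keeps calling another beyond the allowed limit."""


class CallGuard:
    def __init__(self, max_repeats: int = MAX_REPEAT_CALLS):
        self.max_repeats = max_repeats
        self.edges = Counter()  # (caller, callee) -> number of calls in this task

    def record(self, caller: str, callee: str) -> None:
        self.edges[(caller, callee)] += 1
        if self.edges[(caller, callee)] > self.max_repeats:
            raise LoopDetected(f"{caller} called {callee} more than {self.max_repeats} times")


def clamp_tokens(requested: int) -> int:
    """Cap the token allowance for a single invocation."""
    return min(requested, MAX_TOKENS_PER_REQUEST)


def route_model(task_type: str) -> str:
    """Send complex work to the expensive model and everything else to the cheap one."""
    return "reasoning-large" if task_type in COMPLEX_TASKS else "standard-small"


guard = CallGuard()
halted = False
try:
    for _ in range(5):
        guard.record("agent-a", "agent-b")
except LoopDetected:
    halted = True  # the fourth call exceeds the limit and the task is stopped

print(halted, clamp_tokens(50_000), route_model("document_summary"))
```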

The goal is not to restrict agents but to create financial boundaries that substitute for the judgment humans naturally apply to their own work.

How to Report AI Costs and Bill Clients

Cost tracking only matters if it connects to billing and client reporting. Firms need to translate raw cost data into formats that finance teams can process and clients can understand.

Integrating with PSA and Billing Systems

AI agent cost data needs to flow into the same systems that handle human time entries. Most professional services firms use PSA platforms for time tracking, project management, and invoicing. AI agent costs should appear alongside human time entries — either as separate line items or as components of blended project costs.

The integration approach depends on your PSA platform. API-based integration is ideal: cost events stream from your agent framework into your PSA in real time. Batch import (daily or weekly CSV uploads) works as a starting point.
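For the batch-import starting point, a daily export might look like the sketch below. The column names are hypothetical; the real layout depends on what your PSA platform's import expects.

```python
import csv
import io

# Hypothetical import columns; match these to your PSA platform's template.
FIELDS = ["date", "client", "project", "task", "agent", "cost_gbp"]


def to_csv(rows: list) -> str:
    """Render cost rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


rows = [
    {"date": "2026-01-15", "client": "client-a", "project": "restructuring",
     "task": "doc-review", "agent": "research-agent", "cost_gbp": "0.12"},
    {"date": "2026-01-15", "client": "client-b", "project": "compliance-review",
     "task": "doc-review", "agent": "research-agent", "cost_gbp": "0.08"},
]
csv_text = to_csv(rows)
print(csv_text)
```

Writing to a string buffer keeps the sketch self-contained; in practice the same writer would target a file handed to the PSA import job.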

Keito closes this gap natively: agent sessions land as source=agent time entries next to human hours, LLM token costs are logged as expenses, and invoices group AI agent work automatically — with Xero and QuickBooks sync on the accounting side.

Client-Facing Cost Reports

Transparency builds trust. When reporting AI costs to clients, include:

What the agent did (task description in plain language)
How long it took (elapsed time and compute time)
What it cost (broken down by component if the client wants detail)
What it produced (deliverables, outputs, actions taken)

The level of detail depends on client expectations and the billing model. Time-based billing requires detailed activity logs. Outcome-based billing requires less granularity but more emphasis on results.

AI Agent Cost Tracking by Industry

Different professional services sectors face different cost tracking challenges.

Law Firms

Law firms must track AI costs at the matter level: every pound spent on AI must be attributable to a specific legal matter. Regulatory requirements in many jurisdictions demand detailed activity records. Research agent costs for case preparation, document review agent costs for discovery, and drafting agent costs for contract work all need separate tracking. The EU AI Act adds record-keeping and logging obligations for AI systems it classes as high-risk, so firms should check whether any of their legal AI use cases fall into that category.

Consulting Firms

Consulting firms track at the project level. Deep research agents used for strategy work can generate significant token costs — a single competitive analysis might consume 200,000 tokens across multiple sources. Cost tracking must distinguish between research that directly benefits a client and research that builds the firm’s internal knowledge base.

Accounting and Audit Firms

Reconciliation agents and document processing agents are the primary cost drivers. These firms need audit trails that satisfy both internal quality standards and external regulatory requirements. Cost records should be immutable and retained for as long as the applicable rules require, typically five to seven years.

Marketing, Creative, and Digital Agencies

Agencies track at the client account and campaign level. Campaign agents, content generation agents, and analytics agents each generate costs that need attribution to specific client accounts. The challenge is deliverable-level tracking: how much did the AI cost for this specific ad campaign versus that specific brand strategy?

IT Consultancies and MSPs

Coding agents are the dominant cost driver. Firms need to track agent costs per development task — per feature, per bug fix, per code review — and attribute them to the correct client contract. Infrastructure monitoring agents add a second cost layer. For local AI coding sessions, the Keito Agent Skill for Claude Code and Codex logs each session to the right client project per repository, and the Keito GitHub Action creates billable time entries from merged PRs, reviews, and issues.

Key Takeaway

AI agent cost tracking turns AI from uncontrolled overhead into a billable, manageable resource. Track every cost component, attribute it to clients, and connect it to billing.

Frequently Asked Questions

What is AI agent cost tracking?

AI agent cost tracking is the practice of monitoring every cost generated by AI agents — tokens, inference, tool calls, and compute — and attributing those costs to specific clients, projects, and tasks. It gives professional services firms the financial visibility needed for accurate billing and budget control.

How much do AI agents cost to run per month?

A mid-tier AI agent handling 5,000 tasks per month typically costs £2,600–£10,400. This includes token costs (30–40% of total), inference (15–20%), embeddings (8–18%), tool calls (10–15%), fine-tuning (10–15%), and monitoring (3–5%). Per-task costs average £0.08–£0.25 for routine work and £1–£4 for complex tasks.

How do professional services firms track AI agent costs?

Firms instrument their agent frameworks to capture cost data at the API call level. Each request is tagged with client, project, and task identifiers. Cost events stream into dashboards and PSA systems where they can be reported, budgeted, and billed.

What are the main cost components of running AI agents?

The six main components are: token/API costs (30–40%), compute and inference (15–20%), embeddings and vector databases (8–18%), tool and API call fees (10–15%), fine-tuning (10–15%), and monitoring (3–5%). Most firms only track token costs, missing 60–70% of the total.

How do you attribute AI agent costs to specific clients?

Three methods exist: direct attribution (tagging every agent action with a client code at invocation), proportional allocation (distributing total costs by usage volume), and activity-based costing (allocating by task type and complexity). Direct attribution is most accurate; proportional allocation is simplest to implement.

Can AI agent costs be included in client billing?

Yes. AI agent costs can appear as separate line items on invoices, be blended into hourly rates, or be included as a cost-plus fee (agent cost plus margin). The approach depends on the billing model and the client relationship. Transparency is key — clients should understand what they are paying for.

What tools are available for tracking AI agent costs?

Options range from LLM observability platforms (for trace-level cost tracking) to general monitoring tools (with AI cost plugins) to purpose-built platforms like Keito that track both human time and AI agent costs in a single system. The right choice depends on firm size, agent complexity, and integration requirements.