The problem policies solve
When agents make LLM calls, costs accumulate in the background. Without limits, a single runaway agent or a misconfigured loop can burn through a month’s budget in hours. Budget policies let you set hard caps and choose what happens when those caps are hit.
Policies are evaluated at authorization time, before any LLM call is made. If an agent is over budget, the authorization check fails before your code even runs.
Policies stack. An agent can have a per-agent policy, a per-user policy, and a per-tenant policy all active at once. KavachOS evaluates all of them and returns the first one that is exceeded.
Data model
Stable identifier with a pol_ prefix.
Agent this policy applies to. Omit to create a global policy that applies to all agents.
User this policy applies to. Omit to apply regardless of owner.
Tenant this policy applies to.
The numeric thresholds for this policy.
What happens when a limit is exceeded.
Current policy state. ‘triggered’ means a limit has been hit.
Running counters for this policy.
BudgetLimits
Maximum token cost units allowed per day. Resets at midnight UTC.
Maximum token cost units allowed per calendar month.
Maximum authorize() calls allowed per day.
Maximum authorize() calls allowed per calendar month.
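The field descriptions above can be pictured as a set of TypeScript types. This is only a sketch: the counter names (`callsToday`, `callsThisMonth`, `tokensCostToday`, `tokensCostThisMonth`), the `pol_` prefix, the action names, and the `active`/`triggered` states come from this page, while every other property name (`agentId`, `limits`, `maxCallsPerDay`, and so on) is an assumption chosen for illustration and may not match the real KavachOS types.

```typescript
// Sketch only: property names marked "assumed" are hypothetical.
type BudgetAction = "warn" | "throttle" | "block" | "revoke";
type PolicyStatus = "active" | "triggered"; // other states may exist

interface BudgetLimits {
  maxTokensCostPerDay?: number;   // assumed name; resets at midnight UTC
  maxTokensCostPerMonth?: number; // assumed name; per calendar month
  maxCallsPerDay?: number;        // assumed name; authorize() calls per day
  maxCallsPerMonth?: number;      // assumed name; authorize() calls per month
}

interface BudgetUsage {
  callsToday: number;
  callsThisMonth: number;
  tokensCostToday: number;
  tokensCostThisMonth: number;
}

interface BudgetPolicy {
  id: string;          // stable identifier with a pol_ prefix
  agentId?: string;    // assumed name; omit for a global policy
  userId?: string;     // assumed name; omit to apply regardless of owner
  tenantId?: string;   // assumed name
  limits: BudgetLimits;
  action: BudgetAction;
  status: PolicyStatus;
  usage: BudgetUsage;
}

// A global policy (no agentId) that blocks after 1000 calls in a day.
const examplePolicy: BudgetPolicy = {
  id: "pol_example",
  limits: { maxCallsPerDay: 1000 },
  action: "block",
  status: "active",
  usage: { callsToday: 0, callsThisMonth: 0, tokensCostToday: 0, tokensCostThisMonth: 0 },
};
```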
Actions
| Action | What happens |
|---|---|
| warn | Authorization still succeeds. The policy is marked as triggered. |
| throttle | Authorization fails. The agent must wait for the reset cycle. |
| block | Authorization fails. Stays blocked until the limit resets or you reset manually. |
| revoke | Authorization fails. The agent’s token is revoked immediately. |
warn is useful for sending alerts before you start blocking. Set a warn policy at 80% of your limit and a block policy at 100%.
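One way to read the action table is as a lookup from action to outcome. The sketch below encodes only what the table states; the `ActionOutcome` shape and the `outcomeFor` helper are hypothetical names, not part of the KavachOS API.

```typescript
// Sketch only: ActionOutcome and outcomeFor are illustrative names.
type BudgetAction = "warn" | "throttle" | "block" | "revoke";

interface ActionOutcome {
  authorized: boolean;   // does the authorization check still succeed?
  revokesToken: boolean; // is the agent's token revoked?
}

const OUTCOMES: Record<BudgetAction, ActionOutcome> = {
  warn: { authorized: true, revokesToken: false },
  throttle: { authorized: false, revokesToken: false },
  block: { authorized: false, revokesToken: false },
  revoke: { authorized: false, revokesToken: true },
};

// Returns what happens to a request once a policy with this action is exceeded.
function outcomeFor(action: BudgetAction): ActionOutcome {
  return OUTCOMES[action];
}
```

Only `warn` lets the request through, which is why it pairs well with a stricter policy at a higher threshold.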
Creating a policy
Checking a budget before a call
Call checkBudget with a speculative tokensCost to see whether this call would exceed any policy. The cost is included in the check but not recorded yet.
checkBudget evaluates all active policies for the agent (both exact-match and global). If any policy is exceeded, it returns the first violation and stops.
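The evaluation described above can be sketched as a pure function. This is not the KavachOS implementation: `checkBudgetSketch`, the `Policy` shape, and the limit property names are assumptions, a limit counts as exceeded only when the projected total is strictly greater than it, and filtering by policy status is left out for brevity.

```typescript
// Sketch only: all names except the four usage counters are hypothetical.
interface Limits {
  maxTokensCostPerDay?: number;
  maxTokensCostPerMonth?: number;
  maxCallsPerDay?: number;
  maxCallsPerMonth?: number;
}
interface Usage {
  callsToday: number;
  callsThisMonth: number;
  tokensCostToday: number;
  tokensCostThisMonth: number;
}
interface Policy {
  id: string;
  agentId?: string; // undefined means the policy is global
  limits: Limits;
  usage: Usage;
}
interface BudgetCheck {
  allowed: boolean;
  policyId?: string;
  reason?: string;
}

// A limit is exceeded when it is set and the projected total is above it.
function over(limit: number | undefined, projected: number): boolean {
  return limit !== undefined && projected > limit;
}

function checkBudgetSketch(policies: Policy[], agentId: string, tokensCost: number): BudgetCheck {
  for (const p of policies) {
    // Consider exact-match and global policies; skip other agents' policies.
    if (p.agentId !== undefined && p.agentId !== agentId) continue;
    const { limits, usage } = p;
    // The speculative cost and one extra call are added to the projection only.
    const checks: Array<[string, boolean]> = [
      ["daily token cost", over(limits.maxTokensCostPerDay, usage.tokensCostToday + tokensCost)],
      ["monthly token cost", over(limits.maxTokensCostPerMonth, usage.tokensCostThisMonth + tokensCost)],
      ["daily calls", over(limits.maxCallsPerDay, usage.callsToday + 1)],
      ["monthly calls", over(limits.maxCallsPerMonth, usage.callsThisMonth + 1)],
    ];
    for (const [reason, exceeded] of checks) {
      // Stop at the first violation, as described above.
      if (exceeded) return { allowed: false, policyId: p.id, reason };
    }
  }
  return { allowed: true };
}
```

Because the function never writes to `usage`, calling it repeatedly with different speculative costs has no side effects.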
Recording usage after a call
Call recordUsage after the LLM call completes to update the counters.
recordUsage increments callsToday, callsThisMonth, tokensCostToday, and tokensCostThisMonth on every active policy that applies to this agent. It also transitions any policy from active to triggered if the new totals cross a threshold.
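The counter update and state transition can be sketched as follows. The function name `recordUsageSketch`, the `Policy` shape, and the limit property names are hypothetical; the sketch also assumes a threshold counts as crossed once a total reaches the limit, which this page does not spell out.

```typescript
// Sketch only: all names except the four usage counters are hypothetical.
type Status = "active" | "triggered";
interface Limits {
  maxTokensCostPerDay?: number;
  maxTokensCostPerMonth?: number;
  maxCallsPerDay?: number;
  maxCallsPerMonth?: number;
}
interface Usage {
  callsToday: number;
  callsThisMonth: number;
  tokensCostToday: number;
  tokensCostThisMonth: number;
}
interface Policy {
  id: string;
  agentId?: string; // undefined means the policy is global
  status: Status;
  limits: Limits;
  usage: Usage;
}

// Assumption: a threshold is crossed when the total reaches or passes it.
function reached(limit: number | undefined, total: number): boolean {
  return limit !== undefined && total >= limit;
}

function recordUsageSketch(policies: Policy[], agentId: string, tokensCost: number): void {
  for (const p of policies) {
    if (p.agentId !== undefined && p.agentId !== agentId) continue;
    // Every applicable policy gets the same increments.
    p.usage.callsToday += 1;
    p.usage.callsThisMonth += 1;
    p.usage.tokensCostToday += tokensCost;
    p.usage.tokensCostThisMonth += tokensCost;
    const crossed =
      reached(p.limits.maxCallsPerDay, p.usage.callsToday) ||
      reached(p.limits.maxCallsPerMonth, p.usage.callsThisMonth) ||
      reached(p.limits.maxTokensCostPerDay, p.usage.tokensCostToday) ||
      reached(p.limits.maxTokensCostPerMonth, p.usage.tokensCostThisMonth);
    if (p.status === "active" && crossed) p.status = "triggered";
  }
}
```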
Resetting counters
Reset daily counters on a UTC midnight cron job. Policies in the triggered state are automatically moved back to active if the new totals are within limits.
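A minimal sketch of the daily reset logic, with scheduling left to whatever cron mechanism you use. `resetDailySketch`, the `Policy` shape, and the limit property names are hypothetical, and "within limits" is taken here to mean every total is strictly below its limit.

```typescript
// Sketch only: all names except the four usage counters are hypothetical.
type Status = "active" | "triggered";
interface Limits {
  maxTokensCostPerDay?: number;
  maxTokensCostPerMonth?: number;
  maxCallsPerDay?: number;
  maxCallsPerMonth?: number;
}
interface Usage {
  callsToday: number;
  callsThisMonth: number;
  tokensCostToday: number;
  tokensCostThisMonth: number;
}
interface Policy {
  id: string;
  status: Status;
  limits: Limits;
  usage: Usage;
}

// Assumption: a total is within its limit when no limit is set or it is below it.
function within(limit: number | undefined, total: number): boolean {
  return limit === undefined || total < limit;
}

function resetDailySketch(policies: Policy[]): void {
  for (const p of policies) {
    // Only the daily counters are cleared; monthly counters keep running.
    p.usage.callsToday = 0;
    p.usage.tokensCostToday = 0;
    const ok =
      within(p.limits.maxCallsPerDay, p.usage.callsToday) &&
      within(p.limits.maxTokensCostPerDay, p.usage.tokensCostToday) &&
      within(p.limits.maxCallsPerMonth, p.usage.callsThisMonth) &&
      within(p.limits.maxTokensCostPerMonth, p.usage.tokensCostThisMonth);
    // A triggered policy returns to active only if nothing is still over.
    if (p.status === "triggered" && ok) p.status = "active";
  }
}
```

A policy that was triggered by a monthly limit stays triggered after the daily reset, because its monthly totals are unchanged.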
Listing and updating policies
Combining warn and block
A common pattern: warn at a soft limit, block at the hard limit. When the soft limit is reached, the warn policy triggers and you can send an alert (via a lifecycle hook). At 1000, the block policy triggers and requests stop.
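The interaction of the two policies can be sketched with a small decision function. Everything here is illustrative: `decide` and `CallPolicy` are hypothetical names, the hard limit of 1000 is treated as a daily call count (the unit is not stated here), and the soft limit of 800 is an assumption derived from the 80% guidance in the Actions section.

```typescript
// Sketch only: names and the 800 soft limit are assumptions for illustration.
type Action = "warn" | "block";

interface CallPolicy {
  id: string;
  action: Action;
  maxCallsPerDay: number;
}

interface Decision {
  authorized: boolean;
  triggered: string[]; // ids of policies whose limit has been reached
}

const policies: CallPolicy[] = [
  { id: "pol_soft", action: "warn", maxCallsPerDay: 800 },
  { id: "pol_hard", action: "block", maxCallsPerDay: 1000 },
];

function decide(callsToday: number, stacked: CallPolicy[]): Decision {
  const triggered: string[] = [];
  let authorized = true;
  for (const p of stacked) {
    if (callsToday >= p.maxCallsPerDay) {
      triggered.push(p.id);
      // warn only marks the policy; block also denies the request.
      if (p.action === "block") authorized = false;
    }
  }
  return { authorized, triggered };
}
```

Between the two thresholds the request is still authorized but `triggered` contains the warn policy, which is the window in which an alert can be sent.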
Next steps
Lifecycle hooks
Fire custom logic when a policy is triggered.
Cost tracking
Aggregate token costs for billing reports.
Multi-tenant isolation
Attach policies to entire tenants.