Overview
KavachOS applies rate limits at two layers. Auth endpoints get IP-based limits to protect against brute force and credential stuffing, and the permission engine enforces per-agent call limits via the maxCallsPerHour constraint.
Both layers return a standard 429 response and set Retry-After when applicable.
Built-in auth endpoint limits
These limits apply automatically with no configuration required.

| Endpoint | Limit | Window |
|---|---|---|
| POST /sign-in | 10 requests | per IP per minute |
| POST /sign-up | 5 requests | per IP per minute |
| POST /magic-link | 3 requests | per IP per minute |
| POST /email-otp | 5 requests | per IP per minute |
| POST /totp/verify | 10 requests | per IP per minute |
| POST /token (OAuth) | 20 requests | per IP per minute |
Limits are tracked in-process by default. For multi-instance deployments, configure a Redis store so all instances share the same counters.
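To illustrate why in-process tracking does not carry across instances, here is a minimal sketch of a fixed-window counter keyed by IP. It is not KavachOS's internal implementation: the class name `FixedWindowCounter`, its `hit` method, and the fixed-window strategy itself are assumptions made for illustration.

```typescript
// Hypothetical sketch of an in-process fixed-window counter; not KavachOS internals.
interface WindowEntry {
  count: number;
  resetAt: number; // epoch milliseconds when the current window ends
}

class FixedWindowCounter {
  private readonly entries = new Map<string, WindowEntry>();
  private readonly max: number;
  private readonly windowSeconds: number;

  constructor(max: number, windowSeconds: number) {
    this.max = max;
    this.windowSeconds = windowSeconds;
  }

  // Records one request for `key` and reports whether it is within the limit.
  hit(key: string, now: number = Date.now()): boolean {
    const entry = this.entries.get(key);
    if (entry === undefined || now >= entry.resetAt) {
      // First request for this key, or the previous window expired: start a new window.
      this.entries.set(key, { count: 1, resetAt: now + this.windowSeconds * 1000 });
      return true;
    }
    entry.count += 1;
    return entry.count <= this.max;
  }
}

// Mirrors the POST /magic-link row: 3 requests per IP per minute.
const magicLink = new FixedWindowCounter(3, 60);
const ip = "203.0.113.7";
const results = [0, 1, 2, 3].map((i) => magicLink.hit(ip, 1_000 + i));
// The first three requests pass and the fourth is rejected.
```

Because the `Map` lives in one process, two instances behind a load balancer would each allow three requests from the same IP. A shared store such as Redis removes that multiplication.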
Configuring auth limits
Pass a rateLimit block to createKavach to override the defaults or enable Redis:
Where to persist counters. Use redis in production when running multiple instances.
Redis connection string. Required when store is redis.
Override for the sign-in endpoint. window is in seconds.
Override for the sign-up endpoint.
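The rule that a Redis connection string is required when store is redis can be checked before the options are passed on. The sketch below is not the KavachOS option schema. Only `store`, the `redis` value, and `window` (in seconds) come from the descriptions above. The field names `redisUrl`, `signIn`, `signUp`, and `max`, the `"memory"` value, and the `validateRateLimitOptions` helper are all hypothetical.

```typescript
// Hypothetical option shape, for illustration only.
type Store = "memory" | "redis"; // "memory" is an assumed name for the in-process default

interface EndpointLimit {
  max: number;    // hypothetical: requests allowed per window
  window: number; // window length in seconds
}

interface RateLimitOptions {
  store: Store;
  redisUrl?: string;      // hypothetical name for the Redis connection string
  signIn?: EndpointLimit; // hypothetical name for the sign-in override
  signUp?: EndpointLimit; // hypothetical name for the sign-up override
}

// Returns a list of human-readable problems; an empty list means the options look valid.
function validateRateLimitOptions(options: RateLimitOptions): string[] {
  const problems: string[] = [];
  if (options.store === "redis" && !options.redisUrl) {
    problems.push('redisUrl is required when store is "redis"');
  }
  const overrides: Array<[string, EndpointLimit | undefined]> = [
    ["signIn", options.signIn],
    ["signUp", options.signUp],
  ];
  for (const [name, limit] of overrides) {
    if (limit === undefined) continue;
    if (!Number.isInteger(limit.max) || limit.max < 1) {
      problems.push(`${name}.max must be a positive integer`);
    }
    if (!(limit.window > 0)) {
      problems.push(`${name}.window must be a positive number of seconds`);
    }
  }
  return problems;
}

const missingRedis = validateRateLimitOptions({ store: "redis" });
const valid = validateRateLimitOptions({
  store: "redis",
  redisUrl: "redis://localhost:6379",
  signIn: { max: 5, window: 60 },
});
```

Validating at startup surfaces a missing connection string immediately, instead of at the first rate-limited request.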
Per-agent limits with maxCallsPerHour
The permission engine supports a maxCallsPerHour constraint. When an agent exceeds its hourly call budget, the permission check returns allowed: false with reason: "Rate limit exceeded".
createRateLimiter instead (see below).
Checking the limit in your code
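The following sketch shows how calling code might branch on a check result that carries `allowed` and `reason`. It does not call the KavachOS permission engine: `HourlyBudget`, its `check` method, and `handleToolCall` are hypothetical stand-ins, and the sliding one-hour window is an assumption about how the budget is counted.

```typescript
// Hypothetical types; only `allowed`, `reason`, and `maxCallsPerHour` come from the text above.
interface CheckResult {
  allowed: boolean;
  reason?: string;
}

const HOUR_MS = 60 * 60 * 1000;

class HourlyBudget {
  // agentId -> timestamps (ms) of calls that were allowed
  private readonly calls = new Map<string, number[]>();

  check(agentId: string, maxCallsPerHour: number, now: number = Date.now()): CheckResult {
    // Keep only the calls made within the last hour (sliding window).
    const recent = (this.calls.get(agentId) ?? []).filter((t) => now - t < HOUR_MS);
    if (recent.length >= maxCallsPerHour) {
      this.calls.set(agentId, recent);
      return { allowed: false, reason: "Rate limit exceeded" };
    }
    recent.push(now);
    this.calls.set(agentId, recent);
    return { allowed: true };
  }
}

// Branch on the result instead of assuming the call went through.
function handleToolCall(budget: HourlyBudget, agentId: string, now: number): string {
  const result = budget.check(agentId, 2, now);
  return result.allowed ? "ran tool" : `denied: ${result.reason}`;
}

const budget = new HourlyBudget();
const outcomes = [0, 1, 2].map((t) => handleToolCall(budget, "agent-1", t));
// The third call is over the budget of 2 and is denied.
```

Checking `reason` lets the caller distinguish a rate limit rejection from other kinds of denial, so it can back off and retry later.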
createRateLimiter
Use createRateLimiter for custom rate limiting on your own endpoints, such as an expensive AI inference route that should be capped per user.
Maximum number of requests allowed within the window.
Time window in seconds.
Derives the rate limit key from the request. Defaults to the client IP.
Counter storage backend.
Redis connection string. Required when store is redis.
withRateLimit middleware
Wrap any handler with withRateLimit to apply a limiter without modifying the handler itself.
When a request exceeds the limit, withRateLimit returns a 429 response automatically and does not call the wrapped handler.
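The wrapping pattern can be sketched as a higher-order function that consults a limiter first and short-circuits with a 429 when the limit is exceeded. This is an illustration of the pattern, not the withRateLimit signature: `wrapWithLimit`, the `SimpleRequest` and `SimpleResponse` types, the `Limiter` type, and the placeholder response body are all hypothetical, and handlers are synchronous here for brevity.

```typescript
// Hypothetical minimal request/response types, for illustration only.
interface SimpleRequest {
  ip: string;
  path: string;
}

interface SimpleResponse {
  status: number;
  headers: Record<string, string>;
  body: string;
}

type Handler = (req: SimpleRequest) => SimpleResponse;

interface LimitDecision {
  allowed: boolean;
  retryAfterSeconds: number;
}

type Limiter = (key: string) => LimitDecision;

// Wraps `handler` so the limiter is consulted first; the handler runs only when allowed.
function wrapWithLimit(
  limiter: Limiter,
  handler: Handler,
  keyOf: (req: SimpleRequest) => string = (req) => req.ip,
): Handler {
  return (req) => {
    const decision = limiter(keyOf(req));
    if (!decision.allowed) {
      return {
        status: 429,
        headers: { "Retry-After": String(decision.retryAfterSeconds) },
        body: "Too Many Requests", // placeholder body, not the KavachOS response shape
      };
    }
    return handler(req);
  };
}

// Toy limiter that allows a single call per key.
const seen = new Set<string>();
const oncePerKey: Limiter = (key) => {
  if (seen.has(key)) return { allowed: false, retryAfterSeconds: 30 };
  seen.add(key);
  return { allowed: true, retryAfterSeconds: 0 };
};

let handlerCalls = 0;
const inference: Handler = () => {
  handlerCalls += 1;
  return { status: 200, headers: {}, body: "ok" };
};

const limited = wrapWithLimit(oncePerKey, inference);
const first = limited({ ip: "198.51.100.4", path: "/infer" });
const second = limited({ ip: "198.51.100.4", path: "/infer" });
```

Keeping the limit check outside the handler means the expensive work never starts for a rejected request, and the same handler can be reused with different limiters.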
429 response format
All rate limit rejections, whether from auth endpoints, the permission engine, or withRateLimit, return the same shape:
The Retry-After header is also set to the number of seconds remaining in the current window.
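As a worked example of the Retry-After arithmetic, the sketch below computes the seconds remaining in a window on the server side and turns a received header value into a client-side delay. Both helpers, `retryAfterSeconds` and `delayFromRetryAfter`, are hypothetical names. Rounding up and the one-second fallback are assumptions, and the client helper handles only the seconds form of the header.

```typescript
// Server side: whole seconds until the window resets, rounded up so clients do not retry early.
function retryAfterSeconds(windowResetAtMs: number, nowMs: number = Date.now()): number {
  const remainingMs = Math.max(0, windowResetAtMs - nowMs);
  return Math.ceil(remainingMs / 1000);
}

// Client side: convert a Retry-After header value (in seconds) into a delay in milliseconds.
// Falls back to `fallbackMs` when the header is missing or not a non-negative number.
function delayFromRetryAfter(headerValue: string | null, fallbackMs: number = 1000): number {
  if (headerValue === null || headerValue.trim() === "") return fallbackMs;
  const seconds = Number(headerValue);
  if (!Number.isFinite(seconds) || seconds < 0) return fallbackMs;
  return seconds * 1000;
}

// A window that resets at t = 61.5 s, observed at t = 1 s, has 60.5 s left.
const headerSeconds = retryAfterSeconds(61_500, 1_000);
const clientDelayMs = delayFromRetryAfter(String(headerSeconds));
```

A client that waits for the advertised delay before retrying avoids spending further requests against a window that has not yet reset.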
Next steps
Permissions
Add maxCallsPerHour and other constraints to individual permissions.
Approval flows
Pair rate limits with human-in-the-loop approval for sensitive actions.
Anomaly detection
Use anomaly scoring alongside rate limits for behavioural signals.