Overview

KavachOS applies rate limits at two layers: auth endpoints get IP-based limits to protect against brute force and credential stuffing, and the permission engine enforces per-agent call limits via the maxCallsPerHour constraint. Both layers return a standard 429 response and set Retry-After when applicable.

Built-in auth endpoint limits

These limits apply automatically with no configuration required.
| Endpoint | Limit | Window |
| --- | --- | --- |
| POST /sign-in | 10 requests | per IP per minute |
| POST /sign-up | 5 requests | per IP per minute |
| POST /magic-link | 3 requests | per IP per minute |
| POST /email-otp | 5 requests | per IP per minute |
| POST /totp/verify | 10 requests | per IP per minute |
| POST /token (OAuth) | 20 requests | per IP per minute |
Limits are tracked in-process by default. For multi-instance deployments, configure a Redis store so all instances share the same counters.
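To make the difference between in-process and shared counters concrete, here is a minimal sketch of a fixed-window counter held in memory. It is an illustration only and not KavachOS's implementation: the `FixedWindowCounter` class, its `check` method, and the `Decision` type are hypothetical names. Because the `Map` lives in a single process, two instances behind a load balancer would each allow the full limit, which is the problem a shared Redis store avoids.

```typescript
// Hypothetical sketch of a fixed-window, in-process counter (not KavachOS internals).
type Decision = { allowed: boolean; remaining: number; retryAfter: number };

class FixedWindowCounter {
  // key -> start of that key's current window (ms) and hits counted in it.
  // Entries are never evicted here; a real store would expire old keys.
  private readonly hits = new Map<string, { windowStart: number; count: number }>();
  private readonly limit: number;
  private readonly windowSeconds: number;

  constructor(limit: number, windowSeconds: number) {
    this.limit = limit;
    this.windowSeconds = windowSeconds;
  }

  check(key: string, nowMs: number = Date.now()): Decision {
    const windowMs = this.windowSeconds * 1000;
    // Windows are aligned to multiples of the window length.
    const windowStart = Math.floor(nowMs / windowMs) * windowMs;

    let entry = this.hits.get(key);
    if (!entry || entry.windowStart !== windowStart) {
      entry = { windowStart, count: 0 };
      this.hits.set(key, entry);
    }

    // Seconds left until the current window ends.
    const retryAfter = Math.ceil((windowStart + windowMs - nowMs) / 1000);
    if (entry.count >= this.limit) {
      return { allowed: false, remaining: 0, retryAfter };
    }
    entry.count += 1;
    return { allowed: true, remaining: this.limit - entry.count, retryAfter };
  }
}
```

Keying by client IP with `new FixedWindowCounter(10, 60)` would mirror the shape of the sign-in default above, under the assumption that the built-in limits use fixed windows.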

Configuring auth limits

Pass a rateLimit block to createKavach to override the defaults or enable Redis:
import { createKavach } from 'kavachos';

const kavach = await createKavach({
  database: { provider: 'postgres', url: process.env.DATABASE_URL },
  rateLimit: {
    store: 'memory',        // or 'redis' for multi-instance deployments
    redisUrl: process.env.REDIS_URL,  // required when store is 'redis'
    endpoints: {
      signIn: { limit: 5, window: 60 },    // 5 per minute (stricter)
      signUp: { limit: 2, window: 60 },
      magicLink: { limit: 2, window: 60 },
    },
  },
});
store
"memory" | "redis"
default: "memory"
Where to persist counters. Use redis in production when running multiple instances.
redisUrl
string | undefined
Redis connection string. Required when store is redis.
endpoints.signIn
{ limit: number; window: number }
Override for the sign-in endpoint. window is in seconds.
endpoints.signUp
{ limit: number; window: number }
Override for the sign-up endpoint. Other endpoint overrides, such as magicLink in the example above, take the same shape.
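Since redisUrl is required whenever store is redis, it can help to build the rateLimit block in one place and fail fast on a missing URL. The sketch below is one way to do that. The `buildRateLimitConfig` helper and the local `RateLimitConfig` type are hypothetical; they only mirror the fields documented above and are not exported by kavachos.

```typescript
// Local types modelled on the documented options (assumed, not imported from kavachos).
type EndpointLimit = { limit: number; window: number }; // window is in seconds
type RateLimitConfig = {
  store: 'memory' | 'redis';
  redisUrl?: string;
  endpoints?: Record<string, EndpointLimit>;
};

// Hypothetical helper: choose the store from environment variables.
function buildRateLimitConfig(env: Record<string, string | undefined>): RateLimitConfig {
  const endpoints: Record<string, EndpointLimit> = {
    signIn: { limit: 5, window: 60 },
    signUp: { limit: 2, window: 60 },
  };

  const redisUrl = env.REDIS_URL;
  if (redisUrl) {
    return { store: 'redis', redisUrl, endpoints };
  }
  // Refuse to fall back to per-process counters in production.
  if (env.NODE_ENV === 'production') {
    throw new Error('REDIS_URL must be set when running in production');
  }
  return { store: 'memory', endpoints };
}
```

The result would then be passed as `rateLimit: buildRateLimitConfig(process.env)` in the createKavach call.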

Per-agent limits with maxCallsPerHour

The permission engine supports a maxCallsPerHour constraint. When an agent exceeds its hourly call budget, the permission check returns allowed: false with reason: "Rate limit exceeded".
const agent = await kavach.agent.create({
  ownerId: 'user-123',
  name: 'data-sync-bot',
  type: 'autonomous',
  permissions: [
    {
      resource: 'db:reports:*',
      actions: ['read'],
      constraints: {
        maxCallsPerHour: 100,
      },
    },
    {
      resource: 'db:reports:*',
      actions: ['export'],
      constraints: {
        maxCallsPerHour: 10,  // stricter for expensive operations
      },
    },
  ],
});
The counter resets at the top of each clock hour. If you need sliding windows, use createRateLimiter instead (see below).

Checking the limit in your code

const result = await kavach.authorize(agent.id, {
  action: 'read',
  resource: 'db:reports:monthly',
});

if (!result.allowed) {
  if (result.reason === 'Rate limit exceeded') {
    // Return 429 to the calling agent
    return new Response('Too many requests', {
      status: 429,
      headers: { 'Retry-After': '3600' },
    });
  }
}
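Because the maxCallsPerHour counter resets at the top of the clock hour, a fixed Retry-After of 3600 is an upper bound rather than the actual wait. A tighter value can be computed from the current time, as in the sketch below. The `secondsUntilNextHour` helper is hypothetical, and it assumes hour boundaries aligned to UTC, which matches local clock hours in any zone with a whole-hour offset.

```typescript
// Hypothetical helper: seconds remaining until the next top-of-hour boundary.
// Assumes boundaries aligned to UTC (epoch time divisible by one hour).
function secondsUntilNextHour(now: Date = new Date()): number {
  const HOUR_MS = 3_600_000;
  const elapsedMs = now.getTime() % HOUR_MS;
  // Round up so a caller never retries slightly too early.
  return Math.ceil((HOUR_MS - elapsedMs) / 1000);
}
```

In the handler above, the header would then become `'Retry-After': String(secondsUntilNextHour())`.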

createRateLimiter

Use createRateLimiter for custom rate limiting on your own endpoints, for example, an expensive AI inference route that should be capped per user.
import { createRateLimiter } from 'kavachos';

const inferenceLimit = createRateLimiter({
  limit: 20,
  window: 3600,           // 1 hour in seconds
  keyFn: (req) => req.headers.get('x-user-id') ?? req.ip ?? 'anonymous',
  store: 'redis',
  redisUrl: process.env.REDIS_URL,
});
limit
number
Maximum number of requests allowed within the window.
window
number
Time window in seconds.
keyFn
(req: Request) => string
Derives the rate limit key from the request. Defaults to the client IP.
store
"memory" | "redis"
default: "memory"
Counter storage backend.
redisUrl
string | undefined
Redis connection string. Required when store is redis.
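The choice of keyFn decides who shares a budget, so it is worth writing deliberately. The sketch below is one possible key function, under a few assumptions: the standard Fetch `Request` has no `ip` property, so the client address is read from `x-forwarded-for`, which is only trustworthy when a proxy you control sets it. The same caveat applies to `x-user-id`. The `rateLimitKey` function and `HeaderSource` type are hypothetical names.

```typescript
// Minimal shape shared by the Fetch Request and simple test doubles.
type HeaderSource = { headers: { get(name: string): string | null } };

// Hypothetical key function: prefer an authenticated user id, then the client IP.
function rateLimitKey(req: HeaderSource): string {
  const userId = req.headers.get('x-user-id')?.trim();
  if (userId) return `user:${userId}`;

  // x-forwarded-for may list several hops; the first entry is the original client.
  const forwarded = req.headers.get('x-forwarded-for');
  const clientIp = forwarded?.split(',')[0]?.trim();
  if (clientIp) return `ip:${clientIp}`;

  // All unidentified callers share one bucket.
  return 'anonymous';
}
```

Prefixing keys with `user:` or `ip:` keeps a user id from colliding with an address that happens to have the same text.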

withRateLimit middleware

Wrap any handler with withRateLimit to apply a limiter without modifying the handler itself.
import { createRateLimiter, withRateLimit } from 'kavachos';

const inferenceLimit = createRateLimiter({
  limit: 20,
  window: 3600,
  keyFn: (req) => req.headers.get('x-user-id') ?? 'anonymous',
});

// Works with any framework adapter — handler receives a standard Request
export const POST = withRateLimit(inferenceLimit, async (req: Request) => {
  const result = await runInference(await req.json());
  return Response.json(result);
});
When the limit is exceeded, withRateLimit returns a 429 response automatically and does not call the wrapped handler.
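The short-circuit behaviour described above can be pictured with a small generic wrapper. This is a conceptual sketch only, not how withRateLimit is implemented: `wrapWithLimit`, `Verdict`, and the `check` and `onLimited` callbacks are hypothetical names chosen for illustration.

```typescript
// Hypothetical illustration of a rate-limiting wrapper (not the kavachos source).
type Verdict = { allowed: boolean; retryAfter: number };

function wrapWithLimit<Req, Res>(
  check: (req: Req) => Verdict,          // decides whether this request may proceed
  handler: (req: Req) => Res,            // the wrapped handler, left unmodified
  onLimited: (verdict: Verdict) => Res,  // builds the rejection, e.g. a 429 response
): (req: Req) => Res {
  return (req) => {
    const verdict = check(req);
    // The handler runs only when the limiter allows the request.
    return verdict.allowed ? handler(req) : onLimited(verdict);
  };
}
```

With `Res` set to `Promise<Response>`, the same shape fits an async route handler.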

429 response format

All rate limit rejections, from auth endpoints, the permission engine, or withRateLimit, return the same shape:
{
  "error": {
    "code": "RATE_LIMIT_EXCEEDED",
    "message": "Too many requests. Try again after 60 seconds.",
    "details": {
      "limit": 10,
      "window": 60,
      "retryAfter": 60
    }
  }
}
The Retry-After header is also set to the number of seconds remaining in the current window.
HTTP/1.1 429 Too Many Requests
Content-Type: application/json
Retry-After: 42
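On the calling side, a client can use either the header or the body to decide how long to wait. The sketch below shows one way to do that. The type and function names are hypothetical, the body shape is taken from the JSON example above, and only the delay-in-seconds form of Retry-After is handled (the HTTP-date form is ignored).

```typescript
// Shape of the documented 429 body.
type RateLimitErrorBody = {
  error: {
    code: 'RATE_LIMIT_EXCEEDED';
    message: string;
    details: { limit: number; window: number; retryAfter: number };
  };
};

// Hypothetical type guard that checks every field named in the type.
function isRateLimitError(body: unknown): body is RateLimitErrorBody {
  if (typeof body !== 'object' || body === null) return false;
  const error = (body as { error?: unknown }).error;
  if (typeof error !== 'object' || error === null) return false;
  const { code, message, details } = error as Record<string, unknown>;
  if (code !== 'RATE_LIMIT_EXCEEDED' || typeof message !== 'string') return false;
  if (typeof details !== 'object' || details === null) return false;
  const d = details as Record<string, unknown>;
  return ['limit', 'window', 'retryAfter'].every((k) => typeof d[k] === 'number');
}

// Prefer the header (seconds left in the window), then the body, then a fallback.
function retryDelaySeconds(header: string | null, body: unknown, fallback = 60): number {
  const value = header?.trim() ?? '';
  if (/^\d+$/.test(value)) return Number(value);
  if (isRateLimitError(body)) return body.error.details.retryAfter;
  return fallback;
}
```

A caller would pass `res.headers.get('Retry-After')` and the parsed JSON body, then sleep for the returned number of seconds before retrying.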

Next steps

Permissions

Add maxCallsPerHour and other constraints to individual permissions.

Approval flows

Pair rate limits with human-in-the-loop approval for sensitive actions.

Anomaly detection

Use anomaly scoring alongside rate limits for behavioural signals.