Overview
Rotavision applies rate limits to ensure fair usage and platform stability. Limits are applied per API key and vary by plan.
Rate Limit Tiers
| Plan | Requests/min | Requests/day | Concurrent |
|---|---|---|---|
| Free | 20 | 500 | 2 |
| Starter | 60 | 5,000 | 5 |
| Growth | 600 | 50,000 | 20 |
| Enterprise | 3,000 | 500,000 | 100 |
| Custom | Unlimited | Unlimited | Custom |
Enterprise and Custom plans can request higher limits. Contact [email protected].
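One way to stay under a per-minute limit is to throttle on the client before requests are sent. The sketch below is a sliding-window limiter; `SlidingWindowLimiter` and its parameters are hypothetical names for illustration and are not part of any Rotavision SDK. The limit of 60 in the usage note matches the Starter tier in the table above.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Client-side guard allowing at most `limit` calls per `window` seconds.

    Hypothetical helper for illustration; not part of any Rotavision SDK.
    """

    def __init__(self, limit, window=60.0, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock
        self._calls = deque()  # timestamps of calls still inside the window

    def wait_time(self):
        """Return seconds to wait before the next call is allowed (0.0 if none)."""
        now = self.clock()
        # Drop timestamps that have aged out of the window.
        while self._calls and now - self._calls[0] >= self.window:
            self._calls.popleft()
        if len(self._calls) < self.limit:
            return 0.0
        return self.window - (now - self._calls[0])

    def acquire(self, sleep=time.sleep):
        """Block until a call is allowed, then record it."""
        delay = self.wait_time()
        if delay > 0:
            sleep(delay)
        self._calls.append(self.clock())
```

A caller on the Starter plan might create `SlidingWindowLimiter(limit=60)` once and call `acquire()` before each request. This only approximates the server's accounting, so a `429` response can still occur and should still be handled.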
Product-Specific Limits
Some products have additional limits beyond the base rate:
Vishwas (Fairness Analysis)
| Operation | Limit | Notes |
|---|---|---|
| `analyze` | 100/hour | Per `model_id` |
| `explain` | 1,000/hour | Real-time explanations |
| `generate_report` | 20/hour | PDF generation |
Guardian (Monitoring)
| Operation | Limit | Notes |
|---|---|---|
| `log_inference` | 10,000/min | High-throughput logging |
| `create_monitor` | 100/day | Monitor creation |
| `get_alerts` | 600/min | Alert retrieval |
Dastavez (Document AI)
| Operation | Limit | Notes |
|---|---|---|
| `extract` | 100/min | Document extraction |
| `create_agent` | 20/hour | Browser agent creation |
| File size | 50 MB | Per document |
Sankalp (LLM Gateway)
| Operation | Limit | Notes |
|---|---|---|
| `proxy` | Plan limit | Passthrough to LLM provider |
| Token throughput | Plan-based | Input + output tokens |
Orchestrate (Workflows)
| Operation | Limit | Notes |
|---|---|---|
| `create_workflow` | 50/hour | Workflow definitions |
| `run_workflow` | 500/hour | Workflow executions |
| Concurrent runs | 10-100 | Plan-based |
Gati (Fleet Intelligence)
| Operation | Limit | Notes |
|---|---|---|
| `optimize_routes` | 100/hour | Route optimization |
| `track_fleet` | 10,000/min | Vehicle tracking |
| Vehicles per request | 1,000 | Route optimization |
Rate Limit Headers
Every API response includes rate limit information:
| Header | Description |
|---|---|
| `X-RateLimit-Limit` | Maximum requests allowed in the window |
| `X-RateLimit-Remaining` | Requests remaining in current window |
| `X-RateLimit-Reset` | Unix timestamp when the window resets |
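These three headers can be read into a small structure that reports how much of the window has been used and how long remains until it resets. The following is a minimal sketch; `RateLimitStatus` and `parse_rate_limit_headers` are hypothetical names, and it assumes all three header values are plain integers.

```python
import time
from dataclasses import dataclass


@dataclass
class RateLimitStatus:
    limit: int      # X-RateLimit-Limit
    remaining: int  # X-RateLimit-Remaining
    reset_at: int   # X-RateLimit-Reset (Unix timestamp)

    def seconds_until_reset(self, now=None):
        """Seconds until the window resets, never negative."""
        now = time.time() if now is None else now
        return max(0.0, self.reset_at - now)

    def used_fraction(self):
        """Share of the current window already consumed, from 0.0 to 1.0."""
        return 0.0 if self.limit == 0 else 1 - self.remaining / self.limit


def parse_rate_limit_headers(headers):
    """Build a RateLimitStatus from a mapping of response headers.

    Returns None if a header is missing or is not an integer.
    """
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {k.lower(): v for k, v in headers.items()}
    try:
        return RateLimitStatus(
            limit=int(lowered["x-ratelimit-limit"]),
            remaining=int(lowered["x-ratelimit-remaining"]),
            reset_at=int(lowered["x-ratelimit-reset"]),
        )
    except (KeyError, ValueError):
        return None
```

A client could log a warning when `used_fraction()` crosses a threshold it chooses, such as 0.8, which gives some notice before requests start being rejected.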
Handling Rate Limits
When you exceed a rate limit, you’ll receive a `429 Too Many Requests` response. The response includes a `Retry-After` header indicating when to retry.
Recommended Retry Strategy
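A minimal sketch of one possible retry loop follows. It waits for the `Retry-After` value when the header is present and falls back to exponential backoff with full jitter otherwise. Several details are assumptions rather than documented behavior: `send` is a hypothetical zero-argument callable returning `(status_code, headers, body)`, `Retry-After` is treated as a number of seconds, and the base delay, cap, and attempt count are arbitrary defaults.

```python
import random
import time


def backoff_delay(attempt, base=1.0, cap=60.0, rng=random.random):
    """Full-jitter delay: a random value in [0, min(cap, base * 2**attempt))."""
    return rng() * min(cap, base * (2 ** attempt))


def call_with_retry(send, max_attempts=5, sleep=time.sleep):
    """Call `send()` until it returns a non-429 status or attempts run out.

    `send` is a hypothetical callable returning (status_code, headers, body).
    """
    for attempt in range(max_attempts):
        status, headers, body = send()
        if status != 429:
            return status, headers, body
        if attempt == max_attempts - 1:
            break  # no point sleeping after the final attempt
        # Assumption: Retry-After holds seconds. If it is missing or in
        # another format, fall back to jittered exponential backoff.
        try:
            delay = float(headers.get("Retry-After"))
        except (TypeError, ValueError):
            delay = backoff_delay(attempt)
        sleep(delay)
    raise RuntimeError(f"still rate limited after {max_attempts} attempts")
```

Wrapping an existing request function is then a matter of passing a closure, for example `call_with_retry(lambda: my_request(payload))`, where `my_request` stands in for whatever HTTP call the application already makes.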
Best Practices
Implement exponential backoff
Don’t retry immediately after a rate limit. Use exponential backoff with jitter to avoid thundering herd.
Cache responses when possible
Cache analysis results and explanations that don’t change frequently to reduce API calls.
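As an illustration, a small time-to-live cache can sit in front of calls whose results change rarely. This is a sketch under assumptions: `TTLCache` is a hypothetical helper, the `fetch` callable stands in for whatever API call produces the result, and the right TTL depends on how often the underlying model or data changes.

```python
import time


class TTLCache:
    """In-memory cache whose entries expire `ttl_seconds` after being stored.

    Hypothetical helper for illustration; not part of any Rotavision SDK.
    """

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._store = {}  # key -> (expires_at, value)

    def get_or_fetch(self, key, fetch):
        """Return the cached value for `key`, calling `fetch()` only on a miss."""
        now = self.clock()
        entry = self._store.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]
        value = fetch()
        self._store[key] = (now + self.ttl, value)
        return value
```

Keying on the operation and its inputs, such as `("explain", model_id, input_hash)`, keeps results for different requests separate; those key fields are illustrative.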
Use batch endpoints
For Guardian logging, use batch endpoints to send multiple inferences in one request.
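Batching on the client can be sketched as a buffer that collects records and hands them off in groups. The batch endpoint's name and payload format are not specified here, so `send_batch` is a hypothetical callable that the application would implement against the actual endpoint, and the `max_size` default of 100 is an arbitrary choice.

```python
class InferenceBatcher:
    """Buffer inference records and send them in groups of up to `max_size`.

    Hypothetical helper; `send_batch` is supplied by the caller and is
    expected to make one API request containing the whole list.
    """

    def __init__(self, send_batch, max_size=100):
        self.send_batch = send_batch
        self.max_size = max_size
        self._buffer = []

    def add(self, record):
        """Queue one record, flushing automatically when the buffer is full."""
        self._buffer.append(record)
        if len(self._buffer) >= self.max_size:
            self.flush()

    def flush(self):
        """Send whatever is buffered; a no-op when the buffer is empty."""
        if not self._buffer:
            return
        # Swap in a fresh buffer before sending so new records can queue up.
        batch, self._buffer = self._buffer, []
        self.send_batch(batch)
```

Callers should invoke `flush()` at shutdown, and possibly on a timer, so that a partially filled buffer is not lost or delayed indefinitely.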
Monitor your usage
Track your rate limit headers and set up alerts before hitting limits.
Use webhooks instead of polling
For async operations, use webhooks instead of polling status endpoints.
Quota Management
Beyond rate limits, some resources have monthly quotas:
| Resource | Starter | Growth | Enterprise |
|---|---|---|---|
| Documents processed | 1,000 | 10,000 | 100,000+ |
| LLM tokens (Sankalp) | 1M | 10M | 100M+ |
| Storage (GB) | 10 | 100 | 1,000+ |
| Monitors | 5 | 25 | Unlimited |
Requesting Higher Limits
If you need higher rate limits:
- Growth Plan: Upgrade via dashboard for 10x limits
- Enterprise Plan: Contact sales for custom limits
- Temporary Increase: Contact support for short-term increases during migrations or load tests

