Concurrency refers to the number of API requests you can have in progress (or running) simultaneously. If your plan supports 10 concurrent requests, you can process up to 10 requests at the same time. You’ll get a rate limit error if you send an 11th request while 10 are already processing.
Think of concurrency like a team of workers in an office. Each worker represents a “concurrent request slot.” If you have 10 workers, you can assign them 10 tasks (requests) simultaneously. If you try to assign an 11th task while all workers are occupied, you’ll need to wait until one worker finishes.
In cloro, each “task” is an API request to an AI model, and each “worker” is a concurrent request slot available based on your subscription.
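The worker analogy can be mirrored on the client side with a semaphore. The following is a minimal sketch, not cloro's own client code: `run_with_limit`, `fake_request`, and the limit of 10 are illustrative assumptions, and `fake_request` stands in for a real HTTP call.

```python
import asyncio

MAX_CONCURRENT = 10  # assumed plan limit; substitute your own plan's value


async def run_with_limit(tasks, worker, limit=MAX_CONCURRENT):
    """Run worker(task) for every task with at most `limit` calls in flight."""
    semaphore = asyncio.Semaphore(limit)

    async def guarded(task):
        async with semaphore:  # wait here until a "worker" slot is free
            return await worker(task)

    # gather preserves the order of the input tasks in its result list
    return await asyncio.gather(*(guarded(t) for t in tasks))


async def fake_request(task):
    """Hypothetical stand-in for a real API call."""
    await asyncio.sleep(0.01)
    return f"done:{task}"


if __name__ == "__main__":
    results = asyncio.run(run_with_limit(range(25), fake_request))
    print(len(results))
```

With 25 tasks and a limit of 10, the first 10 start immediately and each remaining task starts as soon as an earlier one finishes, so the client never asks for an 11th slot.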
cloro uses two different types of limits depending on the endpoint type:
| Limit type | Endpoints affected | How it works |
| --- | --- | --- |
| Rate limits | All endpoints (/v1/*) | 500 requests per second shared across all endpoints |
| Concurrency limits | Monitor endpoints (/v1/monitor/*) | Based on your subscription plan (simultaneous requests) |
Rate limits restrict how many requests you can make per second, while concurrency limits restrict how many requests can be processing simultaneously. Monitor endpoints are subject to both rate limits (500/sec) and concurrency limits (subscription-based).
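To see how the two limits interact, a rough steady-state estimate helps: with N concurrent slots and an average request duration of T seconds, at most about N / T requests can complete per second, and the 500 requests per second rate limit caps that figure. The sketch below assumes this approximation (Little's law); the function name and the latency values are illustrative, not measured cloro numbers.

```python
RATE_LIMIT_PER_SECOND = 500  # shared rate limit stated in this document


def max_throughput(concurrency_limit, avg_latency_seconds,
                   rate_limit=RATE_LIMIT_PER_SECOND):
    """Estimate sustainable requests per second under both limits."""
    if concurrency_limit <= 0 or avg_latency_seconds <= 0:
        raise ValueError("concurrency and latency must be positive")
    # Each slot completes 1 / latency requests per second on average.
    concurrency_bound = concurrency_limit / avg_latency_seconds
    return min(concurrency_bound, rate_limit)


if __name__ == "__main__":
    # 10 slots with an assumed 2-second average request duration
    print(max_throughput(10, 2.0))
```

Under these assumptions, a plan with 10 slots is bounded by concurrency long before the per-second rate limit becomes relevant.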
All endpoints include rate limit headers in each response:
| Header | Description |
| --- | --- |
| X-RateLimit-Limit | Maximum requests per second allowed (500) |
| X-RateLimit-Remaining | Remaining requests available in this second |
For example, if you make a request:
X-RateLimit-Limit: 500
X-RateLimit-Remaining: 499
This means you can make 499 more requests in the current second before hitting the rate limit. The counter resets every second.
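A client can read these two headers to decide when to pause until the next second. This is a minimal sketch: the header names come from this document, while `parse_rate_limit`, `should_pause`, and the `reserve` parameter are hypothetical helpers, and `headers` is assumed to be a plain mapping of header names to string values.

```python
def parse_rate_limit(headers):
    """Return (limit, remaining) as ints, or None if the headers are absent."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    try:
        return (int(lowered["x-ratelimit-limit"]),
                int(lowered["x-ratelimit-remaining"]))
    except (KeyError, ValueError):
        return None


def should_pause(headers, reserve=0):
    """True when the remaining budget for this second is at or below `reserve`."""
    parsed = parse_rate_limit(headers)
    if parsed is None:
        return False  # no information, so do not block the caller
    _, remaining = parsed
    return remaining <= reserve


if __name__ == "__main__":
    sample = {"X-RateLimit-Limit": "500", "X-RateLimit-Remaining": "499"}
    print(parse_rate_limit(sample), should_pause(sample))
```

Because the counter resets every second, a pause of up to one second after `should_pause` returns true should be enough before sending again.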
Need higher concurrency? Self-serve plans can be upgraded in the dashboard and the new limit applies immediately. Enterprise customers should email support@cloro.dev to upgrade their existing plan.
For large-scale processing, submit tasks and handle results via webhooks. You don’t need to send requests in batches; cloro handles concurrency automatically. Send API requests for all your tasks concurrently (one request per task).
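A minimal sketch of the one-request-per-task pattern follows. It assumes a hypothetical `submit_task` callable that wraps a single API request; the endpoint, payload shape, and webhook field are not specified here, so `fake_submit` only imitates an accepted submission.

```python
from concurrent.futures import ThreadPoolExecutor


def submit_all(tasks, submit_task, max_workers=32):
    """Call submit_task(task) once per task in parallel threads.

    Results are returned in the same order as `tasks`. `max_workers` only
    sizes the local thread pool; it is not a cloro setting.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(submit_task, tasks))


def fake_submit(task):
    """Hypothetical stand-in for one API request.

    A real version would send the task together with your webhook URL and
    return whatever identifier the API responds with.
    """
    return {"task": task, "status": "accepted"}


if __name__ == "__main__":
    prompts = ["prompt A", "prompt B", "prompt C"]
    for receipt in submit_all(prompts, fake_submit):
        print(receipt)
```

Results then arrive at your webhook handler as tasks complete, so the submitting code does not need to wait on or poll for each task.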
A 429 error means you’re hitting rate limits. This can happen for two reasons:
Concurrency limit exceeded (monitor endpoints only)
You’re making too many simultaneous requests beyond your plan’s concurrent request limit.
Solution: Check your current usage with response headers: X-Concurrent-Limit, X-Concurrent-Current, X-Concurrent-Remaining
Rate limit exceeded (all endpoints)
You’re sending more than 500 requests per second across all endpoints.
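One common client-side response to a 429 is to retry with exponential backoff and inspect the X-Concurrent-* headers along the way. The sketch below is a general pattern rather than documented cloro behavior: `call_with_retry`, `concurrency_usage`, and the `send` callable (returning a status code, a header mapping, and a body) are assumptions, and the header meanings are inferred from their names.

```python
import random
import time


def concurrency_usage(headers):
    """Read X-Concurrent-Limit/Current/Remaining into ints (None if missing)."""
    lowered = {name.lower(): value for name, value in headers.items()}
    usage = {}
    for field in ("limit", "current", "remaining"):
        raw = lowered.get(f"x-concurrent-{field}")
        try:
            usage[field] = int(raw)
        except (TypeError, ValueError):
            usage[field] = None  # header absent or not a number
    return usage


def call_with_retry(send, max_attempts=5, base_delay=0.5, sleep=time.sleep):
    """Call send() until it returns a non-429 status or attempts run out.

    send() must return (status_code, headers, body). The wait doubles after
    each 429, with random jitter so parallel clients do not retry in lockstep.
    """
    for attempt in range(max_attempts):
        status, headers, body = send()
        if status != 429:
            return status, headers, body
        if attempt < max_attempts - 1:
            delay = base_delay * (2 ** attempt) + random.uniform(0, base_delay)
            sleep(delay)
    return status, headers, body  # still 429 after the final attempt
```

Passing `sleep` as a parameter keeps the function easy to test; in production the default `time.sleep` applies.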
Can I increase my concurrency limit?
Yes. Self-serve plans can be upgraded directly in the dashboard, and the new limit applies immediately with no support ticket required. If you need concurrency above the highest self-serve tier, email support@cloro.dev for an enterprise quote.
Does higher concurrency delay my logs or dashboards?
No. Dashboard log ingestion runs independently from request processing. If logs look delayed during heavy load, the cause is usually batching on the dashboard side, not concurrency — entries normally surface within a minute.
Are requests above my concurrency limit queued?
No. The limit is hard: the (N+1)th simultaneous request gets a 429 immediately rather than queueing. Use the async API if you want cloro to handle queueing for you instead of managing burst capacity yourself.