API Request Limits
These limits control how fast you can submit requests to the API:

| Endpoint | Limit | Notes |
|---|---|---|
| POST /v1/score | 1,000 req/min | Supports high-volume batch imports |
| POST /v1/criteria/generate | 60 req/min | Typically once per job posting |
| POST /v1/criteria/questions | 60 req/min | Typically once per job posting |
| GET /v1/score/{scoringJobId} | 1,000 req/min | Polling fallback |
| GET /v1/jobs/{jobId}/criteria | 100 req/min | Criteria lookup |
| POST/PATCH /v1/jobs/{jobId}/criteria | 60 req/min | Criteria updates |
| POST /v1/jobs/{jobId}/criteria/archive | 60 req/min | Criteria archival |
API rate limits control how fast you can submit requests. Processing capacity (below) determines how fast jobs are completed.
Processing Capacity & Concurrency
Scoring requests are processed asynchronously. When you submit a score request, it’s queued and processed by our worker infrastructure.

Concurrency Limits
| Limit Type | Default | Description |
|---|---|---|
| Concurrency | 500 | Maximum concurrent scoring jobs |
Need higher limits? Contact us through the Embed Portal to discuss enterprise capacity options.
Understanding Throughput
Each scoring job takes approximately 30 seconds to complete. Here’s how concurrency translates to throughput:

Throughput formula: Jobs per minute = (Concurrency × 60s) ÷ Processing time

At 500 concurrent workers: (500 × 60) ÷ 30 = 1,000 jobs/minute = 60,000 jobs/hour
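The formula can be checked directly in code. A minimal sketch; the function name is ours, and the 30-second processing time is the approximate figure quoted above:

```typescript
// Jobs per minute = (concurrency × 60s) ÷ processing time.
function jobsPerMinute(concurrency: number, processingTimeSeconds = 30): number {
  return (concurrency * 60) / processingTimeSeconds;
}

console.log(jobsPerMinute(500)); // 1000 jobs/minute, i.e. 60,000 jobs/hour
```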
Capacity Planning
Use this table to plan your integration based on expected daily volume:

| Daily Volume | If Spread Evenly | Peak Hour (50% of daily) | Concurrency Needed |
|---|---|---|---|
| 50,000 | 2,083/hr | 25,000/hr | 209 |
| 80,000 | 3,333/hr | 40,000/hr | 334 |
| 150,000 | 6,250/hr | 75,000/hr | 625 (contact us) |
| 200,000+ | 8,333/hr | 100,000/hr | Contact us |
The default limit of 500 concurrent jobs comfortably supports 80,000+ applications per day, even with concentrated peak hours.
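The Concurrency Needed column follows from inverting the throughput formula. A minimal sketch, assuming the 50% peak-hour share and the ~30-second processing time used above:

```typescript
// Each worker completes 3600 ÷ 30 = 120 jobs/hour, so divide
// peak-hour volume by per-worker throughput and round up.
function concurrencyNeeded(
  dailyVolume: number,
  peakHourShare = 0.5,
  processingTimeSeconds = 30,
): number {
  const peakHourJobs = dailyVolume * peakHourShare;
  const jobsPerWorkerPerHour = 3600 / processingTimeSeconds;
  return Math.ceil(peakHourJobs / jobsPerWorkerPerHour);
}

console.log(concurrencyNeeded(80_000));  // 334, within the default limit of 500
console.log(concurrencyNeeded(150_000)); // 625, contact us for higher limits
```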
Queue Behavior
When you submit more jobs than can be processed immediately, they’re queued:

1. Immediate start: if workers are available, your job starts processing immediately, with no queue wait.
2. Queue when busy: if all 500 workers are occupied, new jobs wait in a FIFO (first-in, first-out) queue.
3. Continuous processing: as each worker completes (~30s), it picks up the next queued job automatically.
4. Webhook delivery: results are sent via webhook as soon as each job completes; you don’t wait for batches. A minimal receiver sketch follows this list.
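For illustration, a minimal webhook receiver using Node’s built-in http module. The payload fields (scoringJobId, score) and the /webhooks/scoring path are assumptions for this sketch; see the webhooks documentation for the actual schema:

```typescript
import { createServer } from "node:http";

// Minimal webhook receiver. The payload fields below are assumptions
// for this sketch, not the documented schema.
const server = createServer((req, res) => {
  if (req.method === "POST" && req.url === "/webhooks/scoring") {
    let body = "";
    req.on("data", (chunk) => (body += chunk));
    req.on("end", () => {
      const event = JSON.parse(body);
      console.log(`Job ${event.scoringJobId} completed with score ${event.score}`);
      res.writeHead(200).end(); // acknowledge promptly so delivery isn't retried
    });
  } else {
    res.writeHead(404).end();
  }
});

server.listen(3000);
```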
Estimating Wait Time
When the system is under load, estimate wait time with this formula:

Wait time ≈ (queue position ÷ concurrency) × processing time

For example:

- Your queue position: 251st
- Wait time: (251 ÷ 500) × 30s ≈ 15 seconds
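The same estimate in code, a minimal sketch using the defaults from this page:

```typescript
// Wait time ≈ (queue position ÷ concurrency) × processing time.
function estimatedWaitSeconds(
  queuePosition: number,
  concurrency = 500,
  processingTimeSeconds = 30,
): number {
  return (queuePosition / concurrency) * processingTimeSeconds;
}

console.log(estimatedWaitSeconds(251)); // ≈ 15 seconds
```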
Steady-State Throughput
At full concurrency, the system processes jobs continuously: as each of the 500 workers finishes a ~30-second job, it immediately picks up the next one, sustaining roughly 1,000 completions per minute.

Response Headers
Every response includes rate limit information:

| Header | Description |
|---|---|
| X-RateLimit-Limit | Maximum requests allowed per minute |
| X-RateLimit-Remaining | Requests remaining in current window |
| X-RateLimit-Reset | Unix timestamp when limit resets |
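A sketch of reading these headers from a fetch response so a client can slow down before hitting a limit. The base URL is a placeholder, and the threshold of 5 remaining requests is arbitrary:

```typescript
// Inspect rate-limit headers on each response and pause before exhausting the window.
async function submitScore(payload: unknown): Promise<Response> {
  const res = await fetch("https://api.example.com/v1/score", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(payload),
  });

  const remaining = Number(res.headers.get("X-RateLimit-Remaining"));
  const resetAt = Number(res.headers.get("X-RateLimit-Reset")); // Unix timestamp
  if (remaining < 5) {
    const waitMs = Math.max(0, resetAt * 1000 - Date.now());
    await new Promise((resolve) => setTimeout(resolve, waitMs)); // wait for window reset
  }
  return res;
}
```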
Checking Rate Limit Status
Use the dedicated status endpoint to check your current rate limit usage across all categories without consuming any rate limit points:

| Field | Description |
|---|---|
| category | Machine-readable category identifier |
| displayName | Human-readable name for display in dashboards |
| endpoints | List of endpoint patterns that count against this limit |
| limit | Maximum requests allowed per window |
| used | Requests consumed in current window |
| remaining | Requests remaining in current window |
| resetAt | Unix timestamp when the window resets (0 if no requests made yet) |
| windowSeconds | Window duration in seconds |
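Based on the field table, each category in the response can be modeled as follows. This is a sketch; the exact endpoint path and response envelope aren’t shown on this page:

```typescript
// Shape of one rate-limit category, derived from the field table above.
interface RateLimitCategory {
  category: string;      // machine-readable identifier
  displayName: string;   // human-readable name for dashboards
  endpoints: string[];   // endpoint patterns counted against this limit
  limit: number;         // max requests per window
  used: number;          // requests consumed in the current window
  remaining: number;     // requests remaining in the current window
  resetAt: number;       // Unix timestamp of window reset (0 if unused)
  windowSeconds: number; // window duration in seconds
}
```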
This endpoint is useful for building dashboards that display your current API usage, or for proactively throttling requests before hitting limits. The endpoints array shows exactly which API calls count against each category’s limit.

Handling 429 Responses
When rate limited, you’ll receive:

- An HTTP 429 Too Many Requests status code
- A Retry-After header with seconds until reset
Recommended Client Behavior
1. Check Headers: monitor X-RateLimit-Remaining to anticipate limits before hitting them.
2. Respect Retry-After: when you receive a 429, wait for the Retry-After duration before retrying.
3. Use Exponential Backoff: if retries continue to fail, increase wait times exponentially with jitter.
4. Queue Requests: implement client-side request queuing rather than tight retry loops.
Implementation Example
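A minimal sketch combining the steps above: honor Retry-After on a 429, and fall back to exponential backoff with jitter when the header is absent. The URL and request options are whatever your call needs; nothing here is specific to one endpoint:

```typescript
// Retry a request on 429, honoring Retry-After and falling back to
// exponential backoff with jitter.
async function requestWithRetry(
  url: string,
  init: RequestInit,
  maxRetries = 5,
): Promise<Response> {
  for (let attempt = 0; ; attempt++) {
    const res = await fetch(url, init);
    if (res.status !== 429) return res;
    if (attempt >= maxRetries) throw new Error("Rate limited: retries exhausted");

    // Prefer the server's Retry-After hint (seconds until the window resets).
    const retryAfter = Number(res.headers.get("Retry-After"));
    const backoffMs =
      Number.isFinite(retryAfter) && retryAfter > 0
        ? retryAfter * 1000
        : Math.min(60_000, 1000 * 2 ** attempt) + Math.random() * 1000; // backoff + jitter

    await new Promise((resolve) => setTimeout(resolve, backoffMs));
  }
}
```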
Fair Use
Criteria and question generation is included in your per-application pricing:

| Endpoint | Expected Usage |
|---|---|
| POST /v1/criteria/generate | Once per job |
| POST /v1/criteria/questions | Once per job |
| POST/PATCH /v1/jobs/{jobId}/criteria | Anytime (no AI cost) |
Batch Processing Best Practices
When processing large batches of candidates:

- Submit at a steady rate: rather than submitting 10,000 candidates instantly, spread submissions over time. At the 1,000 req/min limit, you can submit ~16 per second sustainably (see the sketch after this list).
- Trust the queue: our queue handles bursts gracefully. You don’t need to throttle submissions to match processing speed; just stay within API rate limits.
- Use webhooks for results: don’t poll for results. Webhooks deliver scores as soon as processing completes, regardless of queue position.
- Monitor queue depth: if you’re submitting faster than processing, jobs will queue. This is expected and normal; jobs are processed in order.
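A sketch of the steady-rate submission described above. The base URL and candidate payload shape are placeholders; the ~16 req/s pacing follows from the 1,000 req/min limit on POST /v1/score:

```typescript
// Submit a large batch at a steady ~16 requests/second instead of all at once,
// staying under the 1,000 req/min limit on POST /v1/score.
async function submitBatch(candidates: unknown[], requestsPerSecond = 16): Promise<void> {
  const intervalMs = 1000 / requestsPerSecond;
  for (const candidate of candidates) {
    // Fire the request, then pace the next submission; results arrive via webhook.
    fetch("https://api.example.com/v1/score", {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(candidate),
    }).catch((err) => console.error("Submission failed:", err));
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
}
```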