Rate limits protect the API from abuse while ensuring fair access for all integrators. This guide covers both API request limits and processing capacity.

API Request Limits

These limits control how fast you can submit requests to the API:
| Endpoint | Limit | Notes |
| --- | --- | --- |
| POST /v1/score | 1,000 req/min | Supports high-volume batch imports |
| POST /v1/criteria/generate | 60 req/min | Typically once per job posting |
| POST /v1/criteria/questions | 60 req/min | Typically once per job posting |
| GET /v1/score/{scoringJobId} | 1,000 req/min | Polling fallback |
| GET /v1/jobs/{jobId}/criteria | 100 req/min | Criteria lookup |
| POST/PATCH /v1/jobs/{jobId}/criteria | 60 req/min | Criteria updates |
| POST /v1/jobs/{jobId}/criteria/archive | 60 req/min | Criteria archival |
API rate limits control how fast you can submit requests. Processing capacity (below) determines how fast jobs are completed.

Processing Capacity & Concurrency

Scoring requests are processed asynchronously. When you submit a score request, it’s queued and processed by our worker infrastructure.

Concurrency Limits

| Limit Type | Default | Description |
| --- | --- | --- |
| Concurrency | 500 | Maximum concurrent scoring jobs |
Need higher limits? Contact us through the Embed Portal to discuss enterprise capacity options.

Understanding Throughput

Each scoring job takes approximately 30 seconds to complete. Here’s how concurrency translates to throughput:
Throughput formula: Jobs per minute = (Concurrency × 60s) ÷ Processing time

At 500 concurrent workers: (500 × 60) ÷ 30 = 1,000 jobs/minute = 60,000 jobs/hour
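The same formula as a tiny helper, for quick capacity checks (the 30-second processing time is the approximate figure given above):

function jobsPerMinute(concurrency, processingSeconds = 30) {
  return (concurrency * 60) / processingSeconds;
}

console.log(jobsPerMinute(500));      // 1,000 jobs/minute
console.log(jobsPerMinute(500) * 60); // 60,000 jobs/hour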

Capacity Planning

Use this table to plan your integration based on expected daily volume:
| Daily Volume | If Spread Evenly | Peak Hour (50% of daily) | Concurrency Needed |
| --- | --- | --- | --- |
| 50,000 | 2,083/hr | 25,000/hr | 208 |
| 80,000 | 3,333/hr | 40,000/hr | 334 |
| 150,000 | 6,250/hr | 75,000/hr | 625 (contact us) |
| 200,000+ | 8,333/hr | 100,000/hr | Contact us |
The default limit of 500 concurrent jobs comfortably supports 80,000+ applications per day, even with concentrated peak hours.
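The "Concurrency Needed" column is the throughput formula rearranged. A minimal sketch of that calculation, under the same assumptions as the table (peak hour carries 50% of daily volume, ~30 seconds per job):

// Required concurrency = peak hourly rate ÷ jobs one worker completes per hour
function concurrencyNeeded(dailyVolume, peakShare = 0.5, processingSeconds = 30) {
  const peakHourlyRate = dailyVolume * peakShare;        // e.g. 40,000/hr for 80,000/day
  const jobsPerWorkerPerHour = 3600 / processingSeconds; // 120 jobs/hour at 30s each
  return Math.ceil(peakHourlyRate / jobsPerWorkerPerHour);
}

console.log(concurrencyNeeded(80000)); // 334, matching the table row above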

Queue Behavior

When you submit more jobs than can be processed immediately, they’re queued:
1. Immediate start: If workers are available, your job starts processing immediately, with no queue wait.
2. Queue when busy: If all 500 workers are occupied, new jobs wait in a FIFO (first-in, first-out) queue.
3. Continuous processing: As each worker completes (~30s), it picks up the next queued job automatically.
4. Webhook delivery: Results are sent via webhook as soon as each job completes; you don't wait for batches. (A minimal receiver sketch follows.)
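For illustration, here is a minimal webhook receiver using Node's built-in http module. The /webhooks/nova path and the payload shape are assumptions for this sketch (only scoringJobId mirrors the identifier used elsewhere in this guide), not the documented webhook contract:

// Hypothetical receiver: the route and payload fields are illustrative assumptions.
const http = require('http');

http.createServer((req, res) => {
  if (req.method === 'POST' && req.url === '/webhooks/nova') {
    let body = '';
    req.on('data', chunk => { body += chunk; });
    req.on('end', () => {
      const event = JSON.parse(body);
      console.log(`Result received for job ${event.scoringJobId}`);
      res.statusCode = 200;
      res.end(); // acknowledge quickly; do heavy processing asynchronously
    });
  } else {
    res.statusCode = 404;
    res.end();
  }
}).listen(3000);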

Estimating Wait Time

When the system is under load, estimate wait time with this formula:
Wait time ≈ (queue position ÷ concurrency) × processing time
Example: You submit job #750 when 500 workers are busy and 250 jobs are already queued:
  • Your queue position: 251st
  • Wait time: (251 ÷ 500) × 30s ≈ 15 seconds
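The same arithmetic as a helper, using the default concurrency and processing time from above:

// Wait time ≈ (queue position ÷ concurrency) × processing time
function estimatedWaitSeconds(queuePosition, concurrency = 500, processingSeconds = 30) {
  return (queuePosition / concurrency) * processingSeconds;
}

console.log(estimatedWaitSeconds(251)); // ≈ 15 seconds, as in the example above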

Steady-State Throughput

At full concurrency, the system processes jobs continuously: as each of the 500 workers finishes (~30s), it immediately picks up the next queued job, sustaining roughly 1,000 jobs per minute. Jobs are processed FIFO (first submitted, first scored), and queue depth is effectively unlimited for normal operations.

Response Headers

Every response includes rate limit information:
X-RateLimit-Limit: 1000
X-RateLimit-Remaining: 847
X-RateLimit-Reset: 1705312800
| Header | Description |
| --- | --- |
| X-RateLimit-Limit | Maximum requests allowed per minute |
| X-RateLimit-Remaining | Requests remaining in current window |
| X-RateLimit-Reset | Unix timestamp when limit resets |
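One way to use these headers proactively; a sketch in which the threshold of 10 remaining requests is an arbitrary choice, not an API requirement:

// Pause until the window resets when few requests remain.
async function throttleFromHeaders(response) {
  const remaining = parseInt(response.headers.get('X-RateLimit-Remaining') || '0', 10);
  const resetAt = parseInt(response.headers.get('X-RateLimit-Reset') || '0', 10);

  if (remaining < 10 && resetAt > 0) {
    const waitMs = Math.max(0, resetAt * 1000 - Date.now()); // header is a Unix timestamp in seconds
    await new Promise(resolve => setTimeout(resolve, waitMs));
  }
}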

Checking Rate Limit Status

Use the dedicated status endpoint to check your current rate limit usage across all categories without consuming any rate limit points:
curl -X GET https://embed.nova.dweet.com/v1/rate-limit/status \
  -H "Authorization: Bearer $NOVA_API_KEY"
Response:
{
  "categories": [
    {
      "category": "scoring",
      "displayName": "Scoring",
      "endpoints": [
        "POST /v1/score",
        "POST /v1/score/batch",
        "GET /v1/score/{scoringJobId}",
        "GET /v1/score/application/{applicationId}"
      ],
      "limit": 1000,
      "used": 153,
      "remaining": 847,
      "resetAt": 1705312800,
      "windowSeconds": 60
    },
    {
      "category": "criteria_generation",
      "displayName": "Criteria Generation",
      "endpoints": [
        "POST /v1/criteria/generate",
        "POST /v1/criteria/questions"
      ],
      "limit": 60,
      "used": 2,
      "remaining": 58,
      "resetAt": 1705312800,
      "windowSeconds": 60
    },
    {
      "category": "criteria_operations",
      "displayName": "Criteria Operations",
      "endpoints": [
        "GET /v1/criteria/{jobId}",
        "POST /v1/criteria/{jobId}",
        "PATCH /v1/criteria/{jobId}/{criterionId}",
        "DELETE /v1/criteria/{jobId}",
        "DELETE /v1/criteria/{jobId}/{criterionId}"
      ],
      "limit": 100,
      "used": 0,
      "remaining": 100,
      "resetAt": 0,
      "windowSeconds": 60
    }
  ],
  "timestamp": "2025-01-15T10:30:00Z"
}
| Field | Description |
| --- | --- |
| category | Machine-readable category identifier |
| displayName | Human-readable name for display in dashboards |
| endpoints | List of endpoint patterns that count against this limit |
| limit | Maximum requests allowed per window |
| used | Requests consumed in current window |
| remaining | Requests remaining in current window |
| resetAt | Unix timestamp when the window resets (0 if no requests made yet) |
| windowSeconds | Window duration in seconds |
This endpoint is useful for building dashboards that display your current API usage, or for proactively throttling requests before hitting limits. The endpoints array shows exactly which API calls count against each category’s limit.
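For example, a small helper that reads the status response above and backs off before a category is exhausted (the 50-request threshold is an arbitrary example):

// Look up remaining quota for one category via /v1/rate-limit/status.
async function remainingFor(category) {
  const response = await fetch('https://embed.nova.dweet.com/v1/rate-limit/status', {
    headers: { 'Authorization': `Bearer ${process.env.NOVA_API_KEY}` },
  });
  const status = await response.json();
  const entry = status.categories.find(c => c.category === category);
  return entry ? entry.remaining : null;
}

// e.g. hold off on new submissions when the scoring window is nearly spent
if ((await remainingFor('scoring')) < 50) {
  // back off or queue client-side before hitting a 429
}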

Handling 429 Responses

When rate limited, you’ll receive:
  • HTTP 429 Too Many Requests
  • Retry-After header with seconds until reset
{
  "error": {
    "type": "https://embed.nova.dweet.com/errors/rate-limited",
    "code": "RATE_LIMITED",
    "status": 429,
    "message": "Rate limit exceeded. Please retry after the specified delay.",
    "requestId": "req_abc123def456"
  }
}
1. Check Headers: Monitor X-RateLimit-Remaining to anticipate limits before hitting them.
2. Respect Retry-After: When you receive a 429, wait for the Retry-After duration before retrying.
3. Use Exponential Backoff: If retries continue to fail, increase wait times exponentially with jitter (a sketch follows the implementation example below).
4. Queue Requests: Implement client-side request queuing rather than tight retry loops.

Implementation Example

async function callWithRetry(url, options, maxRetries = 3) {
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    const response = await fetch(url, options);

    if (response.status === 429) {
      const retryAfter = parseInt(response.headers.get('Retry-After') || '60', 10);
      const jitter = Math.random() * 1000; // Add jitter to prevent thundering herd

      console.log(`Rate limited. Retrying in ${retryAfter}s...`);
      await sleep(retryAfter * 1000 + jitter);
      continue;
    }

    return response;
  }

  throw new Error('Max retries exceeded');
}

function sleep(ms) {
  return new Promise(resolve => setTimeout(resolve, ms));
}
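The example above relies on the server's Retry-After header. When that header is missing, or retries keep failing, exponential backoff (step 3 above) fills the gap. A sketch reusing the sleep() helper; the base delay and 60-second cap are illustrative choices:

async function callWithBackoff(url, options, maxRetries = 5) {
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    const response = await fetch(url, options);
    if (response.status !== 429) return response;

    const retryAfter = response.headers.get('Retry-After');
    const delayMs = retryAfter
      ? parseInt(retryAfter, 10) * 1000        // server-provided delay wins
      : Math.min(1000 * 2 ** attempt, 60000);  // 1s, 2s, 4s, ... capped at 60s
    await sleep(delayMs + Math.random() * 1000); // jitter avoids thundering herd
  }
  throw new Error('Max retries exceeded');
}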

Fair Use

Criteria and questions generation are included in your per-application pricing:
| Endpoint | Expected Usage |
| --- | --- |
| POST /v1/criteria/generate | Once per job |
| POST /v1/criteria/questions | Once per job |
| POST/PATCH /v1/jobs/{jobId}/criteria | Anytime (no AI cost) |
We don’t enforce hard quotas, but may reach out if we see unusual patterns (e.g., regeneration loops, generating criteria without scoring). Contact us if you have an atypical use case.

Batch Processing Best Practices

When processing large batches of candidates:
  • Spread submissions over time. Rather than submitting 10,000 candidates instantly, pace them; at the 1,000 req/min limit, you can sustain ~16 submissions per second.
  • Let the queue absorb bursts. You don't need to throttle submissions to match processing speed; just stay within the API rate limits.
  • Don't poll for results. Webhooks deliver scores as soon as processing completes, regardless of queue position.
  • Expect queueing under load. If you submit faster than jobs are processed, they will queue; this is expected and normal, and jobs are processed in order.

Example: High-Volume Integration

// Reuses the sleep() helper defined in the implementation example above.
class ScoringClient {
  constructor({ apiKey, maxConcurrent = 50 }) {
    this.apiKey = apiKey;
    this.pending = 0;
    this.maxConcurrent = maxConcurrent; // Client-side concurrency
  }

  async scoreCandidate(payload) {
    // Wait if we have too many pending requests (client-side throttle)
    while (this.pending >= this.maxConcurrent) {
      await sleep(100);
    }

    this.pending++;
    try {
      const response = await fetch('https://embed.nova.dweet.com/v1/score', {
        method: 'POST',
        headers: {
          'Authorization': `Bearer ${this.apiKey}`,
          'Content-Type': 'application/json',
        },
        body: JSON.stringify(payload),
      });

      if (response.status === 429) {
        const retryAfter = parseInt(response.headers.get('Retry-After') || '60', 10);
        await sleep(retryAfter * 1000);
        return this.scoreCandidate(payload); // Retry
      }

      return response.json();
    } finally {
      this.pending--;
    }
  }

  async scoreBatch(candidates, onResult) {
    const promises = candidates.map(async (candidate) => {
      const result = await this.scoreCandidate(candidate);
      onResult?.(result);
      return result;
    });

    return Promise.all(promises);
  }
}

// Usage: Process 80,000 candidates
const client = new ScoringClient({
  apiKey: 'sk_live_xxx',
  maxConcurrent: 50  // Stay well under API rate limits
});

// Submit all candidates - they'll queue server-side
// Results arrive via webhook as processing completes
await client.scoreBatch(candidates, (result) => {
  console.log(`Queued: ${result.scoringJobId}`);
});