API Request Limits
These limits control how fast you can submit requests to the API:

| Endpoint | Limit | Notes |
|---|---|---|
| POST /v1/score | 1,000 req/min | Supports high-volume batch imports |
| POST /v1/criteria/generate | 60 req/min | Typically once per job posting |
| POST /v1/criteria/questions | 60 req/min | Typically once per job posting |
| GET /v1/score/{scoringJobId} | 1,000 req/min | Polling fallback |
| GET /v1/jobs/{jobId}/criteria | 100 req/min | Criteria lookup |
| POST/PATCH /v1/jobs/{jobId}/criteria | 60 req/min | Criteria updates |
| POST /v1/jobs/{jobId}/criteria/archive | 60 req/min | Criteria archival |
API rate limits control how fast you can submit requests. Processing capacity (below) determines how fast jobs are completed.
Processing Capacity & Concurrency
Scoring requests are processed asynchronously. When you submit a score request, it’s queued and processed by our worker infrastructure.

Concurrency Limits
| Limit Type | Default | Description |
|---|---|---|
| Concurrency | 500 | Maximum concurrent scoring jobs |
Need higher limits? Contact us through the Embed Portal to discuss enterprise capacity options.
Understanding Throughput
Each scoring job takes approximately 30 seconds to complete. Here’s how concurrency translates to throughput:

Throughput formula: Jobs per minute = (Concurrency × 60s) ÷ Processing time

At 500 concurrent workers: (500 × 60) ÷ 30 = 1,000 jobs/minute = 60,000 jobs/hour
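The formula can be checked directly in code. A minimal sketch; the function name is ours, and the 30-second processing time is the approximate figure quoted above:

```typescript
// Jobs per minute = (concurrency × 60s) ÷ processing time.
function jobsPerMinute(concurrency: number, processingTimeSeconds = 30): number {
  return (concurrency * 60) / processingTimeSeconds;
}

console.log(jobsPerMinute(500)); // 1000 jobs/minute, i.e. 60,000 jobs/hour
```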
Capacity Planning
Use this table to plan your integration based on expected daily volume:

| Daily Volume | If Spread Evenly | Peak Hour (50% of daily) | Concurrency Needed |
|---|---|---|---|
| 50,000 | 2,083/hr | 25,000/hr | 209 |
| 80,000 | 3,333/hr | 40,000/hr | 334 |
| 150,000 | 6,250/hr | 75,000/hr | 625 (contact us) |
| 200,000+ | 8,333/hr | 100,000/hr | Contact us |
The default limit of 500 concurrent jobs comfortably supports 80,000+ applications per day, even with concentrated peak hours.
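The Concurrency Needed column follows from inverting the throughput formula. A minimal sketch, assuming the 50% peak-hour share and the ~30-second processing time used above:

```typescript
// Each worker completes 3600 ÷ 30 = 120 jobs/hour, so divide
// peak-hour volume by per-worker throughput and round up.
function concurrencyNeeded(
  dailyVolume: number,
  peakHourShare = 0.5,
  processingTimeSeconds = 30,
): number {
  const peakHourJobs = dailyVolume * peakHourShare;
  const jobsPerWorkerPerHour = 3600 / processingTimeSeconds;
  return Math.ceil(peakHourJobs / jobsPerWorkerPerHour);
}

console.log(concurrencyNeeded(80_000));  // 334, within the default limit of 500
console.log(concurrencyNeeded(150_000)); // 625, contact us for higher limits
```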
Queue Behavior
When you submit more jobs than can be processed immediately, they’re queued:

1. Immediate start: if workers are available, your job starts processing immediately, with no queue wait.
2. Queue when busy: if all 500 workers are occupied, new jobs wait in a FIFO (first-in, first-out) queue.
3. Continuous processing: as each worker completes (~30s), it picks up the next queued job automatically.
4. Webhook delivery: results are sent via webhook as soon as each job completes; you don’t wait for batches. A minimal receiver sketch follows this list.
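For illustration, a minimal webhook receiver using Node’s built-in http module. The payload fields (scoringJobId, score) and the /webhooks/scoring path are assumptions for this sketch; see the webhooks documentation for the actual schema:

```typescript
import { createServer } from "node:http";

// Minimal webhook receiver. The payload fields below are assumptions
// for this sketch, not the documented schema.
const server = createServer((req, res) => {
  if (req.method === "POST" && req.url === "/webhooks/scoring") {
    let body = "";
    req.on("data", (chunk) => (body += chunk));
    req.on("end", () => {
      const event = JSON.parse(body);
      console.log(`Job ${event.scoringJobId} completed with score ${event.score}`);
      res.writeHead(200).end(); // acknowledge promptly so delivery isn't retried
    });
  } else {
    res.writeHead(404).end();
  }
});

server.listen(3000);
```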
Estimating Wait Time
When the system is under load, estimate wait time with this formula:

Wait time ≈ (queue position ÷ concurrency) × processing time

For example:

- Your queue position: 251st
- Wait time: (251 ÷ 500) × 30s ≈ 15 seconds
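The same estimate in code, a minimal sketch using the defaults from this page:

```typescript
// Wait time ≈ (queue position ÷ concurrency) × processing time.
function estimatedWaitSeconds(
  queuePosition: number,
  concurrency = 500,
  processingTimeSeconds = 30,
): number {
  return (queuePosition / concurrency) * processingTimeSeconds;
}

console.log(estimatedWaitSeconds(251)); // ≈ 15 seconds
```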
Steady-State Throughput
At full concurrency, the system processes jobs continuously: as each of the 500 workers finishes a ~30-second job, it immediately picks up the next one, sustaining roughly 1,000 completions per minute.

Response Headers
Every response includes rate limit information:

| Header | Description |
|---|---|
| X-RateLimit-Limit | Maximum requests allowed per minute |
| X-RateLimit-Remaining | Requests remaining in current window |
| X-RateLimit-Reset | Unix timestamp when limit resets |
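A sketch of reading these headers from a fetch response so a client can slow down before hitting a limit. The base URL is a placeholder, and the threshold of 5 remaining requests is arbitrary:

```typescript
// Inspect rate-limit headers on each response and pause before exhausting the window.
async function submitScore(payload: unknown): Promise<Response> {
  const res = await fetch("https://api.example.com/v1/score", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(payload),
  });

  const remaining = Number(res.headers.get("X-RateLimit-Remaining"));
  const resetAt = Number(res.headers.get("X-RateLimit-Reset")); // Unix timestamp
  if (remaining < 5) {
    const waitMs = Math.max(0, resetAt * 1000 - Date.now());
    await new Promise((resolve) => setTimeout(resolve, waitMs)); // wait for window reset
  }
  return res;
}
```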
Checking Rate Limit Status
Use the dedicated status endpoint to check your current rate limit usage across all categories without consuming any rate limit points:

| Field | Description |
|---|---|
| category | Machine-readable category identifier |
| displayName | Human-readable name for display in dashboards |
| endpoints | List of endpoint patterns that count against this limit |
| limit | Maximum requests allowed per window |
| used | Requests consumed in current window |
| remaining | Requests remaining in current window |
| resetAt | Unix timestamp when the window resets (0 if no requests made yet) |
| windowSeconds | Window duration in seconds |
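Based on the field table, each category in the response can be modeled as follows. This is a sketch; the exact endpoint path and response envelope aren’t shown on this page:

```typescript
// Shape of one rate-limit category, derived from the field table above.
interface RateLimitCategory {
  category: string;      // machine-readable identifier
  displayName: string;   // human-readable name for dashboards
  endpoints: string[];   // endpoint patterns counted against this limit
  limit: number;         // max requests per window
  used: number;          // requests consumed in the current window
  remaining: number;     // requests remaining in the current window
  resetAt: number;       // Unix timestamp of window reset (0 if unused)
  windowSeconds: number; // window duration in seconds
}
```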
This endpoint is useful for building dashboards that display your current API usage, or for proactively throttling requests before hitting limits. The endpoints array shows exactly which API calls count against each category’s limit.

Handling 429 Responses
When rate limited, you’ll receive:

- An HTTP 429 Too Many Requests status code
- A Retry-After header with seconds until reset
Recommended Client Behavior
1. Check Headers: monitor X-RateLimit-Remaining to anticipate limits before hitting them.
2. Respect Retry-After: when you receive a 429, wait for the Retry-After duration before retrying.
3. Use Exponential Backoff: if retries continue to fail, increase wait times exponentially with jitter.
4. Queue Requests: implement client-side request queuing rather than tight retry loops.
Implementation Example
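A minimal sketch combining the steps above: honor Retry-After on a 429, and fall back to exponential backoff with jitter when the header is absent. The URL and request options are whatever your call needs; nothing here is specific to one endpoint:

```typescript
// Retry a request on 429, honoring Retry-After and falling back to
// exponential backoff with jitter.
async function requestWithRetry(
  url: string,
  init: RequestInit,
  maxRetries = 5,
): Promise<Response> {
  for (let attempt = 0; ; attempt++) {
    const res = await fetch(url, init);
    if (res.status !== 429) return res;
    if (attempt >= maxRetries) throw new Error("Rate limited: retries exhausted");

    // Prefer the server's Retry-After hint (seconds until the window resets).
    const retryAfter = Number(res.headers.get("Retry-After"));
    const backoffMs =
      Number.isFinite(retryAfter) && retryAfter > 0
        ? retryAfter * 1000
        : Math.min(60_000, 1000 * 2 ** attempt) + Math.random() * 1000; // backoff + jitter

    await new Promise((resolve) => setTimeout(resolve, backoffMs));
  }
}
```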
Fair Use
Criteria and question generation is included in your per-application pricing:

| Endpoint | Expected Usage |
|---|---|
| POST /v1/criteria/generate | Once per job |
| POST /v1/criteria/questions | Once per job |
| POST/PATCH /v1/jobs/{jobId}/criteria | Anytime (no AI cost) |
Batch Processing Best Practices
When processing large batches of candidates:

- Submit at a steady rate: rather than submitting 10,000 candidates instantly, spread submissions over time. At the 1,000 req/min limit, you can submit ~16 per second sustainably (see the sketch after this list).
- Trust the queue: our queue handles bursts gracefully. You don’t need to throttle submissions to match processing speed; just stay within API rate limits.
- Use webhooks for results: don’t poll for results. Webhooks deliver scores as soon as processing completes, regardless of queue position.
- Monitor queue depth: if you’re submitting faster than processing, jobs will queue. This is expected and normal; jobs are processed in order.
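A sketch of the steady-rate submission described above. The base URL and candidate payload shape are placeholders; the ~16 req/s pacing follows from the 1,000 req/min limit on POST /v1/score:

```typescript
// Submit a large batch at a steady ~16 requests/second instead of all at once,
// staying under the 1,000 req/min limit on POST /v1/score.
async function submitBatch(candidates: unknown[], requestsPerSecond = 16): Promise<void> {
  const intervalMs = 1000 / requestsPerSecond;
  for (const candidate of candidates) {
    // Fire the request, then pace the next submission; results arrive via webhook.
    fetch("https://api.example.com/v1/score", {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(candidate),
    }).catch((err) => console.error("Submission failed:", err));
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
}
```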