Asynchronous Extraction

Use the canonical async job API to create a job, poll its status, and fetch the result:

  1. POST /api/v1/extraction-jobs
  2. GET /api/v1/extraction-jobs/:id
  3. GET /api/v1/extraction-jobs/:id/result

Every request carries the API key header:

  • Authorization: api-key <environment-api-key>

Job reads are scoped to the caller’s environment. The job id alone is not sufficient: a read with an API key from a different environment returns 404 Not Found.

Create a job:

  • Path: /api/v1/extraction-jobs
  • Method: POST

Send the same top-level extraction fields used by sync extraction, plus async delivery selection:

  1. file: required
  2. filename: optional for JSON base64 uploads
  3. templateName: required unless filterName is used
  4. filterName: optional
  5. documentSplitting: optional boolean
  6. returnDocuments: optional boolean
  7. returnText: optional boolean
  8. documentReview: optional whole number from 1 to 99
  9. schemaChunking: optional, auto or combined
  10. password: optional string
  11. deliveryMode: optional, poll or webhook

Optional header:

  1. Idempotency-Key

Contract rules:

  1. If deliveryMode is omitted, the runtime defaults to poll
  2. Nested options.* is rejected
  3. delivery.mode is accepted as a compatibility alias for deliveryMode
  4. An idempotent replay returns the existing job instead of creating a second one
  5. returnBoundingBoxes is deprecated and ignored; new clients should omit it
  6. documentReview enables a manual review gate when extracted field confidence is below the threshold
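
The field list and contract rules above can be checked on the client before a request is sent. The following is a minimal sketch, not part of the API: build_job_fields is a hypothetical helper name, and the checks only mirror the rules stated on this page (the server remains the authority).

```python
from typing import Any, Dict, Optional

SCHEMA_CHUNKING = {"auto", "combined"}
DELIVERY_MODES = {"poll", "webhook"}


def build_job_fields(
    template_name: Optional[str] = None,
    filter_name: Optional[str] = None,
    document_review: Optional[int] = None,
    schema_chunking: Optional[str] = None,
    delivery_mode: Optional[str] = None,
    **extra: Any,
) -> Dict[str, Any]:
    """Hypothetical helper: assemble top-level fields or raise ValueError."""
    # Nested options.* is rejected by the API, so fail early.
    if "options" in extra:
        raise ValueError("nested options.* is rejected; send top-level fields")
    if not template_name and not filter_name:
        raise ValueError("templateName is required unless filterName is used")
    if document_review is not None:
        is_whole = isinstance(document_review, int) and not isinstance(document_review, bool)
        if not is_whole or not 1 <= document_review <= 99:
            raise ValueError("documentReview must be a whole number from 1 to 99")
    if schema_chunking is not None and schema_chunking not in SCHEMA_CHUNKING:
        raise ValueError("schemaChunking must be auto or combined")
    if delivery_mode is not None and delivery_mode not in DELIVERY_MODES:
        raise ValueError("deliveryMode must be poll or webhook")

    # returnBoundingBoxes is deprecated and ignored, so it is dropped here.
    extra.pop("returnBoundingBoxes", None)

    named = {
        "templateName": template_name,
        "filterName": filter_name,
        "documentReview": document_review,
        "schemaChunking": schema_chunking,
        "deliveryMode": delivery_mode,
    }
    # Omitted fields are left out so the server applies its defaults
    # (for example, deliveryMode falls back to poll).
    fields = {key: value for key, value in named.items() if value is not None}
    fields.update(extra)
    return fields
```

Other top-level fields such as documentSplitting or returnText can be passed through as keyword arguments under their API names, for example build_job_fields(template_name="invoice", returnText=True).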

New job:

{
  "success": true,
  "job": {
    "id": "uuid",
    "status": "queued",
    "statusUrl": "/api/v1/extraction-jobs/uuid",
    "retryAfterMs": 5000
  }
}

Idempotent replay:

  1. HTTP 200 OK
  2. X-Idempotent-Replay: true
  3. Same response body shape as normal create

Status codes:

  1. 202 Accepted: job created
  2. 200 OK: idempotent replay returned an existing job
  3. 400 Bad Request: invalid request shape, invalid delivery mode, or unresolved webhook coverage
  4. 401 Unauthorized: missing or invalid API key
  5. 404 Not Found: requested template/filter-scoped resource not found
  6. 409 Conflict: the same Idempotency-Key is currently being created and no completed mapping is available yet
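
A client usually needs to tell a fresh job from an idempotent replay and from an in-flight key conflict. The sketch below shows one way to branch on the documented status codes and the X-Idempotent-Replay header. The function name and the outcome labels are illustrative assumptions, not part of the API.

```python
from typing import Mapping


def classify_create_response(status: int, headers: Mapping[str, str]) -> str:
    """Hypothetical helper: map a create response to a client-side outcome."""
    # HTTP header names are case-insensitive, so normalise before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    replay_header = lowered.get("x-idempotent-replay", "").strip().lower() == "true"

    if status == 202:
        return "created"
    if status == 200 or replay_header:
        # An existing job was returned; the body has the normal create shape.
        return "replayed"
    if status == 409:
        # The same Idempotency-Key is still being created; retry shortly.
        return "retry_later"
    if status in (400, 401, 404):
        return "rejected"
    return "unexpected"
```

Both "created" and "replayed" carry a usable job id, so a caller can start polling in either case.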

Example request:

curl -X POST "$BASE_URL/api/v1/extraction-jobs" \
  -H "Authorization: api-key $API_KEY" \
  -H "Idempotency-Key: inv-1001-upload-1" \
  -F "file=@invoice.pdf" \
  -F "templateName=invoice" \
  -F "deliveryMode=poll"

Example response:

{
  "success": true,
  "job": {
    "id": "2c0d2f0e-1f4e-4c11-a0d6-64cc7b71697d",
    "status": "queued",
    "statusUrl": "/api/v1/extraction-jobs/2c0d2f0e-1f4e-4c11-a0d6-64cc7b71697d",
    "retryAfterMs": 5000
  }
}
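
The field list mentions JSON base64 uploads as an alternative to multipart form data. The sketch below builds such a request with the Python standard library without sending it. It rests on assumptions this page does not confirm: that file takes a plain base64 string in a JSON body and that Content-Type is application/json. The helper name build_create_request is hypothetical.

```python
import base64
import json
import urllib.request
from typing import Optional


def build_create_request(
    base_url: str,
    api_key: str,
    file_bytes: bytes,
    filename: str,
    template_name: str,
    idempotency_key: Optional[str] = None,
) -> urllib.request.Request:
    """Hypothetical helper: prepare (but do not send) a JSON create request."""
    body = {
        # Assumption: the JSON upload carries the raw base64 string in "file".
        "file": base64.b64encode(file_bytes).decode("ascii"),
        "filename": filename,
        "templateName": template_name,
        "deliveryMode": "poll",
    }
    headers = {
        "Authorization": f"api-key {api_key}",
        "Content-Type": "application/json",
    }
    if idempotency_key:
        headers["Idempotency-Key"] = idempotency_key
    return urllib.request.Request(
        base_url.rstrip("/") + "/api/v1/extraction-jobs",
        data=json.dumps(body).encode("utf-8"),
        headers=headers,
        method="POST",
    )
```

Passing the returned object to urllib.request.urlopen would send it. A stable Idempotency-Key lets a retry after a network error return the existing job instead of creating a second one.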

Get job status:

  • Path: /api/v1/extraction-jobs/:id
  • Method: GET

Queued:

{
  "success": true,
  "job": {
    "id": "uuid",
    "status": "queued",
    "createdAt": "2026-03-12T10:00:00.000Z",
    "updatedAt": "2026-03-12T10:00:00.000Z",
    "retryAfterMs": 5000
  }
}

Processing:

{
  "success": true,
  "job": {
    "id": "uuid",
    "status": "processing",
    "createdAt": "2026-03-12T10:00:00.000Z",
    "updatedAt": "2026-03-12T10:00:20.000Z",
    "progress": 45,
    "retryAfterMs": 5000
  }
}

Review required:

{
  "success": true,
  "job": {
    "id": "uuid",
    "status": "review_required",
    "createdAt": "2026-03-12T10:00:00.000Z",
    "updatedAt": "2026-03-12T10:01:00.000Z",
    "retryAfterMs": 5000,
    "resultAvailable": false,
    "reviewRequiredAt": "2026-03-12T10:01:00.000Z",
    "reviewThreshold": 92,
    "reviewFlaggedFieldCount": 3
  }
}

Completed:

{
  "success": true,
  "job": {
    "id": "uuid",
    "status": "completed",
    "createdAt": "2026-03-12T10:00:00.000Z",
    "updatedAt": "2026-03-12T10:01:03.000Z",
    "completedAt": "2026-03-12T10:01:03.000Z",
    "resultAvailable": true,
    "resultUrl": "/api/v1/extraction-jobs/uuid/result"
  }
}

Failed:

{
  "success": true,
  "job": {
    "id": "uuid",
    "status": "failed",
    "createdAt": "2026-03-12T10:00:00.000Z",
    "updatedAt": "2026-03-12T10:00:45.000Z",
    "failedAt": "2026-03-12T10:00:45.000Z",
    "error": {
      "message": "Document processing failed.",
      "code": "PROCESSING_ERROR"
    }
  }
}

Status codes:

  1. 200 OK: status returned
  2. 401 Unauthorized: missing or invalid API key
  3. 404 Not Found: job missing or outside the caller’s environment scope
  4. 429 Too Many Requests: per-job status polling limit exceeded

When a job is review_required, the extraction result is intentionally held back until the configured review process approves it. Poll the status endpoint again after review, or wait for the completed webhook if webhook delivery is enabled.
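
The status payloads above suggest a simple polling loop: wait for the retryAfterMs hint while the job is queued, processing, or review_required, and stop on completed or failed. The sketch below keeps HTTP out of the loop by taking a fetch_status callable that returns the parsed status body. That callable, the function name, and the max_polls and stop_on_review parameters are illustrative assumptions.

```python
import time
from typing import Any, Callable, Dict

TERMINAL_STATUSES = {"completed", "failed"}


def poll_job(
    fetch_status: Callable[[], Dict[str, Any]],
    sleep: Callable[[float], None] = time.sleep,
    max_polls: int = 120,
    default_delay_ms: int = 5000,
    stop_on_review: bool = False,
) -> Dict[str, Any]:
    """Hypothetical helper: poll until the job reaches a terminal status."""
    for _ in range(max_polls):
        job = fetch_status()["job"]
        status = job["status"]
        if status in TERMINAL_STATUSES:
            return job
        if stop_on_review and status == "review_required":
            # The result is held back until review approves it, so a caller
            # may prefer to stop here and resume polling later.
            return job
        # Honour the server's hint; fall back to a default when it is absent.
        sleep(job.get("retryAfterMs", default_delay_ms) / 1000)
    raise TimeoutError("job did not reach a terminal status in time")
```

Waiting for the hinted interval also helps a client stay under the per-job polling limit that produces 429 Too Many Requests. Handling of 401, 404, and 429 responses is left to the fetch_status implementation.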

Get job result:

  • Path: /api/v1/extraction-jobs/:id/result
  • Method: GET

Completed result:

{
  "success": true,
  "job": {
    "id": "uuid",
    "status": "completed",
    "completedAt": "2026-03-12T10:01:03.000Z"
  },
  "data": {
    "completionTime": 63.2,
    "originalName": "invoice.pdf",
    "processedPages": 3,
    "documents": []
  }
}

The data payload matches the async extraction result and may include:

  1. completionTime
  2. originalName
  3. processedPages
  4. documents[] with the same document fields used by sync extraction

Status codes:

  1. 200 OK: completed result returned
  2. 401 Unauthorized: missing or invalid API key
  3. 404 Not Found: job missing or outside the caller’s environment scope
  4. 409 Conflict: job exists but result is not ready yet (RESULT_NOT_READY)
  5. 409 Conflict: job is awaiting manual review (REVIEW_REQUIRED)
  6. 409 Conflict: job failed (JOB_FAILED)
  7. 410 Gone: job completed earlier but the stored result is no longer available (RESULT_EXPIRED)
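
Because three different conditions share 409 Conflict, a client has to look at the error code rather than the status alone. The sketch below shows one way to branch. It assumes the error body mirrors the error.code shape shown in the failed status payload, which this page does not confirm for the result endpoint. The function name and outcome labels are hypothetical.

```python
from typing import Any, Dict, Optional, Tuple

CONFLICT_OUTCOMES = {
    "RESULT_NOT_READY": "not_ready",
    "REVIEW_REQUIRED": "awaiting_review",
    "JOB_FAILED": "failed",
}


def interpret_result(
    status: int, body: Dict[str, Any]
) -> Tuple[str, Optional[Dict[str, Any]]]:
    """Hypothetical helper: return (outcome, data) for a result response."""
    if status == 200:
        return "ok", body["data"]
    if status == 409:
        # Assumption: the code sits at body["error"]["code"].
        code = (body.get("error") or {}).get("code")
        return CONFLICT_OUTCOMES.get(code, "conflict"), None
    if status == 410:
        # The stored result is no longer available; polling will not bring it back.
        return "expired", None
    if status == 401:
        return "unauthorized", None
    if status == 404:
        return "not_found", None
    return "unexpected", None
```

Only "not_ready" and "awaiting_review" are worth retrying. "failed" and "expired" are final for that job.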

End-to-end example:

Create the job:

curl -X POST "$BASE_URL/api/v1/extraction-jobs" \
  -H "Authorization: api-key $API_KEY" \
  -F "file=@invoice.pdf" \
  -F "templateName=invoice"

Poll the status:

curl -X GET "$BASE_URL/api/v1/extraction-jobs/$JOB_ID" \
  -H "Authorization: api-key $API_KEY"

Fetch the result:

curl -X GET "$BASE_URL/api/v1/extraction-jobs/$JOB_ID/result" \
  -H "Authorization: api-key $API_KEY"