Hardparse API

Submit documents, get structured output. Async job queue with real-time polling. Drop-in replacement for legacy OCR services.

Base URL: https://hardparse.com

Quick Start

Extract text from a document in three steps.

1. Submit a document
Upload an image or PDF to the OCR endpoint.
curl -X POST https://HOST/api/ocr -F "file=@invoice.pdf"

2. Poll for results
Check the job status every 2 seconds until it is done or error.
curl https://HOST/api/jobs/{job_id}

3. Get structured output
Read result_md or result_json.
{ "status": "done", "result_md": "# Invoice\n...", "elapsed": 13.6 }
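The polling step can also be scripted. Below is a minimal Python sketch using only the standard library, not an official client. The names fetch_job and wait_for_job and the 300-second client-side timeout are illustrative assumptions. No API key is sent, because this page does not say how the key is passed.

```python
import json
import time
import urllib.request

BASE_URL = "https://hardparse.com"  # base URL listed at the top of this page


def fetch_job(job_id, base_url=BASE_URL):
    """GET /api/jobs/{job_id} and decode the JSON body."""
    with urllib.request.urlopen(f"{base_url}/api/jobs/{job_id}") as resp:
        return json.load(resp)


def wait_for_job(job_id, fetch=fetch_job, interval=2.0, timeout=300.0,
                 sleep=time.sleep):
    """Poll every `interval` seconds until status is "done" or "error".

    `timeout` is a client-side guard chosen for this sketch, not an API limit.
    """
    waited = 0.0
    while True:
        job = fetch(job_id)
        if job.get("status") in ("done", "error"):
            return job
        if waited >= timeout:
            raise TimeoutError(f"job {job_id} not finished after {waited:.0f}s")
        sleep(interval)
        waited += interval
```

Call wait_for_job(job_id) with the ID returned by the submit step, then read result_md or result_json from the returned job when its status is done.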
Auth
POST /api/signup  Create account and get API key
Create an account with your email. Returns an API key and 500 free pages. No credit card required.
Request Body
{ "email": "you@company.com" }
Responses
200  Account created. API key and credits returned.
400  Invalid email address.
409  Account already exists.
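A signup call could look like the Python sketch below. It is an illustration under assumptions: the helper names are made up, the Content-Type header is inferred from the JSON request body shown above, and the response is returned as decoded JSON without naming its fields, since this page does not list them.

```python
import json
import urllib.error
import urllib.request


def build_signup_request(email, base_url="https://hardparse.com"):
    """Build the POST /api/signup request with the documented JSON body."""
    body = json.dumps({"email": email}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/api/signup",
        data=body,
        # Assumption: a JSON body is sent with the usual JSON content type.
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def sign_up(email):
    """Send the signup request and map the documented error codes."""
    try:
        with urllib.request.urlopen(build_signup_request(email)) as resp:
            return json.load(resp)  # field names are not documented here
    except urllib.error.HTTPError as err:
        if err.code == 400:
            raise ValueError("invalid email address") from err
        if err.code == 409:
            raise ValueError("account already exists") from err
        raise
```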
OCR
POST /api/ocr  Submit document for OCR
Upload an image or PDF and receive a job ID. Poll GET /api/jobs/{job_id} for results.
  1. POST /api/ocr with your file to get a job_id
  2. Poll GET /api/jobs/{job_id} every 2 seconds
  3. When status is done, read result_md or result_json
Parameters
Name             In     Type    Description
file (required)  body   binary  Image or PDF (PNG, JPG, TIFF, WEBP, BMP, PDF). Max 20 MB.
format           query  string  One of "markdown" or "json". Default: "markdown"
language         query  string  Language hint (e.g. "polish", "english"). Default: "auto"
Responses
201  Job created and queued.
422  Validation error.
503  Queue full.
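For clients without curl, the upload can be assembled by hand. The Python sketch below builds a multipart/form-data request with the file field plus the format and language query parameters. It is a sketch, not a reference client: build_ocr_request is a hypothetical helper, and reading "20 MB" as 20 * 1024 * 1024 bytes is an assumption.

```python
import mimetypes
import urllib.parse
import urllib.request
import uuid

MAX_BYTES = 20 * 1024 * 1024  # assumption: "20 MB" counted in binary megabytes


def build_ocr_request(filename, data, fmt="markdown", language="auto",
                      base_url="https://hardparse.com"):
    """Build POST /api/ocr with `data` as the multipart "file" field."""
    if len(data) > MAX_BYTES:
        raise ValueError("file exceeds the 20 MB limit")
    if fmt not in ("markdown", "json"):
        raise ValueError('format must be "markdown" or "json"')

    boundary = uuid.uuid4().hex
    part_type = mimetypes.guess_type(filename)[0] or "application/octet-stream"
    body = b"".join([
        f"--{boundary}\r\n".encode(),
        f'Content-Disposition: form-data; name="file"; '
        f'filename="{filename}"\r\n'.encode(),
        f"Content-Type: {part_type}\r\n\r\n".encode(),
        data,
        f"\r\n--{boundary}--\r\n".encode(),
    ])
    query = urllib.parse.urlencode({"format": fmt, "language": language})
    return urllib.request.Request(
        f"{base_url}/api/ocr?{query}",
        data=body,
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
        method="POST",
    )
```

Pass the returned request to urllib.request.urlopen. A 201 response means the job was queued; 422 and 503 surface as urllib.error.HTTPError.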
Jobs
GET /api/jobs/{job_id}  Get job status and results
Returns full job details. When status is done, includes result_md and result_json.

Status flow: queued → processing → done | error
Parameters
Name               In    Type    Description
job_id (required)  path  string  Job ID from POST /api/ocr
Responses
200  Job details with results (when complete).
404  Job not found.
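Once a job record has been fetched, the status flow decides what to do with it. The Python sketch below shows one way to handle the four statuses. extract_result is a hypothetical helper, and the error message is generic because this page does not document an error-detail field.

```python
def extract_result(job, prefer="markdown"):
    """Return the finished result, or None while the job is still running.

    Raises RuntimeError when the job ended in the "error" state.
    """
    status = job.get("status")
    if status == "error":
        raise RuntimeError("OCR job failed")
    if status != "done":
        return None  # queued or processing: fetch the job again later
    # A finished job includes both fields; pick the one the caller wants.
    key = "result_json" if prefer == "json" else "result_md"
    return job.get(key)
```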
GET /api/jobs  List recent jobs
Returns summary fields (no full results). Sorted by creation time, newest first.
Parameters
Name    In     Type     Description
limit   query  integer  Max results (1-100). Default: 20
offset  query  integer  Skip N jobs. Default: 0
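The limit and offset parameters allow paging through the full job list. The Python sketch below builds the URL and walks the pages. It assumes, without confirmation from this page, that each response decodes to a JSON array of job summaries; list_jobs_url, iter_jobs, and the fetch_page callback are hypothetical names.

```python
import urllib.parse


def list_jobs_url(limit=20, offset=0, base_url="https://hardparse.com"):
    """Build the GET /api/jobs URL, enforcing the documented limit range."""
    if not 1 <= limit <= 100:
        raise ValueError("limit must be between 1 and 100")
    if offset < 0:
        raise ValueError("offset must be 0 or greater")
    query = urllib.parse.urlencode({"limit": limit, "offset": offset})
    return f"{base_url}/api/jobs?{query}"


def iter_jobs(fetch_page, limit=100):
    """Yield every job, newest first, one page at a time.

    `fetch_page(limit, offset)` must return one page as a list
    (assumption: the endpoint returns a plain JSON array).
    """
    offset = 0
    while True:
        page = fetch_page(limit, offset)
        yield from page
        if len(page) < limit:
            return  # a short page means there is nothing further
        offset += limit
```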
DELETE /api/jobs/{job_id}  Delete a job
Permanently removes the job record and uploaded file.
Responses
204  Deleted. No content.
404  Job not found.
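A delete call has no body to parse, so the status code is the whole result. The Python sketch below treats 204 as success and 404 as "nothing to delete"; treating 404 as a non-error is a design choice of this sketch, and delete_job and its send parameter are hypothetical names.

```python
import urllib.error
import urllib.parse
import urllib.request


def build_delete_request(job_id, base_url="https://hardparse.com"):
    """Build DELETE /api/jobs/{job_id}, escaping the ID for use in a path."""
    safe_id = urllib.parse.quote(job_id, safe="")
    return urllib.request.Request(f"{base_url}/api/jobs/{safe_id}",
                                  method="DELETE")


def delete_job(job_id, send=urllib.request.urlopen):
    """Return True if the job was deleted, False if it did not exist."""
    try:
        with send(build_delete_request(job_id)) as resp:
            return resp.status == 204
    except urllib.error.HTTPError as err:
        if err.code == 404:
            return False  # already gone or never existed
        raise
```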
Legacy
POST /ocr  Synchronous OCR (blocks until complete)
Blocks until processing finishes (up to 5 min). For production, prefer async POST /api/ocr.
Responses
200  Result returned directly as markdown or JSON.
503  Queue full.
504  Timeout (>5 min).
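A 503 means the queue is full at that moment, so a client may choose to wait and try again. The Python sketch below wraps any request function in exponential backoff. The retry policy (four attempts, delays doubling from 1 second) is an assumption of this sketch, not documented server guidance, and call_with_retry is a hypothetical name. A 504 is not retried here, since the request has already run for the full 5 minutes.

```python
import time
import urllib.error


def call_with_retry(send, attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call send(); on HTTP 503 wait and retry with exponential backoff.

    Any other HTTPError (including 504) is re-raised unchanged, as is a
    503 on the final attempt.
    """
    for attempt in range(attempts):
        try:
            return send()
        except urllib.error.HTTPError as err:
            last_try = attempt == attempts - 1
            if err.code != 503 or last_try:
                raise
            sleep(base_delay * 2 ** attempt)  # 1s, 2s, 4s, ...
```

When calling the blocking endpoint through urllib.request.urlopen, pass a client timeout longer than 5 minutes so the client does not give up before the server does.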
System
GET /health  Health check
Server status, queue depth, jobs processed, uptime.