Layout-aware
Tables become real Markdown tables. Formulas become LaTeX. Handwriting becomes readable text. Column order preserved.
A free, keyless OCR API backed by open-weight VLMs. Markdown or compilable LaTeX out. Three lines of Python or TypeScript — no signup, no key, no BS. The first of many endpoints on CodeSOTA.
# pip install hardparse
from hardparse import parse
md = parse("invoice.pdf")
print(md) # → Markdown with tables, formulas, layout
Most OCR tools flatten documents into sloppy text. Hardparse preserves structure — so the Markdown you get is the Markdown you'd write.
Markdown for RAG and LLM pipelines. A compilable standalone .tex for papers. Same API, swap one flag.
Open-weight models. Benchmarked publicly on CodeSOTA. No black box — you can audit, self-host, or fork.
Ship without waiting for procurement. The API is anonymous, rate-limited by IP, and the SDKs carry zero dependencies. Email hi@hardparse.com for higher limits.
# pip install hardparse
from hardparse import parse, parse_latex
# Markdown out
md = parse("invoice.pdf")
# Or LaTeX — compilable standalone document
tex = parse_latex("paper.pdf")
with open("out.tex", "w", encoding="utf-8") as f:
    f.write(tex)
# $ xelatex out.tex → out.pdf
// npm i @codesota/ocr
import { writeFile } from "node:fs/promises"; // Node only, used to save the .tex below
import { parse, parseLatex } from "@codesota/ocr";
// Markdown out
const md = await parse(file); // File | Blob | path
// Or LaTeX — compilable standalone document
const tex = await parseLatex("paper.pdf");
await writeFile("out.tex", tex);
// $ xelatex out.tex → out.pdf
# Markdown out (default)
curl -F "file=@invoice.pdf" \
https://hardparse.com/v1/parse
# LaTeX out — compilable standalone .tex
curl -F "file=@paper.pdf" \
"https://hardparse.com/v1/parse?format=latex"
Zero deps. Runs in Node 18+, browsers, Python 3.9+. Source on GitHub.
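If you would rather not install an SDK, the curl calls above can be approximated with the Python standard library alone. This is a minimal sketch, not the SDK's implementation: `build_request` and `parse_over_http` are hypothetical helper names, and the assumption that the response body is the output text itself (rather than, say, a JSON envelope) is not confirmed by this page. Only the endpoint, the `file` form field, and the `format=latex` query parameter come from the examples above.

```python
import os
import uuid
from urllib import request

API_URL = "https://hardparse.com/v1/parse"  # endpoint taken from the curl examples


def build_request(filename: str, data: bytes, fmt: str = "markdown") -> request.Request:
    """Build (but do not send) a multipart POST equivalent to `curl -F "file=@..."`."""
    boundary = uuid.uuid4().hex
    body = b"".join([
        f"--{boundary}\r\n".encode(),
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'.encode(),
        b"Content-Type: application/octet-stream\r\n\r\n",
        data,
        f"\r\n--{boundary}--\r\n".encode(),
    ])
    # Markdown is the default output, so only non-default formats add a query string.
    url = API_URL if fmt == "markdown" else f"{API_URL}?format={fmt}"
    return request.Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
    )


def parse_over_http(path: str, fmt: str = "markdown") -> str:
    """Send a local file to the API and return the raw response body as text."""
    with open(path, "rb") as fh:
        req = build_request(os.path.basename(path), fh.read(), fmt)
    # Assumption: the body is UTF-8 text; adjust if the API returns JSON.
    with request.urlopen(req) as resp:
        return resp.read().decode("utf-8")
```

Calling `parse_over_http("paper.pdf", fmt="latex")` would mirror the second curl command. Keeping request construction separate from sending makes the multipart body easy to inspect without touching the network.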
Same pipeline, same models. No install. Supports PDF, images, and scans up to 200 MB.
Three stages, one request. We detect the layout, recognize the content, and reconstruct it as structured Markdown.
A document layout model finds tables, formulas, text blocks, figures, and their reading order — even across multi-column PDFs.
Each region goes to a vision-language model tuned for its type — table → OTSL, formula → LaTeX, text → plain. Concurrent on GPU.
Results merge back in the original reading order. You get a single Markdown string plus per-page structure for downstream LLMs or RAG.
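The three stages above can be sketched in a few lines of Python. This is an illustration of the dispatch-then-merge idea only, under loud assumptions: the `Region` type, the recognizer functions, and `reconstruct` are hypothetical stand-ins, the recognizers return placeholder strings instead of calling a vision-language model, and a thread pool stands in for concurrent GPU inference.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass
class Region:
    order: int    # reading-order index assigned by layout detection
    kind: str     # "table", "formula", or "text"
    content: str  # stand-in for the cropped image of this region


# Hypothetical recognizers: placeholders for the per-type models.
def recognize_table(region: Region) -> str:
    return f"| {region.content} |"


def recognize_formula(region: Region) -> str:
    return f"$${region.content}$$"


def recognize_text(region: Region) -> str:
    return region.content


RECOGNIZERS = {
    "table": recognize_table,
    "formula": recognize_formula,
    "text": recognize_text,
}


def reconstruct(regions: list[Region]) -> str:
    """Recognize every region concurrently, then merge in reading order."""
    with ThreadPoolExecutor() as pool:
        results = list(
            pool.map(lambda r: (r.order, RECOGNIZERS[r.kind](r)), regions)
        )
    # Sorting by the layout model's order index restores the reading order,
    # regardless of the order in which regions were detected or finished.
    return "\n\n".join(text for _, text in sorted(results))
```

Because each region carries its own reading-order index, recognition can run in any order and the merge step stays a simple sort.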
Compared with the alternatives: vendor APIs charge per page and give you less structure. Open models match or beat them at a fraction of the cost.
| Service | Price | Quality |
|---|---|---|
| Hardparse ● live | $19/mo flat · 100/day free | 🟢 VLM on GPU |
| Google Document AI | $0.10–1.50/page | 🟢 strong |
| Azure Doc Intelligence | $0.50–10/1K pages | 🟢 strong |
| Tesseract | free | 🔴 struggles with tables |
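To make the price comparison concrete, here is a small break-even calculation using only the figures in the table. It is a rough sketch: the helper name `break_even_pages` is made up for illustration, and it ignores free tiers and any volume discounts the vendors may offer.

```python
FLAT_MONTHLY = 19.00  # Hardparse flat rate from the table, USD per month

# Per-page price ranges from the table, converted to USD per page.
PER_PAGE_RANGES = {
    "Google Document AI": (0.10, 1.50),
    "Azure Doc Intelligence": (0.50 / 1000, 10 / 1000),
}


def break_even_pages(price_per_page: float) -> float:
    """Monthly page count at which per-page billing equals the flat rate."""
    return FLAT_MONTHLY / price_per_page


for service, (low, high) in PER_PAGE_RANGES.items():
    # The highest per-page price gives the smallest break-even volume.
    print(
        f"{service}: flat rate wins above "
        f"{break_even_pages(high):,.0f} to {break_even_pages(low):,.0f} pages/month"
    )
```

At the listed prices, the flat rate pays for itself somewhere between about 13 and 190 pages a month against Google Document AI, and between 1,900 and 38,000 pages a month against Azure Doc Intelligence, depending on which end of each range applies.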
Our OCR model selection isn't vibes. CodeSOTA tracks 164+ models across 97 real-world benchmarks — with reproducible scores, cost, and licensing. Dig in before you trust us.
Browse the benchmarks
Yes. 5 pages per month free, no card required. After that, $19/month flat with no page limit. I'll raise the price once real traffic shows up.
Yes: free and keyless. POST /v1/parse or use the Python / TypeScript SDK. 100 requests per day per IP. Details are in the API section above.
Not yet. If you need on-prem for compliance reasons, get in touch and we'll work something out.
I needed a solid OCR for my own RAG pipeline, one that preserves tables, formulas, and layout instead of flattening everything into sloppy text. Existing solutions either lost structure or got expensive fast. So I built this. Now you can use it too.
One endpoint per AI task. Backed by benchmarks you can audit. TTS, STT, and segmentation are next.