Free
$0
Try it out, evaluate, ship side projects.
- 50 documents/month
- Up to 3 templates
- Visual field-picker + review queue
- Community support
- Extraction tiers
  - Standard: ✓ Included
  - Premium: unavailable
  - Multi-pass: unavailable
Pricing
Apache-2.0 means you can run DocuExtract on your own infrastructure at any volume, forever, for $0. The hosted tier exists for users who don't want to run infrastructure — and the paid plans pay for the GPU compute, not the software.
Plans
Every plan includes the visual field-picker, human-in-the-loop (HITL) review queue, verbatim grounding, audit trail, all supported languages, and API access. Higher plans bundle more extraction tiers; see the “Extraction tiers” section on each card for exactly what's included and what is billed as overage.
$0
Try it out, evaluate, ship side projects.
$0
Run it on your own infrastructure. Apache-2.0.
$99/mo
Indie consultants, accountants, small ops teams.
$499/mo
Mid-market ops teams running real volume.
$1,999/mo
High-volume SMB, scaleups, regulated workflows.
$4,999/mo
Mid-enterprise. Dedicated capacity, custom limits.
How extraction tiers work
Every plan includes Standard tier extractions (Tier 0–2 OCR: embedded text, Tesseract, PaddleOCR) up to the plan's monthly quota. Standard handles 80% of typical documents: born-digital PDFs, clean English/Spanish scans, and structured forms.
Premium tier (adds vision-LLM OCR for handwriting, multilingual documents, and degraded scans) and Multi-pass (2-pass agreement + tiebreaker for high-stakes accuracy) are bundled into higher plans and available as per-doc overage on lower ones at a plan-discounted rate. Each plan card shows exactly what's included and what costs extra.
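To make the tier difference concrete, here is a minimal Python sketch of how such a cascade could route a document. It is not DocuExtract's actual code: the `run_cascade` function, the stage signature, the stage names, and the 0.85 confidence threshold are all assumptions made for illustration. It only shows the idea described above: Standard stops after the Tier 0–2 stages and sends hard cases to the review queue, while Premium gets one more attempt with a vision-LLM stage.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

# Hypothetical stage signature: takes raw document bytes,
# returns (extracted text, confidence between 0 and 1).
OcrStage = Callable[[bytes], Tuple[str, float]]


@dataclass
class CascadeResult:
    text: Optional[str]    # best text found, or None if no stage ran
    stage: Optional[str]   # name of the stage that produced `text`
    needs_review: bool     # True when the document should go to the review queue


def run_cascade(
    doc: bytes,
    standard_stages: List[Tuple[str, OcrStage]],
    premium_stage: Optional[OcrStage] = None,
    min_confidence: float = 0.85,  # assumed threshold, not a documented value
) -> CascadeResult:
    """Try cheap stages first; stop at the first confident result."""
    stages = list(standard_stages)
    if premium_stage is not None:
        # Premium adds a vision-LLM stage after the Standard stages.
        stages.append(("tier3-vision-llm", premium_stage))

    best_text, best_stage, best_conf = None, None, -1.0
    for name, stage in stages:
        text, conf = stage(doc)
        if conf >= min_confidence:
            return CascadeResult(text, name, needs_review=False)
        if conf > best_conf:
            best_text, best_stage, best_conf = text, name, conf

    # No stage was confident enough: keep the best guess and flag it for review.
    return CascadeResult(best_text, best_stage, needs_review=True)
```

With only Standard stages supplied, a low-confidence document comes back with `needs_review=True`. Passing a `premium_stage` gives the same document one more chance before it reaches a human.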
API pricing (or plan overage)
Pricing is per call and tier-based, so simple documents don't subsidize hard ones. These are the list rates: API-only customers pay them as-is, and hosted plans pay them at a discount as overage for tiers not included in the subscription. A $25/mo minimum applies to API-only access; hosted plans already cover the always-on infrastructure.
Standard: $0.10/doc
Tier 0–2 OCR (embedded text + Tesseract + PaddleOCR). Hard cases route to the review queue rather than escalating.
Best for: Batch invoice/form workflows, born-digital PDFs.
Premium: $0.25/doc
Full cascade including Tier 3 vision-LLM OCR (Qwen 2.5-VL 7B). Handles handwriting, multilingual scans, complex layouts.
Best for: Documents that defeat traditional OCR.
Multi-pass: $0.40/doc
Full cascade + two independent LLM extraction passes with disagreement detection. When the passes disagree, a third call acts as tiebreaker; anything still unresolved is routed to HITL review.
Best for: High-stakes extractions: legal, medical, financial.
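The two-pass agreement flow can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the product's implementation: `multi_pass_extract`, the field-level comparison, and the rule that the tiebreaker must match one of the two passes are all hypothetical choices. The extractor callables stand in for whatever LLM calls are actually made.

```python
from typing import Callable, Dict, List, Optional, Tuple

Extraction = Dict[str, str]                 # field name -> extracted value
ExtractFn = Callable[[str], Extraction]     # hypothetical extractor signature


def multi_pass_extract(
    document_text: str,
    pass_a: ExtractFn,
    pass_b: ExtractFn,
    tiebreaker: ExtractFn,
) -> Tuple[Extraction, List[str]]:
    """Return (accepted fields, fields that need human review)."""
    a = pass_a(document_text)
    b = pass_b(document_text)

    accepted: Extraction = {}
    needs_review: List[str] = []
    third: Optional[Extraction] = None      # tiebreaker result, computed lazily

    for field in sorted(set(a) | set(b)):
        va, vb = a.get(field), b.get(field)
        if va is not None and va == vb:
            accepted[field] = va            # both passes agree
            continue

        # Disagreement detected: make the third call at most once per document.
        if third is None:
            third = tiebreaker(document_text)
        vt = third.get(field)
        if vt is not None and vt in (va, vb):
            accepted[field] = vt            # tiebreaker sides with one pass
        else:
            needs_review.append(field)      # still unresolved: route to HITL

    return accepted, needs_review
```

In this sketch the third call is made only when at least one field disagrees, so documents on which both passes agree cost two calls rather than three.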
Volume discounts kick in at 10K docs/mo. Annual commits get 15–20% off. Contact us for enterprise pricing — we'll quote against your real workload, not list price.
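As a rough worked example of the API-only list rates, the Python sketch below estimates a monthly bill. It rests on two assumptions: that the $0.10, $0.25, and $0.40 rates correspond to the Standard, Premium, and Multi-pass tiers respectively, and that the $25/mo minimum acts as a floor on the bill rather than an extra fee. The function and constant names are hypothetical. Volume and annual discounts are left out because the page does not give exact figures for them.

```python
from decimal import Decimal
from typing import Dict

# List rates per document, taken from the tiers above (tier mapping is assumed).
LIST_RATES = {
    "standard": Decimal("0.10"),
    "premium": Decimal("0.25"),
    "multi_pass": Decimal("0.40"),
}
API_MONTHLY_MINIMUM = Decimal("25.00")  # treated here as a billing floor (assumption)


def api_only_monthly_cost(doc_counts: Dict[str, int]) -> Decimal:
    """Estimate an API-only monthly bill at list price, before any discounts."""
    usage = sum(
        (LIST_RATES[tier] * count for tier, count in doc_counts.items()),
        Decimal("0"),
    )
    return max(usage, API_MONTHLY_MINIMUM)


# 100 Standard docs is $10.00 of usage, so the $25 minimum applies.
small = api_only_monthly_cost({"standard": 100})

# 1,000 Standard + 200 Premium + 50 Multi-pass = $100 + $50 + $20 = $170.
mixed = api_only_monthly_cost({"standard": 1000, "premium": 200, "multi_pass": 50})
```

`Decimal` is used instead of floats so that per-document cents add up exactly.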
Enterprise · Managed · Consulting
DocuExtract is open source — you can run it yourself, forever, for free. When that isn't practical (regulated industry, scale, custom models, dedicated SLA), Inspire AI Lab offers managed deployment and custom engineering on top of it.
Custom
100,000+ docs/month, custom SLA, single-vendor procurement.
$25K–60K/yr
We run DocuExtract inside your VPC. On your hardware, on your models.
Project quote
Inspire AI Lab consulting engagement on top of Extract.
FAQ
50 docs/month is enough to evaluate against a real workload. Upgrade, downgrade, or switch to self-host whenever you want.