DocuExtract

Privacy

How we handle your data.

Stub. The formal privacy policy lands before the hosted tier accepts paid sign-ups. In the meantime, here's our position in plain language.

Self-host

When you run DocuExtract on your own infrastructure (Docker Compose, Kubernetes, bare metal, your AWS/GCP/Azure VPC), no document or extraction data ever leaves your environment. The codebase doesn't phone home, send telemetry, or contact our servers. The only network calls are between the components you control (engine → Postgres → object storage → Ollama).

Hosted tier (docuextract.ai)

Documents you upload are stored encrypted at rest in our object storage. They are processed by our OCR cascade and inference layer running in our hosted infrastructure (Modal for GPU inference, Fly.io for everything else). Extraction results are stored alongside the documents.
  • Documents are retained for the duration of your subscription.
  • You can delete any document or batch at any time through the dashboard or API. Deletion removes the source file and all derived data from our storage.
  • Backups follow a 30-day rotation. Deletions propagate to backups on the next rotation.
  • We do not use your documents to train our base models. Period.
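
To make the backup bullet above concrete, here is a rough sketch of the arithmetic. It assumes "30-day rotation" means each backup is discarded 30 days after it was taken, so the newest backup that could still hold a deleted document is the one taken on the day of deletion. The function name is hypothetical and not part of the DocuExtract API.

```python
from datetime import date, timedelta

BACKUP_ROTATION_DAYS = 30  # rotation period stated in the policy above


def latest_backup_purge_date(deleted_on: date) -> date:
    """Latest date on which a backup holding the deleted document can exist.

    Assumption (ours, for illustration): the newest backup containing the
    document was taken on the deletion date and is discarded
    BACKUP_ROTATION_DAYS days later.
    """
    return deleted_on + timedelta(days=BACKUP_ROTATION_DAYS)


print(latest_backup_purge_date(date(2025, 3, 1)))  # 2025-03-31
```

Under that assumption, a document deleted on 1 March is gone from backups by 31 March at the latest.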

Inference provider

By default, the hosted tier sends document images and OCR'd text to a self-hosted Qwen2.5-VL model running on Modal's GPU infrastructure. The model is open-weights and runs in our isolated container, so no third-party LLM provider is in the path unless you enable one. The optional "paid-API fallback" (Anthropic / OpenAI vision) is off by default and activates only for the templates where you explicitly opt in.
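
The opt-in rule above can be pictured as a small routing check. This is an illustrative sketch only, not our actual implementation: the `Template` class, its `paid_api_fallback` flag, and the provider labels are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass

# Hypothetical provider labels, used only in this sketch.
SELF_HOSTED = "self-hosted-qwen"
PAID_API = "paid-api-fallback"


@dataclass(frozen=True)
class Template:
    name: str
    # Off unless the user explicitly opts in for this template.
    paid_api_fallback: bool = False


def allowed_providers(template: Template) -> list[str]:
    """Return the providers a document for this template may be sent to."""
    providers = [SELF_HOSTED]
    if template.paid_api_fallback:
        providers.append(PAID_API)
    return providers


print(allowed_providers(Template("invoice")))
print(allowed_providers(Template("receipt", paid_api_fallback=True)))
```

The point of the design is that the third-party path is unreachable for a template unless its flag was set deliberately; there is no account-wide switch in this sketch that could enable it by accident.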

Telemetry

The web app uses Vercel Analytics (anonymous page-view metrics, no personal data). The engine logs operational metrics (extraction count, error rate, OCR-tier distribution) for capacity planning — but never document content or extracted values.
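
As a rough picture of what "operational metrics" means here, the sketch below records only counters. The class and field names are hypothetical and the tier labels are made up; the thing to notice is that `record` accepts no document content or extracted values, so none can end up in the metrics.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class OperationalMetrics:
    """Counters only: no field can hold document text or extracted values."""

    extraction_count: int = 0
    error_count: int = 0
    ocr_tier_counts: Counter = field(default_factory=Counter)

    def record(self, ocr_tier: str, ok: bool) -> None:
        # Takes only the tier label and a success flag, never the document.
        self.extraction_count += 1
        if not ok:
            self.error_count += 1
        self.ocr_tier_counts[ocr_tier] += 1

    @property
    def error_rate(self) -> float:
        if self.extraction_count == 0:
            return 0.0
        return self.error_count / self.extraction_count


metrics = OperationalMetrics()
metrics.record("tier-1", ok=True)
metrics.record("tier-2", ok=False)
print(metrics.extraction_count, metrics.error_rate, dict(metrics.ocr_tier_counts))
```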

Account data

Your email, billing details, and account preferences are stored in our Neon Postgres database. We use Stripe for billing — they handle card data; we don't see it.

Questions

Reach out to hello@inspireailab.com with any privacy questions. For managed-deployment customers, custom DPAs and compliance attestations (SOC 2, HIPAA, GDPR) are available — see consulting contact.