Templates
A template is a reusable schema describing what fields to extract from a class of documents. Build one, run many similar documents through it.
What a template is
A template is a list of fields, each with a name, a type, and optional hints. When you extract a document with a template, the engine produces one value per field (or null when a field can't be found).
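As a rough illustration of that shape, here is a minimal Python sketch. The class names `Field` and `Template` and the helper `shape_result` are hypothetical, chosen for this example only; they are not the product's API. The sketch shows the "one value per field, null when not found" contract, with Python's `None` standing in for null.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class Field:
    """One field in a template (hypothetical model, not the product's API)."""
    name: str
    type: str                          # e.g. "text", "date", "currency"
    anchor: Optional[str] = None       # optional hint
    description: Optional[str] = None  # optional hint


@dataclass
class Template:
    name: str
    fields: list[Field] = field(default_factory=list)


def shape_result(template: Template, found: dict[str, Any]) -> dict[str, Any]:
    """Return exactly one entry per template field; missing fields become None."""
    return {f.name: found.get(f.name) for f in template.fields}


invoice = Template(
    name="invoice",
    fields=[
        Field("invoice_number", "text", anchor="Invoice No."),
        Field("total", "currency", anchor="Total"),
    ],
)

# Only "total" was located in this imaginary document.
result = shape_result(invoice, {"total": "12.50"})
```

Here `result` has a key for every field in the template, and `invoice_number` maps to `None` because nothing was found for it.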
Templates are versioned. Editing a template bumps its version; past batches keep referring to the version they ran against, so corrections to a template don't invalidate historical extraction results.
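One way to picture the versioning behavior is an append-only history where each edit creates a new immutable version and a batch records only the version number it ran against. The sketch below assumes that model; `TemplateVersion` and `TemplateHistory` are hypothetical names, and the real storage scheme may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TemplateVersion:
    """An immutable snapshot of a template (hypothetical model)."""
    version: int
    field_names: tuple[str, ...]


class TemplateHistory:
    """Append-only list of versions; editing never mutates an old one."""

    def __init__(self, field_names):
        self._versions = [TemplateVersion(1, tuple(field_names))]

    @property
    def latest(self) -> TemplateVersion:
        return self._versions[-1]

    def edit(self, field_names) -> TemplateVersion:
        """Record an edit as a new version with the next version number."""
        new = TemplateVersion(self.latest.version + 1, tuple(field_names))
        self._versions.append(new)
        return new

    def get(self, version: int) -> TemplateVersion:
        return self._versions[version - 1]


history = TemplateHistory(["total"])
batch_version = history.latest.version   # a batch remembers the version it used
history.edit(["total", "tax"])           # a later edit bumps the version
old_snapshot = history.get(batch_version)
```

After the edit, `old_snapshot` still lists only `total`, so results from the earlier batch remain interpretable against the template as it was when the batch ran.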
Two ways to create one
Starter invoice template
On /dashboard/templates, click Create starter invoice template. You get 6 fields:
- invoice_number (text, anchor: "Invoice No.")
- invoice_date (date, anchor: "Date")
- vendor (text, anchor: "Vendor")
- total (currency, anchor: "Total")
- subtotal (currency, anchor: "Subtotal")
- tax (currency, anchor: "Tax")
Best when your documents look like a standard US/Western invoice. Edit any field afterward to fit your exact format.
Visual builder
Click Build template. Upload a sample document; the page renders, and you click-and-drag to draw bounding boxes around fields you want extracted. Name each one, pick its type, save.
The bounding boxes are stored as authoring metadata. Extraction uses anchor-based lookup by default, which is more robust to layout drift across documents, but the boxes give you a visual record of where each field lives on a typical document.
Field types
| Type | Use for | Coercion |
|---|---|---|
| text | Free-form strings | Returns the verbatim string |
| number | Counts, quantities, IDs that are numeric | Coerced to Decimal |
| date | Dates in any format (ISO, US, EU, long-form) | Coerced to ISO date (YYYY-MM-DD) |
| currency | Money amounts | Coerced to Decimal in canonical form |
| enum | Fixed set of allowed values | Constrained to enum_values |
| table | Repeating row data (line items) | Returns array of objects |
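To make the coercion column concrete, here is a small Python sketch of what date and currency coercion could look like. It is not the engine's implementation: the function names, the list of accepted date formats, and the choice to return `None` on failure are all assumptions made for illustration. Ambiguous dates such as 03/05/2024 are read as US month/day order here simply because that format is tried first.

```python
import re
from datetime import datetime
from decimal import Decimal, InvalidOperation
from typing import Optional

# Assumed set of input formats; a real engine likely supports more.
DATE_FORMATS = ["%Y-%m-%d", "%m/%d/%Y", "%d.%m.%Y", "%B %d, %Y"]


def coerce_date(raw: str) -> Optional[str]:
    """Try each known format and return an ISO date string (YYYY-MM-DD)."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(raw.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    return None


def coerce_currency(raw: str) -> Optional[Decimal]:
    """Drop currency symbols and thousands separators, then parse as Decimal."""
    cleaned = re.sub(r"[^\d.\-]", "", raw)
    try:
        return Decimal(cleaned)
    except InvalidOperation:
        return None


iso_from_us = coerce_date("03/05/2024")
iso_from_long = coerce_date("March 5, 2024")
amount = coerce_currency("$1,234.50")
```

Using `Decimal` rather than `float` keeps amounts like 1234.50 exact, which matters when totals are later compared or summed.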
Anchors
An anchor is a label that appears next to (or above) the value you want. Examples: "Invoice No.", "Total:", "Vendor:". The engine searches for the anchor text in the document, then grabs the adjacent value.
When to use an anchor: whenever a label exists. It's the most reliable extraction method: almost free (no LLM) and very accurate when the label is consistent across documents.
When to skip the anchor: if the value isn't preceded by a label (e.g. a vendor name at the top of the document with no "Vendor:" prefix). The extraction layer falls back to the LLM, using your description field as a hint.
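The anchor-then-fallback flow can be sketched in a few lines of Python. This is a simplified illustration, not the engine's actual lookup: `find_by_anchor`, `extract_field`, and the `llm_fallback` callable are hypothetical, the anchor search only looks at the rest of the same line, and falling back to the LLM when a configured anchor is not found is an assumption.

```python
import re
from typing import Callable, Optional


def find_by_anchor(text: str, anchor: str) -> Optional[str]:
    """Find the anchor label and return the rest of that line as the value."""
    # Anchor text, an optional colon, then capture up to the end of the line.
    pattern = re.escape(anchor) + r"\s*:?\s*(.+)"
    match = re.search(pattern, text)
    return match.group(1).strip() if match else None


def extract_field(
    text: str,
    anchor: Optional[str],
    description: Optional[str],
    llm_fallback: Callable[[str, Optional[str]], Optional[str]],
) -> Optional[str]:
    """Use the anchor when one is set; otherwise (or on a miss) ask the LLM."""
    if anchor:
        value = find_by_anchor(text, anchor)
        if value is not None:
            return value
    # Assumption: the description is passed to the LLM extractor as a hint.
    return llm_fallback(text, description)


document = "Acme Corp\nInvoice No.: INV-001\nTotal: 42.00"


def fake_llm(text: str, description: Optional[str]) -> Optional[str]:
    """Stand-in for the LLM extractor; returns the first line of the document."""
    return text.splitlines()[0]


invoice_number = extract_field(document, "Invoice No.", None, fake_llm)
vendor = extract_field(document, None, "Issuing company at the top of the invoice", fake_llm)
```

A plain substring search like this would also match "Total" inside "Subtotal", which is one reason a production lookup needs more care than this sketch shows.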
Descriptions
A description is a natural-language hint sent to the LLM extractor. Example: description: "Issuing company at the top of the invoice". The LLM uses this to disambiguate when multiple candidate values exist.
Confidence threshold
Each template has a confidence_threshold (default 0.75). Fields extracted with a confidence below this threshold are flagged needs_review = true and appear in the Review queue.
Set higher (e.g. 0.90) for high-stakes workflows where a wrong value costs more than an extra review action. Set lower (e.g. 0.60) for low-stakes bulk processing where occasional errors are acceptable.
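The flagging rule amounts to a single comparison per field. The following Python sketch assumes each extracted field carries a value and a confidence score between 0 and 1; the function name `flag_for_review` and the result layout are hypothetical, while `confidence_threshold` and `needs_review` come from the text above.

```python
from typing import Any


def flag_for_review(
    extracted: dict[str, tuple[Any, float]],
    confidence_threshold: float = 0.75,
) -> dict[str, dict[str, Any]]:
    """Mark each field needs_review when its confidence is below the threshold."""
    return {
        name: {
            "value": value,
            "confidence": confidence,
            "needs_review": confidence < confidence_threshold,
        }
        for name, (value, confidence) in extracted.items()
    }


extracted = {"total": ("42.00", 0.91), "vendor": ("Acme Corp", 0.62)}

default_run = flag_for_review(extracted)                             # threshold 0.75
strict_run = flag_for_review(extracted, confidence_threshold=0.95)   # high-stakes
lenient_run = flag_for_review(extracted, confidence_threshold=0.60)  # bulk processing
```

With these sample scores, the default threshold sends only `vendor` to review, the stricter one sends both fields, and the lenient one sends neither.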