DOCUMENT SURFACE · documents.doloop.io

Your AI invented a number
while it was reading.

The doloop surface for pulling data out of documents. It runs a donkey (a deterministic check that returns the same result for the same input) that extracts the tables from a PDF and ties every cell back to the spot on the page it came from. Nothing is guessed, and you can prove the source of each number.

What it catches

When an AI reads a document and gets bored, it fills the gaps. The donkey does not.

INVENTED

Numbers that were never there

Ask a model to read a statement and it will confidently return figures that do not appear in the document. It fabricates to finish rather than admit it could not find them.

MISREAD

The wrong cell

A total lands in the wrong row, a column shifts, a footnote merges into a value. The number is real but attached to the wrong thing, which is just as wrong.

UNPROVABLE

No way to check it

Even a correct extraction is useless to an audit team if you cannot point at where it came from. No provenance means no sign-off.

The donkey here

WYSIWYD. Deterministic: the same PDF gives the same numbers, every time. Live today.

WYSIWYD

What you see is what you download

Why: a model will invent a number to finish the job; a script cannot.
How: it detects the table structure on the page and extracts each cell mechanically, then ties every value back to its place on the page. 100% reproducible across 90 extractions, 0 errors on 2,332 cells. Try it on your own PDF.
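One way to picture "ties every value back to its place on the page" is as one record per cell. The sketch below is illustrative only: `SourcedCell`, `summarize`, and the rule that a run passes only when every cell has a source are assumptions made for the example, not the doloop API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SourcedCell:
    """One extracted value plus where it was found (hypothetical shape)."""
    text: str
    page: Optional[int] = None  # 1-based page number; None if unsourced
    box: Optional[int] = None   # id of the box on that page; None if unsourced

    @property
    def sourced(self) -> bool:
        # A cell counts as sourced only if both page and box are known.
        return self.page is not None and self.box is not None


def summarize(cells: list[SourcedCell]) -> dict:
    """Count sourced cells; assumes PASS only when every cell has a source."""
    sourced = sum(1 for c in cells if c.sourced)
    verdict = "PASS" if sourced == len(cells) else "FAIL"
    return {"verdict": verdict, "cells": len(cells), "sourced": sourced}


cells = [
    SourcedCell("1,240.00", page=1, box=84),
    SourcedCell("99.20", page=1, box=91),
    SourcedCell("1,339.20", page=1, box=97),
]
report = summarize(cells)
```

A value with no page or box, such as `SourcedCell("42.00")`, would flip the verdict to FAIL under this assumed rule.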

See it work

An invoice in, a sourced table out. Each value carries its source.

$ check invoice.pdf

  verdict: PASS      cells: 162      sourced: 162 / 162

  every value traced to a box on the page:
    "Subtotal  1,240.00"   page 1, box 84    ✓
    "Tax         99.20"    page 1, box 91    ✓
    "Total     1,339.20"   page 1, box 97    ✓   (= subtotal + tax)

  run it again on the same PDF: byte-identical output.

Run it twice and the output matches byte for byte, with every value traced to a box on the page.
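Both claims in the sample can be checked independently. The sketch below shows one way to do that, assuming the sourced table is available as a list of rows. The row shape and the helper names `total_matches` and `digest` are hypothetical, and hashing a canonical JSON serialization is one possible way to compare runs, not necessarily how the donkey does it.

```python
import hashlib
import json
from decimal import Decimal


def to_decimal(text: str) -> Decimal:
    """Parse a figure such as "1,240.00" exactly, without float rounding."""
    return Decimal(text.replace(",", ""))


def total_matches(subtotal: str, tax: str, total: str) -> bool:
    """Check that total = subtotal + tax using exact decimal arithmetic."""
    return to_decimal(subtotal) + to_decimal(tax) == to_decimal(total)


def digest(rows: list[dict]) -> str:
    """Hash a canonical serialization so two runs can be compared byte for byte."""
    payload = json.dumps(rows, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


# Rows mirror the sample output above; the dict layout is an assumption.
rows = [
    {"label": "Subtotal", "value": "1,240.00", "page": 1, "box": 84},
    {"label": "Tax", "value": "99.20", "page": 1, "box": 91},
    {"label": "Total", "value": "1,339.20", "page": 1, "box": 97},
]
first_run = digest(rows)
second_run = digest([dict(r) for r in rows])  # stand-in for a second extraction
arithmetic_ok = total_matches("1,240.00", "99.20", "1,339.20")
```

Using `Decimal` rather than `float` keeps the sum exact, so a one-cent mismatch is caught instead of being hidden by rounding.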

Two ways to use it

Call the donkey on a file, or run the surface inside the doloop machine: the service that wraps your own AI, runs the check on every output, and keeps state across runs. The difference is state.

CALL IT DIRECTLY

The donkey, standalone

Send a PDF, get the sourced table back. Stateless and simple: one file in, one sourced table out, nothing remembered. Try it free right now, no sign-up.

THROUGH THE MACHINE

Stateful and remembered

Connect your own AI to the doloop machine in document mode and the donkey runs on every extraction. Your AI reads, the donkey ties every number to the page and rejects anything it cannot source, and only verified data ships. The machine learns your document templates.
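The "rejects anything it cannot source" step can be pictured as a gate between what the AI returns and what ships. This is a minimal sketch under stated assumptions: `gate` and `page_index` are hypothetical names, and an exact string lookup stands in for whatever matching the donkey actually performs.

```python
def gate(ai_values: list[str], page_index: dict[str, tuple[int, int]]):
    """Split AI-extracted values into shipped and rejected.

    page_index maps a value string to its (page, box) location on the page.
    Anything the index cannot locate is rejected rather than passed through.
    """
    shipped, rejected = [], []
    for value in ai_values:
        location = page_index.get(value)
        if location is None:
            rejected.append(value)
        else:
            shipped.append((value, location))
    return shipped, rejected


# Locations taken from the sample output above; "1,400.00" is an invented figure.
page_index = {"1,240.00": (1, 84), "99.20": (1, 91), "1,339.20": (1, 97)}
ai_values = ["1,240.00", "99.20", "1,339.20", "1,400.00"]
shipped, rejected = gate(ai_values, page_index)
```

Here the three figures found on the page ship with their locations attached, and the invented one lands in `rejected`.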

Want this on your document pipeline? Talk to us, or see the other surfaces.