OrderPier

Our number, not a vendor number

Per-field extraction accuracy, published.

Marketing pages love to quote “95–99% accuracy.” That's a vendor benchmark, and results on your documents may differ. This page shows our own results from our own eval harness, on a variable test set of synthetic POs covering 8+ layout variants. The exact harness and the test set are in the public repo.

Benchmark run pending

We haven't published a number yet. The eval harness lives in src/moa/eval.py; run python -m moa.cli eval --count 200 and copy the resulting report to web/public/eval/latest.json, and this page will render it.

We'd rather show a blank than an invented number.

Methodology

The harness lives in src/moa/eval.py. Scoring rules:

  • IDs (PO #, dates): exact match after normalization (case + whitespace; dates parsed to ISO).
  • Customer name, ship-to: rapidfuzz ratio ≥ 0.85 for each field.
  • Line items: SKU-exact pass first, then Hungarian assignment over description fuzz with floor 0.70.
  • Per-sub-field: SKU exact, quantity/price numeric round-to-2, description fuzzy ≥ 0.80, unit exact.
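The rules above can be sketched roughly as follows. This is an illustration, not the code in src/moa/eval.py: every function name, the dict keys, and the accepted date formats are assumptions made for the example. To keep it dependency-free, difflib.SequenceMatcher stands in for the rapidfuzz ratio, and a brute-force search over permutations stands in for Hungarian assignment. The brute-force search finds the same optimal one-to-one pairing but is only practical for a handful of line items.

```python
from datetime import datetime
from difflib import SequenceMatcher
from itertools import permutations


def normalize_id(value: str) -> str:
    """Collapse whitespace runs and ignore case before exact comparison."""
    return " ".join(value.split()).casefold()


def normalize_date(value: str) -> str:
    """Parse a date to ISO 8601. The accepted formats are an assumption."""
    for fmt in ("%Y-%m-%d", "%m/%d/%Y"):
        try:
            return datetime.strptime(value.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    raise ValueError(f"unrecognized date: {value!r}")


def fuzzy(a: str, b: str) -> float:
    """Similarity in [0, 1]. Stand-in for a rapidfuzz ratio."""
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()


def match_line_items(truth, pred, floor=0.70):
    """Pair items: exact SKU first, then best description assignment."""
    pairs, used_pred = [], set()
    # Pass 1: exact SKU match (empty SKUs never match).
    for ti, t in enumerate(truth):
        for pi, p in enumerate(pred):
            if pi not in used_pred and t["sku"] and t["sku"] == p["sku"]:
                pairs.append((ti, pi))
                used_pred.add(pi)
                break
    matched_truth = {ti for ti, _ in pairs}
    rest_t = [i for i in range(len(truth)) if i not in matched_truth]
    rest_p = [i for i in range(len(pred)) if i not in used_pred]

    # Pass 2: optimal one-to-one assignment over description similarity.
    # Brute force here; a Hungarian solver gives the same pairing faster.
    swap = len(rest_t) > len(rest_p)
    small, large = (rest_p, rest_t) if swap else (rest_t, rest_p)

    def sim(s, l):
        ti, pi = (l, s) if swap else (s, l)
        return fuzzy(truth[ti]["description"], pred[pi]["description"])

    best = max(
        permutations(large, len(small)),
        key=lambda perm: sum(sim(s, l) for s, l in zip(small, perm)),
    )
    for s, l in zip(small, best):
        if sim(s, l) >= floor:  # pairs below the floor stay unmatched
            pairs.append((l, s) if swap else (s, l))
    return sorted(pairs)


def score_item(t, p):
    """Per-sub-field pass/fail for one matched pair of line items."""
    return {
        "sku": t["sku"] == p["sku"],
        "quantity": round(float(t["quantity"]), 2) == round(float(p["quantity"]), 2),
        "unit_price": round(float(t["unit_price"]), 2) == round(float(p["unit_price"]), 2),
        "description": fuzzy(t["description"], p["description"]) >= 0.80,
        "unit": t["unit"] == p["unit"],
    }
```

In this sketch, a predicted line item that lost its SKU can still be paired with the right ground-truth row through its description. It then fails only the SKU sub-field instead of being counted as a missing row.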

Each PO renders in one of 8+ layout variants (clean table, nested header, dense, scan-style, multi-currency, sparse free-text body, handwritten annotations, multi-page long). To reproduce, clone the repo and run the CLI — the test set is checked in to eval/ground_truth.

Repo · Security · Sub-processors

Run the harness on your own POs.

Email us a few anonymized samples — we'll score them with our model on your test set, and you'll see the per-field number.