When L = D + pQ Is Not Enough: Exactness Repair for BN254 Wrappers

junbyjun1238 · March 8, 2026, 2:44pm

When L = D + pQ Is Not Enough: Exactness Repair for BN254 Wrappers

TL;DR

A deferred-quotient row of the form L = D + pQ over \mathbb{F}_r is not, by itself, an integer-division statement.
If Q is left unbounded, that row can be algebraically vacuous in the outer field.
For one concrete BN254/M31 wrapper family, quotient range binding + canonical residue binding + no-wrap bounds restore exact rowwise \mathbb{Z} semantics.
In a bounded shared-input Halo2 comparison, the repaired construction remains consistently faster and much smaller in RSS than an explicit bit-decomposition baseline.

The issue

The core surface is the deferred-quotient row equation

L = D + pQ \quad \text{over } \mathbb{F}_r.

By itself, that field equation is not an integer statement. If p is invertible in the outer field and Q is left unbounded, the prover can set

Q = p^{-1}(L-D)

and satisfy the row relation in \mathbb{F}_r without establishing Euclidean division semantics over \mathbb{Z}.

This post is about that exactness gap and a lightweight algebraic repair for one concrete BN254/M31 wrapper family.

Scope

The claim here is:

the deferred-quotient equality can be algebraically vacuous at the outer-field level;
to read L = D + pQ as an exact integer identity, explicit missing premises are needed;
for a concrete BN254/M31 wrapper family, those premises can be enforced with a lightweight algebraic repair.

The benchmark claim follows the same boundary:

this is a bounded instantiated-family comparison,
not a claim of full-domain semantic equivalence across every wrapper realization.

The theorem-level statement

For the field equality to lift to an exact identity in \mathbb{Z}, the note requires:

quotient range binding,
canonical residue binding,
no-wrap bounds.

Under those conditions, a satisfied row relation in \mathbb{F}_r lifts to

\widehat{L} = \widehat{D} + p\widehat{Q} \quad \text{in } \mathbb{Z},

and on designated residue rows the residue/quotient pair becomes the unique Euclidean remainder/quotient.

So the manuscript proves a restricted-model rowwise exactness theorem for a concrete BN254 family set with one-limb 31/66-bit quotient classes. It is a theorem-level result about algebraic exactness. Backend transcript extraction, PCS-level soundness, Fiat-Shamir closure, and wrapper-level parity are separate questions and not the subject of this note.

What the repair actually adds

At a high level, the repair adds three things to the deferred-quotient rows:

Quotient binding. The quotient witness is forced into the intended small canonical class, so the generic outer-field witness Q = p^{-1}(L-D) is no longer admissible.
Canonical residue binding. On designated residue rows, the residue is bound to its canonical representative, so the row is no longer compatible with arbitrary field-level aliases of the same congruence class.
No-wrap control. Family-wise no-wrap bounds ensure that once both sides are lifted to integers, the satisfied field equality is forced to be the exact integer identity rather than a wrapped congruence.

Concretely, the benchmarked repair construction binds both the quotient cells and the residue-style cells with a lookup-heavy pattern rather than a full explicit bit decomposition. In the current instantiation, the quotient side is split into small lookup-bound chunks instead of a wide boolean decomposition, and the residue side replaces a 31-boolean style binding with a small number of table lookups plus a non-equality style gate such as (x-p)\nu_x = 1; the no-wrap audit then justifies the integer lift. That is the main reason it is interesting to compare against the A_secure baseline in the first place.

Benchmark evidence

After the manuscript draft, I ran a follow-up Halo2 benchmark package comparing:

A_secure: an explicit bit-decomposition baseline,
B_note: the algebraic repair construction.

Both arms run through the same Halo2 proof path, the same transcript family, the same fixed-k contract, and the same shared input profiles.

Shared-input repeated reruns at `k = 17`

For the parity-facing reruns at k_\mathrm{run}=17, I measured:

\mathrm{scale}=1,4,8,16,32,
both boundary and standard,
\mathrm{repeats}=3.

Across those points:

B_note prove-time ratio stayed in the range 0.821 to 0.884,
B_note verify-time ratio stayed in the range 0.631 to 0.716,
B_note RSS stayed at about 0.285 to 0.286 of A_secure.

So within the current fixed-k, shared-input benchmark regime, B_note shows a fairly consistent proving-time and memory advantage.

First `k` jump and post-jump repeated check

I also checked where the current public harness first stops fitting inside k = 17.

For this benchmark contract:

\mathrm{scale}=4519 still fits in k_\min=17,
\mathrm{scale}=4520 is the first point that forces k_\min=18.

This is a property of the current harness.

Then I ran the first post-jump repeated validation at:

\mathrm{scale}=4520,
k_\mathrm{run}=18,
\mathrm{input\_profile}=\mathrm{boundary},
\mathrm{repeats}=3.

That produced:

A_secure prove ~32.37 s,
B_note prove ~26.71 s,
B/A prove ~0.825,
A RSS ~6.30 GB,
B RSS ~1.79 GB,
B/A RSS ~0.285.

So the advantage persists beyond the first k jump as well, at least for this minimum repeated post-jump check.

Links

manuscript PDF : https://github.com/junbyjun1238/zk2-bench-b1-lite/releases/download/v0.1.0/deferred-quotient-vacuity-in-bn254-wrappers-a-repair-note-on-algebraic-exactness.pdf
public benchmark package : https://github.com/junbyjun1238/zk2-bench-b1-lite/tree/v0.1.0/post_manuscript_fullbench
release snapshot: https://github.com/junbyjun1238/zk2-bench-b1-lite/releases/tag/v0.1.0
Zenodo archive (DOI) : https://doi.org/10.5281/zenodo.18910795

When L = D + pQ Is Not Enough: Exactness Repair for BN254 Wrappers

When L = D + pQ Is Not Enough: Exactness Repair for BN254 Wrappers

TL;DR

The issue

Scope

The theorem-level statement

What the repair actually adds

Benchmark evidence

Shared-input repeated reruns at k = 17

First k jump and post-jump repeated check

Links

Shared-input repeated reruns at `k = 17`

First `k` jump and post-jump repeated check