When L = D + pQ Is Not Enough: Exactness Repair for BN254 Wrappers
TL;DR
-
A deferred-quotient row of the form L = D + pQ over \mathbb{F}_r is not, by itself, an integer-division statement.
-
If Q is left unbounded, that row can be algebraically vacuous in the outer field.
-
For one concrete BN254/M31 wrapper family, quotient range binding + canonical residue binding + no-wrap bounds restore exact rowwise \mathbb{Z} semantics.
-
In a bounded shared-input Halo2 comparison, the repaired construction remains consistently faster and much smaller in RSS than an explicit bit-decomposition baseline.
The issue
The core surface is the deferred-quotient row equation
By itself, that field equation is not an integer statement. If p is invertible in the outer field and Q is left unbounded, the prover can set
and satisfy the row relation in \mathbb{F}_r without establishing Euclidean division semantics over \mathbb{Z}.
This post is about that exactness gap and a lightweight algebraic repair for one concrete BN254/M31 wrapper family.
Scope
The claim here is:
-
the deferred-quotient equality can be algebraically vacuous at the outer-field level;
-
to read L = D + pQ as an exact integer identity, explicit missing premises are needed;
-
for a concrete BN254/M31 wrapper family, those premises can be enforced with a lightweight algebraic repair.
The benchmark claim follows the same boundary:
-
this is a bounded instantiated-family comparison,
-
not a claim of full-domain semantic equivalence across every wrapper realization.
The theorem-level statement
For the field equality to lift to an exact identity in \mathbb{Z}, the note requires:
-
quotient range binding,
-
canonical residue binding,
-
no-wrap bounds.
Under those conditions, a satisfied row relation in \mathbb{F}_r lifts to
and on designated residue rows the residue/quotient pair becomes the unique Euclidean remainder/quotient.
So the manuscript proves a restricted-model rowwise exactness theorem for a concrete BN254 family set with one-limb 31/66-bit quotient classes. It is a theorem-level result about algebraic exactness. Backend transcript extraction, PCS-level soundness, Fiat-Shamir closure, and wrapper-level parity are separate questions and not the subject of this note.
What the repair actually adds
At a high level, the repair adds three things to the deferred-quotient rows:
-
Quotient binding. The quotient witness is forced into the intended small canonical class, so the generic outer-field witness Q = p^{-1}(L-D) is no longer admissible.
-
Canonical residue binding. On designated residue rows, the residue is bound to its canonical representative, so the row is no longer compatible with arbitrary field-level aliases of the same congruence class.
-
No-wrap control. Family-wise no-wrap bounds ensure that once both sides are lifted to integers, the satisfied field equality is forced to be the exact integer identity rather than a wrapped congruence.
Concretely, the benchmarked repair construction binds both the quotient cells and the residue-style cells with a lookup-heavy pattern rather than a full explicit bit decomposition. In the current instantiation, the quotient side is split into small lookup-bound chunks instead of a wide boolean decomposition, and the residue side replaces a 31-boolean style binding with a small number of table lookups plus a non-equality style gate such as (x-p)\nu_x = 1; the no-wrap audit then justifies the integer lift. That is the main reason it is interesting to compare against the A_secure baseline in the first place.
Benchmark evidence
After the manuscript draft, I ran a follow-up Halo2 benchmark package comparing:
-
A_secure: an explicit bit-decomposition baseline, -
B_note: the algebraic repair construction.
Both arms run through the same Halo2 proof path, the same transcript family, the same fixed-k contract, and the same shared input profiles.
Shared-input repeated reruns at k = 17
For the parity-facing reruns at k_\mathrm{run}=17, I measured:
-
\mathrm{scale}=1,4,8,16,32,
-
both
boundaryandstandard, -
\mathrm{repeats}=3.
Across those points:
-
B_noteprove-time ratio stayed in the range0.821to0.884, -
B_noteverify-time ratio stayed in the range0.631to0.716, -
B_noteRSS stayed at about0.285to0.286ofA_secure.
So within the current fixed-k, shared-input benchmark regime, B_note shows a fairly consistent proving-time and memory advantage.
First k jump and post-jump repeated check
I also checked where the current public harness first stops fitting inside k = 17.
For this benchmark contract:
-
\mathrm{scale}=4519 still fits in k_\min=17,
-
\mathrm{scale}=4520 is the first point that forces k_\min=18.
This is a property of the current harness.
Then I ran the first post-jump repeated validation at:
-
\mathrm{scale}=4520,
-
k_\mathrm{run}=18,
-
\mathrm{input\_profile}=\mathrm{boundary},
-
\mathrm{repeats}=3.
That produced:
-
A_secure prove ~32.37 s, -
B_note prove ~26.71 s, -
B/A prove ~0.825, -
A RSS ~6.30 GB, -
B RSS ~1.79 GB, -
B/A RSS ~0.285.
So the advantage persists beyond the first k jump as well, at least for this minimum repeated post-jump check.
Links
- manuscript PDF : https://github.com/junbyjun1238/zk2-bench-b1-lite/releases/download/v0.1.0/deferred-quotient-vacuity-in-bn254-wrappers-a-repair-note-on-algebraic-exactness.pdf
- public benchmark package : https://github.com/junbyjun1238/zk2-bench-b1-lite/tree/v0.1.0/post_manuscript_fullbench
- release snapshot: https://github.com/junbyjun1238/zk2-bench-b1-lite/releases/tag/v0.1.0
- Zenodo archive (DOI) : https://doi.org/10.5281/zenodo.18910795