Gas per Secure Bit: a normalized benchmark for PQ signatures on EVM
Happy holidays everyone.
Following up on the AA / ERC-4337 / PQ signatures discussion (original thread listed under Links below), I ended up isolating one missing piece that keeps coming up implicitly:
we don’t have a normalized unit for comparing signature schemes at different security levels on the EVM.
Most comparisons use “gas per verify”, but that silently mixes:
- different security targets (e.g., ~128-bit ECDSA vs Cat3/Cat5 PQ schemes),
- different verification surfaces (EOA vs ERC-1271 / AA),
- and sometimes different benchmark scopes (pure verify vs full handleOps pipelines).
That makes it hard to answer basic engineering questions like:
“Is ML-DSA-65 viable on EVM relative to Falcon, under explicit assumptions?”
What I built
A small benchmark lab + dataset with explicit provenance and explicit security denominators:
Repo: https://github.com/pipavlo82/gas-per-secure-bit (gas per secure bit benchmarking for PQ signatures and VRF)
Core idea:
`gas_per_secure_bit = gas_verify / security_bits`
I intentionally report two denominators, because both viewpoints are useful:
Metric A — Baseline normalization (128-bit baseline)
This answers: “What is the cost per 128-bit baseline unit?”
`gas_per_128b = gas_verify / 128`
This is not claiming every scheme is 128-bit secure; it’s just a budgeting/normalization tool.
Metric B — Security-equivalent bits (declared convention)
This answers: “How costly is each ‘security bit’ under a declared normalization convention?”
`gas_per_sec_equiv_bit = gas_verify / security_equiv_bits`
For signatures I currently use the following explicit convention:
| Scheme | NIST category (where applicable) | security_equiv_bits |
|---|---|---|
| ECDSA (secp256k1) | — | 128 |
| ML-DSA-65 (FIPS-204, Cat 3) | 3 | 192 |
| Falcon-1024 (Cat 5) | 5 | 256 |
I use a simple mapping Cat{1,3,5} → {128,192,256} as a declared normalization convention (open to better community conventions).
Note: security_equiv_bits is a declared normalization convention for comparability. It is not a security proof and not a NIST-provided “bits” value.
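A minimal sketch of the two metrics under the convention above. The function and dictionary names are illustrative and not taken from the repo; the gas figure is the ML-DSA-65 PreA snapshot reported in this post.

```python
# Declared normalization convention from the table above: NIST category -> bits.
# This is a bookkeeping convention, not a security proof.
CATEGORY_TO_BITS = {1: 128, 3: 192, 5: 256}

BASELINE_BITS = 128


def gas_per_128b(gas_verify: int) -> float:
    """Metric A: cost per 128-bit baseline unit."""
    return gas_verify / BASELINE_BITS


def gas_per_sec_equiv_bit(gas_verify: int, security_equiv_bits: int) -> float:
    """Metric B: cost per declared security-equivalent bit."""
    if security_equiv_bits <= 0:
        raise ValueError("security_equiv_bits must be positive")
    return gas_verify / security_equiv_bits


# ML-DSA-65 is Cat 3, so the declared denominator is 192.
bits = CATEGORY_TO_BITS[3]
print(bits, gas_per_128b(1_499_354), gas_per_sec_equiv_bit(1_499_354, bits))
```

Keeping the denominator an explicit argument makes it hard to divide by the wrong convention silently.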
Provenance & reproducibility
All numbers are currently single-run gas snapshots: no averaging and no “best-of-N” selection.
Each row carries full provenance (repo, commit, bench_name, chain_profile, and a notes field), so others can rerun it and compare.
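To make the provenance fields concrete, here is a hedged sketch of what one snapshot row could look like as a typed record with a CSV round trip. The five provenance field names come from this post; `GasSnapshot`, `gas_verify` and `security_equiv_bits` as column names, and the example values marked as placeholders, are assumptions and may not match the actual schema of data/results.csv.

```python
import csv
import io
from dataclasses import dataclass, asdict, fields


@dataclass(frozen=True)
class GasSnapshot:
    # Provenance fields named in the post.
    repo: str
    commit: str
    bench_name: str
    chain_profile: str
    notes: str
    # Measurement and denominator; these two column names are assumptions.
    gas_verify: int
    security_equiv_bits: int


def to_csv(rows: list[GasSnapshot]) -> str:
    """Serialize snapshots with a header row, one line per single-run snapshot."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=[f.name for f in fields(GasSnapshot)])
    writer.writeheader()
    for row in rows:
        writer.writerow(asdict(row))
    return buf.getvalue()


def from_csv(text: str) -> list[GasSnapshot]:
    """Parse the CSV back, restoring the integer columns."""
    out = []
    for rec in csv.DictReader(io.StringIO(text)):
        rec["gas_verify"] = int(rec["gas_verify"])
        rec["security_equiv_bits"] = int(rec["security_equiv_bits"])
        out.append(GasSnapshot(**rec))
    return out


example = GasSnapshot(
    repo="pipavlo82/gas-per-secure-bit",
    commit="<commit-sha>",         # placeholder, not a real commit
    bench_name="ecdsa_ecrecover",  # hypothetical bench name
    chain_profile="<profile>",     # placeholder
    notes="classical baseline",
    gas_verify=21_126,
    security_equiv_bits=128,
)
```

Storing the denominator next to the gas value means either metric can be recomputed from a row without consulting a separate convention table.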
A few key rows (baseline normalization — divide by 128)
| Scheme / bench | Gas | gas_per_128b | Notes |
|---|---|---|---|
| ECDSA ecrecover | 21,126 | 165 | classical baseline; not PQ-secure (Shor) |
| Falcon getUserOpHash | 218,333 | 1,706 | small AA primitive |
| ML-DSA-65 PreA (isolated hot-path) | 1,499,354 | 11,714 | optimized compute core |
| Falcon full verify | 10,336,055 | 80,750 | PQ full verify |
| ML-DSA-65 verify POC | 68,901,612 | 538,294 | end-to-end POC |
Security-equivalent normalization (divide by security_equiv_bits)
| Scheme / bench | Gas | security_equiv_bits | gas_per_sec_equiv_bit |
|---|---|---|---|
| Falcon getUserOpHash | 218,333 | 256 | 853 |
| ML-DSA-65 PreA | 1,499,354 | 192 | 7,809 |
| Falcon full verify | 10,336,055 | 256 | 40,375 |
| ML-DSA-65 verify POC | 68,901,612 | 192 | 358,863 |
What stood out to me:
- ML-DSA-65 PreA lands at ~7,809 gas / security-equivalent bit (Cat3-equivalent)
- Falcon-1024 full verify lands at ~40,375 gas / security-equivalent bit (Cat5-equivalent)
That’s roughly a 5.2× difference for those specific benches.
This is not “ML-DSA beats Falcon overall”; it’s a narrower claim:
some ML-DSA verification surfaces can be made much more EVM-friendly if you avoid recomputing heavy public structure on-chain.
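The security-equivalent figures and the ~5.2× ratio can be rechecked directly from the gas snapshots and the declared denominators. This is only a sketch of the arithmetic; the variable names are illustrative.

```python
# (bench, gas, security_equiv_bits) copied from the security-equivalent table above.
ROWS = [
    ("Falcon getUserOpHash", 218_333, 256),
    ("ML-DSA-65 PreA", 1_499_354, 192),
    ("Falcon full verify", 10_336_055, 256),
    ("ML-DSA-65 verify POC", 68_901_612, 192),
]

per_bit = {name: gas / bits for name, gas, bits in ROWS}

for name, value in per_bit.items():
    print(f"{name:24s} {value:>12,.0f} gas / sec-equiv bit")

ratio = per_bit["Falcon full verify"] / per_bit["ML-DSA-65 PreA"]
print(f"Falcon full verify vs ML-DSA-65 PreA: {ratio:.1f}x")
```

Note that the raw gas ratio between those two benches is about 6.9×; the normalization scales it by 192/256, which is where the ~5.2× comes from.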
What “PreA” means (why it changes the picture)
In standard ML-DSA verification, a large portion of the cost is effectively:
ExpandA + converting the public matrix into the NTT domain.
The “PreA” path isolates the hot arithmetic core (A·z − c·t₁ in the NTT domain) by accepting a precomputed A_ntt.
In my harness, A_ntt is derived from the public key seed (rho) and then bound via CommitA, so a caller cannot substitute a different matrix.
This is an explicit engineering design point (especially in AA contexts): move large public structure off-chain, but keep it cryptographically bound.
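A rough Python sketch of the idea, not the contract code: check the supplied matrix against a commitment, then run the coefficient-wise core. The commitment construction here (a hash over rho followed by the coefficients) is a hypothetical stand-in for CommitA, and `hashlib.sha3_256` stands in for keccak256, which uses different padding and yields different digests. All function names are illustrative, and `t1_ntt` is assumed to already carry the 2^d scaling that ML-DSA applies to t₁.

```python
import hashlib

Q = 8380417          # ML-DSA modulus q = 2^23 - 2^13 + 1
K, L, N = 6, 5, 256  # ML-DSA-65 shape (k x l) and degree; not enforced, so toy shapes also work


def commit_a(rho: bytes, a_ntt: list[list[list[int]]]) -> bytes:
    """Hypothetical CommitA: hash rho, then every coefficient as 4 big-endian bytes."""
    h = hashlib.sha3_256()  # stand-in for keccak256
    h.update(rho)
    for row in a_ntt:
        for poly in row:
            for coeff in poly:
                h.update(coeff.to_bytes(4, "big"))
    return h.digest()


def compute_w_ntt(a_ntt, z_ntt, c_ntt, t1_ntt):
    """w_hat[i] = sum_j A_hat[i][j] * z_hat[j] - c_hat * t1_hat[i], coefficient-wise mod Q."""
    w = []
    for i in range(len(a_ntt)):
        acc = [0] * len(c_ntt)
        for j in range(len(z_ntt)):
            for n in range(len(acc)):
                acc[n] = (acc[n] + a_ntt[i][j][n] * z_ntt[j][n]) % Q
        w.append([(acc[n] - c_ntt[n] * t1_ntt[i][n]) % Q for n in range(len(acc))])
    return w


def verify_core(rho, a_ntt, expected_commit, z_ntt, c_ntt, t1_ntt):
    """Reject a substituted matrix before doing any arithmetic."""
    if commit_a(rho, a_ntt) != expected_commit:
        raise ValueError("A_ntt does not match CommitA")
    return compute_w_ntt(a_ntt, z_ntt, c_ntt, t1_ntt)
```

Because products in the NTT domain are coefficient-wise, the core is a plain multiply-accumulate loop; the expensive part that PreA skips is producing A_ntt in the first place.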
Rough breakdown (current harness):
- Full compute_w with on-chain ExpandA+NTT(A): ~64.8M gas
- Isolated matrix multiply core (PreA): ~1.5M gas
Implementation:
- ML-DSA-65 verifier: pipavlo82/ml-dsa-65-ethereum-verification (GitHub)
- Benchmark lab: https://github.com/pipavlo82/gas-per-secure-bit
Why this matters for AA / ERC-7913
In AA, the unit you care about is rarely “verify one signature in isolation”.
You care about stable ABI surfaces and comparability across candidates.
ERC-7913 provides a generic verification interface.
My working assumption: if we want PQ adoption to be engineered (not guessed), we need:
- a shared benchmark schema,
- explicit security denominators,
- and comparable surfaces (pure verify vs AA pipeline).
Open questions / feedback welcome
1) Hash/XOF wiring on EVM
For EVM implementations: do we want (a) strict FIPS SHAKE wiring, (b) Keccak-based non-conformant variants, or (c) dual-mode implementations with explicit labeling in the dataset?
2) Is the dual-metric approach reasonable?
Baseline normalization is useful for budgeting; security-equivalent bits are useful for honest efficiency per security unit. Any objections to reporting both?
3) PreA standardization options
What’s the least-bad approach in an AA context?
- calldata (large, but stateless),
- storage per key,
- precompile,
- hybrid with CommitA binding?
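For the calldata and storage options, a back-of-the-envelope sizing sketch may help frame the trade-off. Everything here is an assumption rather than a measurement from the harness: the 23-bit and 32-bit coefficient encodings are hypothetical packings, the calldata figure is an upper bound at 16 gas per nonzero byte (EIP-2028) that ignores any calldata floor pricing such as EIP-7623, and the storage figures use 22,100 gas per fresh slot write and 2,100 gas per cold read (EIP-2929).

```python
import math

K, L, N = 6, 5, 256  # ML-DSA-65 matrix shape and coefficients per polynomial

# Assumed gas constants; check them against the target chain profile.
CALLDATA_GAS_PER_NONZERO_BYTE = 16  # EIP-2028
SSTORE_NEW_SLOT_GAS = 22_100        # 20,000 set cost + 2,100 cold access
SLOAD_COLD_GAS = 2_100              # EIP-2929


def a_ntt_bytes(bits_per_coeff: int) -> int:
    """Size of A_ntt for a given (hypothetical) coefficient packing."""
    return math.ceil(K * L * N * bits_per_coeff / 8)


def calldata_gas_upper_bound(num_bytes: int) -> int:
    """Per-call cost if every byte were nonzero."""
    return num_bytes * CALLDATA_GAS_PER_NONZERO_BYTE


def storage_write_gas(num_bytes: int) -> int:
    """One-time cost of writing the matrix into fresh 32-byte slots."""
    return math.ceil(num_bytes / 32) * SSTORE_NEW_SLOT_GAS


def storage_read_gas(num_bytes: int) -> int:
    """Per-verify cost of cold-reading every slot back."""
    return math.ceil(num_bytes / 32) * SLOAD_COLD_GAS


for bits in (23, 32):
    size = a_ntt_bytes(bits)
    print(bits, size, calldata_gas_upper_bound(size),
          storage_write_gas(size), storage_read_gas(size))
```

Under these assumptions, a 23-bit packing gives 22,080 bytes: at most about 353k gas of calldata per call, or about 15.2M gas to store once plus about 1.45M gas in cold reads per verify. That suggests storage per key is not free at verify time either, though actual numbers depend on the packing and the chain profile.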
Reproducibility quick start
    git clone https://github.com/pipavlo82/gas-per-secure-bit
    cd gas-per-secure-bit
    RESET_DATA=0 MLDSA_REF="feature/mldsa-ntt-opt-phase12-erc7913-packedA" \
      ./scripts/run_vendor_mldsa.sh
    RESET_DATA=0 ./scripts/run_ecdsa.sh
    QA_REF=main RESET_DATA=0 ./scripts/run_vendor_quantumaccount.sh
    tail -n 20 data/results.csv
Thanks for reading — I’m very open to corrections on conventions, better threat-model framing, and suggestions on which schemes/surfaces to add next.
Links
- Benchmark repo: https://github.com/pipavlo82/gas-per-secure-bit
- ML-DSA-65 implementation: pipavlo82/ml-dsa-65-ethereum-verification (GitHub)
- Chart (raw SVG): docs/gas_per_secure_bit.svg on the main branch of pipavlo82/gas-per-secure-bit
- ERC-7913: Signature Verifiers
- Original AA/PQ discussion: “The road to Post-Quantum Ethereum transaction is paved with Account Abstraction (AA)”