Gas/Security-Bit for PQ Signatures on EVM: Dataset + Methodology

Gas per Secure Bit: a normalized benchmark for PQ signatures on EVM

Happy holidays everyone.

Following up on the AA / ERC-4337 / PQ signatures discussion in this thread:

I ended up isolating one missing piece that keeps coming up implicitly:

We don’t have a normalized unit to compare different signature schemes at different security levels on EVM.

Most comparisons use “gas per verify”, but that silently mixes:

  • different security targets (e.g., ~128-bit ECDSA vs Cat3/Cat5 PQ schemes),

  • different verification surfaces (EOA vs ERC-1271 / AA),

  • and sometimes different benchmark scopes (pure verify vs full handleOps pipelines).

That makes it hard to answer basic engineering questions like:
“Is ML-DSA-65 viable on EVM relative to Falcon, under explicit assumptions?”


What I built

A small benchmark lab + dataset with explicit provenance and explicit security denominators:

Repo: https://github.com/pipavlo82/gas-per-secure-bit (gas per secure bit benchmarking for PQ signatures and VRF)

Core idea:

gas_per_secure_bit = gas_verify / security_bits

I intentionally report two denominators, because both viewpoints are useful:

Metric A — Baseline normalization (128-bit baseline)

This answers: “What is the cost per 128-bit baseline unit?”

gas_per_128b = gas_verify / 128

This is not claiming every scheme is 128-bit secure; it’s just a budgeting/normalization tool.
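
For concreteness, a minimal sketch of Metric A (the gas figures are the single-run snapshots quoted later in this post):

```python
# Metric A: normalize raw verification gas against a fixed 128-bit baseline.
BASELINE_BITS = 128

def gas_per_128b(gas_verify: int) -> float:
    # A budgeting/normalization tool, not a claim that the scheme is 128-bit secure.
    return gas_verify / BASELINE_BITS

# Single-run snapshots quoted later in this post:
print(gas_per_128b(21_126))     # ECDSA ecrecover          -> ~165
print(gas_per_128b(1_499_354))  # ML-DSA-65 PreA hot path  -> ~11,714
```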

Metric B — Security-equivalent bits (declared convention)

This answers: “How costly is each ‘security bit’ under a declared normalization convention?”

gas_per_sec_equiv_bit = gas_verify / security_equiv_bits

For signatures I currently use the following explicit convention:

| Scheme | NIST category (where applicable) | security_equiv_bits |
|---|---|---|
| ECDSA (secp256k1) | n/a | 128 |
| ML-DSA-65 (FIPS-204) | 3 | 192 |
| Falcon-1024 | 5 | 256 |

I use a simple mapping Cat{1,3,5} → {128,192,256} as a declared normalization convention (open to better community conventions).

Note: security_equiv_bits is a declared normalization convention for comparability. It is not a security proof and not a NIST-provided “bits” value.
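
A minimal sketch of Metric B under that declared convention (the Cat→bits mapping below is exactly the normalization convention above, nothing more):

```python
# Declared normalization convention: NIST category -> security_equiv_bits.
# A comparability convention, not a security proof.
CATEGORY_TO_BITS = {1: 128, 3: 192, 5: 256}

def gas_per_sec_equiv_bit(gas_verify: int, nist_category: int) -> float:
    return gas_verify / CATEGORY_TO_BITS[nist_category]

# Single-run snapshots quoted later in this post:
print(gas_per_sec_equiv_bit(1_499_354, 3))   # ML-DSA-65 PreA     -> ~7,809
print(gas_per_sec_equiv_bit(10_336_055, 5))  # Falcon full verify -> ~40,375
```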

Category sources:


Provenance & reproducibility

All numbers are currently single-run gas snapshots (no averaging) with full provenance:
repo, commit, bench_name, chain_profile, and a notes field.

No hidden averaging, no “best-of-N” selection — just reproducible snapshots others can rerun.
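
To make the provenance fields concrete, here is an illustrative sketch of what one snapshot record carries conceptually; the field names are my assumptions, and the authoritative schema is whatever data/results.csv in the repo actually contains.

```python
from dataclasses import dataclass

@dataclass
class GasSnapshot:
    # Illustrative record shape only; the real column names live in data/results.csv.
    repo: str           # repo of the benchmarked implementation
    commit: str         # exact commit the gas number was measured at
    bench_name: str     # which verification surface was measured
    chain_profile: str  # EVM/chain configuration assumed by the run
    gas: int            # single-run gas snapshot (no averaging, no best-of-N)
    notes: str          # free-form caveats (e.g. "isolated hot path")

row = GasSnapshot(
    repo="pipavlo82/gas-per-secure-bit",
    commit="<commit-hash>",           # placeholder
    bench_name="falcon_full_verify",  # hypothetical bench name
    chain_profile="<chain-profile>",  # placeholder
    gas=10_336_055,
    notes="PQ full verify",
)
```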


A few key rows (baseline normalization — divide by 128)

| Scheme / bench | Gas | gas_per_128b | Notes |
|---|---|---|---|
| ECDSA ecrecover | 21,126 | 165 | classical baseline; not PQ-secure (Shor) |
| Falcon getUserOpHash | 218,333 | 1,705 | small AA primitive |
| ML-DSA-65 PreA (isolated hot path) | 1,499,354 | 11,714 | optimized compute core |
| Falcon full verify | 10,336,055 | 80,751 | PQ full verify |
| ML-DSA-65 verify POC | 68,901,612 | 538,294 | end-to-end POC |

Security-equivalent normalization (divide by security_equiv_bits)

| Scheme / bench | Gas | security_equiv_bits | gas_per_sec_equiv_bit |
|---|---|---|---|
| Falcon getUserOpHash | 218,333 | 256 | 853 |
| ML-DSA-65 PreA | 1,499,354 | 192 | 7,809 |
| Falcon full verify | 10,336,055 | 256 | 40,375 |
| ML-DSA-65 verify POC | 68,901,612 | 192 | 358,863 |
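
One thing worth noticing before the takeaways: the same raw gas number reads very differently under the two denominators, which is exactly why both are reported. A tiny check using the Falcon full-verify row:

```python
falcon_full_verify_gas = 10_336_055

print(falcon_full_verify_gas / 128)  # Metric A (128-bit baseline)  -> ~80,751
print(falcon_full_verify_gas / 256)  # Metric B (Cat 5 convention)  -> ~40,375
```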

What stood out to me:

  • ML-DSA-65 PreA lands at ~7,809 gas / security-equivalent bit (Cat3-equivalent)

  • Falcon-1024 full verify lands at ~40,375 gas / security-equivalent bit (Cat5-equivalent)

That’s roughly a 5.2× difference for those specific benches.

This is not “ML-DSA beats Falcon overall”; it’s a narrower claim:
some ML-DSA verification surfaces can be made much more EVM-friendly if you avoid recomputing heavy public structure on-chain.


What “PreA” means (why it changes the picture)

In standard ML-DSA verification, a large portion of the cost is effectively:
ExpandA + converting the public matrix into the NTT domain.

The “PreA” path isolates the hot arithmetic core (A·z − c·t₁ in the NTT domain) by accepting A_ntt precomputed, and binding it with CommitA to prevent matrix substitution.

In my harness, A_ntt is derived off-chain from the public-key seed (rho), and CommitA then binds the verifier to exactly that matrix (a conceptual sketch of the binding follows after the breakdown below).

This is an explicit engineering design point (especially in AA contexts): move large public structure off-chain, but keep it cryptographically bound.

Rough breakdown (current harness):

  • Full compute_w with on-chain ExpandA+NTT(A): ~64.8M gas

  • Isolated matrix multiply core (PreA): ~1.5M gas
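
To make the binding step concrete, here is a minimal conceptual sketch (not the harness code; the hash and encoding are illustrative stand-ins, and an on-chain verifier would use keccak256 over an agreed encoding of A_ntt):

```python
import hashlib

def commit_a(a_ntt_bytes: bytes, rho: bytes) -> bytes:
    # Illustrative binding: commit to the precomputed NTT-domain matrix together with
    # the public-key seed rho, so a verifier that receives A_ntt off-chain can check
    # that it matches the key it claims to belong to.
    # (stdlib sha3_256 is a stand-in here; the EVM would use keccak256.)
    return hashlib.sha3_256(rho + a_ntt_bytes).digest()

def verify_with_pre_a(a_ntt_bytes: bytes, rho: bytes, expected_commit: bytes) -> bool:
    # Step 1: cheap binding check, which is what prevents matrix substitution.
    if commit_a(a_ntt_bytes, rho) != expected_commit:
        return False
    # Step 2: run only the hot arithmetic core (A·z − c·t1 in the NTT domain)
    # against the now-authenticated A_ntt (omitted here).
    return True
```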

Implementation:


Why this matters for AA / ERC-7913

In AA, the unit you care about is rarely “verify one signature in isolation”.
You care about stable ABI surfaces and comparability across candidates.

ERC-7913 provides a generic verification interface.

My working assumption: if we want PQ adoption to be engineered (not guessed), we need:

  • a shared benchmark schema,

  • explicit security denominators,

  • and comparable surfaces (pure verify vs AA pipeline).


Open questions / feedback welcome

1) Hash/XOF wiring on EVM
For EVM implementations: do we want (a) strict FIPS SHAKE wiring, (b) Keccak-based non-conformant variants, or (c) dual-mode implementations with explicit labeling in the dataset?

2) Is the dual-metric approach reasonable?
Baseline normalization is useful for budgeting; security-equivalent bits are useful for honest efficiency per security unit. Any objections to reporting both?

3) PreA standardization options
What’s the least-bad approach in AA context?

  • calldata (large, but stateless),

  • storage per key,

  • precompile,

  • hybrid with CommitA binding?


Reproducibility quick start

git clone https://github.com/pipavlo82/gas-per-secure-bit
cd gas-per-secure-bit

RESET_DATA=0 MLDSA_REF="feature/mldsa-ntt-opt-phase12-erc7913-packedA" \
  ./scripts/run_vendor_mldsa.sh

RESET_DATA=0 ./scripts/run_ecdsa.sh

QA_REF=main RESET_DATA=0 ./scripts/run_vendor_quantumaccount.sh

tail -n 20 data/results.csv
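
After the runs, something like the sketch below can turn the snapshots into both metrics; the column names (bench_name, gas, nist_category) are assumptions about data/results.csv, so adjust them to the actual header.

```python
import csv

CATEGORY_TO_BITS = {1: 128, 3: 192, 5: 256}

with open("data/results.csv", newline="") as f:
    for row in csv.DictReader(f):
        # Column names here are assumptions; check the real CSV header.
        gas = int(row["gas"])
        cat = int(row["nist_category"]) if row.get("nist_category") else None
        gas_per_128b = gas / 128
        gas_per_bit = gas / CATEGORY_TO_BITS[cat] if cat else None
        print(row["bench_name"], gas_per_128b, gas_per_bit)
```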


Thanks for reading — I’m very open to corrections on conventions, better threat-model framing, and suggestions on which schemes/surfaces to add next.

