What if post-quantum Ethereum doesn’t need signatures at all?

TL;DR: Current PQC migration plans assume we must verify post-quantum signatures — either on-chain (kilobytes per tx) or inside ZK circuits (millions of constraints). We present an alternative: prove authorization semantics directly in ZK, without any signature object. Result: 4,024 R1CS constraints, 128-byte proofs, 52 ms proving time. The construction is proof-system agnostic (Groth16, PLONK, STARKs all work) and deployable today as an AA validator module — no protocol changes required.


Motivation: The PQC data wall

The community has been actively exploring PQC migration paths, and the core tension is well-known:

| Scheme | Sig Size | Public Key | Total per TX |
|---|---|---|---|
| ML-DSA-44 (Level 2) | 2,420 B | 1,312 B | 3,732 B |
| ML-DSA-65 (Level 3) | 3,309 B | 1,952 B | 5,261 B |
| ML-DSA-87 (Level 5) | 4,627 B | 2,592 B | 7,219 B |
| SLH-DSA-128f | 17,088 B | 32 B | 17,120 B |
| FN-DSA-512 | ~666 B | 897 B | 1,563 B |
| Ed25519 (classical) | 64 B | 32 B | 96 B |

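Relative to the 96 B Ed25519 baseline, that’s roughly a 39–75x increase in authorization data per transaction for ML-DSA (about 16x for FN-DSA-512 and 178x for SLH-DSA-128f). In rollup architectures where calldata/blob space is explicitly priced, this is a first-order scalability problem.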

The “obvious” solution and why it’s expensive

The natural response is: verify PQ signatures inside ZK circuits, post only the succinct proof on-chain.

The problem: lattice-based signature verification requires NTTs over degree-256 polynomial rings in $\mathbb{Z}_q$ ($q = 2^{23} - 2^{13} + 1$), emulated over a ~254-bit proof-system field. Structural lower bounds:

| In-circuit verification | R1CS constraints | Dominant cost |
|---|---|---|
| ML-DSA-44 verify | ≥ 2M | 4×4 NTTs + non-native mod. arith. |
| ML-DSA-65 verify | ≥ 4M | 6×5 NTTs + non-native mod. arith. |
| FN-DSA-512 verify | ≥ 1M | FFT + Gram-Schmidt + mod. arith. |
| SLH-DSA verify | ≥ 5M | WOTS+ chains + Merkle trees |
| ECDSA verify (classical) | ~1.5M | scalar mul. + mod. inverse |

Even with optimized gadgets, we’re looking at millions of constraints just to prove “this signature is valid.”
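To make the non-native cost concrete, here is a minimal Python sketch of how one multiplication modulo $q$ is commonly emulated over a larger proof-system field: the prover supplies a quotient and remainder as witnesses, and the circuit checks one native field equation plus range checks. The function names are illustrative, the BN254 scalar field is assumed as the proof-system field, and real gadgets differ in how they implement the range checks.

```python
# ML-DSA modulus and the BN254 scalar field order (assumed proof-system field).
Q = 2**23 - 2**13 + 1
P = 21888242871839275222246405745257275088548364400416034343698204186575808495617

def witness_mod_q_mul(a: int, b: int) -> tuple[int, int]:
    """Prover side: quotient k and remainder r such that a*b = k*Q + r."""
    return divmod(a * b, Q)

def circuit_accepts(a: int, b: int, k: int, r: int) -> bool:
    """Plain-Python view of the constraints for one mod-q multiplication."""
    # One native field equation. a*b < Q^2 is far below P, so no wraparound.
    field_eq = (a * b - (k * Q + r)) % P == 0
    # Range checks; in R1CS these become bit decompositions of k and r.
    in_range = 0 <= r < Q and 0 <= k < Q
    return field_eq and in_range

a, b = 8_000_000, 7_654_321
k, r = witness_mod_q_mul(a, b)
print(circuit_accepts(a, b, k, r), r == (a * b) % Q)
```

A radix-2 NTT of degree 256 has 128 × 8 = 1,024 butterfly multiplications, and each one pays for range checks of this kind, which is where most of the constraint budget goes.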

Key observation: authorization ≠ signatures

Here’s the thing: at the consensus layer, blockchains don’t actually require verification of a specific signature object. What consensus requires is assurance that a transaction was authorized by the correct entity.

Signatures are an implementation artifact for expressing authorization — not authorization itself. We’ve been conflating the two.

ZK-ACE: identity-centric authorization

We present ZK-ACE (Zero-Knowledge Authorization for Cryptographic Entities), which takes this observation to its logical conclusion:

Don’t verify signatures in ZK. Don’t compress signatures. Eliminate signature objects from the authorization path entirely.

Instead, the chain stores a compact identity commitment (32 bytes):

$$ID_{com} = H(REV \,\|\, salt \,\|\, domain)$$

where REV is a 256-bit identity root derived from a deterministic identity derivation primitive (DIDP). Each transaction carries a ZK proof attesting:

  1. (C1) Commitment consistency: Prover knows a preimage of $ID_{com}$
  2. (C2) Derivation correctness: A target-binding hash is consistent with deterministic key derivation under the identity root
  3. (C3) Authorization binding: The identity root has authorized this specific TxHash
  4. (C4) Anti-replay: Nonce commitment or nullifier is correctly derived
  5. (C5) Domain separation: All bindings use the declared chain/domain tag

The entire circuit is 5 Poseidon hash invocations + equality constraints. No lattice arithmetic. No signature verification logic. No non-native field emulation.
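The relation (C1)–(C5) can be sketched in plain Python, checked in the clear rather than inside a proof. This is only an illustration: SHA-256 stands in for Poseidon, and the input layout of each hash, the witness fields (`ctx`, `index`, `nonce`), and the public-input names (`target`, `replay_tag`) are assumptions made here, not the paper's definitions. The sketch does keep the same shape of five hash calls plus equality checks.

```python
import hashlib

def H(*parts: bytes) -> bytes:
    """Stand-in for Poseidon: SHA-256 over length-prefixed inputs."""
    h = hashlib.sha256()
    for p in parts:
        h.update(len(p).to_bytes(4, "big") + p)
    return h.digest()

def make_statement(wit: dict, tx_hash: bytes, domain: bytes) -> dict:
    """Prover side: compute hypothetical public inputs from the private witness."""
    key = H(wit["rev"], wit["ctx"], wit["index"], domain)
    auth = H(wit["rev"], wit["salt"], wit["ctx"], wit["index"],
             tx_hash, domain, wit["nonce"])
    return {
        "id_com": H(wit["rev"], wit["salt"], domain),
        "tx_hash": tx_hash,
        "domain": domain,
        "target": H(key),
        "replay_tag": H(auth, wit["nonce"]),
    }

def relation_holds(pub: dict, wit: dict) -> bool:
    """The statement a proof would attest to, evaluated without zero knowledge."""
    d = pub["domain"]  # (C5): every hash below uses the declared domain tag
    c1 = H(wit["rev"], wit["salt"], d) == pub["id_com"]       # (C1)
    key = H(wit["rev"], wit["ctx"], wit["index"], d)          # (C2), first hash
    c2 = H(key) == pub["target"]                              # (C2), second hash
    auth = H(wit["rev"], wit["salt"], wit["ctx"], wit["index"],
             pub["tx_hash"], d, wit["nonce"])                 # (C3)
    c4 = H(auth, wit["nonce"]) == pub["replay_tag"]           # (C4)
    return c1 and c2 and c4

wit = {"rev": b"\x01" * 32, "salt": b"\x02" * 32,
       "ctx": b"eth/0", "index": b"\x00", "nonce": b"\x07"}
pub = make_statement(wit, tx_hash=b"\xaa" * 32, domain=b"chain-1")
print(relation_holds(pub, wit))
```

In this sketch, changing `tx_hash` or `domain` in the public inputs breaks at least one equality, which is the binding property the constraints are meant to express.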

Benchmarks (reference implementation)

Implementation: arkworks + Groth16 over BN254, Poseidon ($t=3$, $\alpha=17$, 8 full + 57 partial rounds).

Circuit size:

| Constraint | Inputs | Hash calls | R1CS |
|---|---|---|---|
| (C1) Commitment consistency | 3 | 1 | 805 |
| (C2) Derivation correctness | 4+1 | 2 | 1,200 |
| (C3) Authorization binding | 7 | 1 | 1,615 |
| (C4) Replay prevention | 2 | 1 | 400 |
| (C5) Domain sep. + enforce_equal | | | 4 |
| Total | | 5 | 4,024 |

Both replay modes (nonce-registry and nullifier-set) produce identical constraint counts.

Performance (single-threaded, Apple M3 Pro, Criterion.rs, 100 samples):

| Operation | Median | 95% CI |
|---|---|---|
| Trusted setup (one-time) | 45.6 ms | [45.4, 45.8] ms |
| Prove (per transaction) | 52.3 ms | [51.5, 53.4] ms |
| Verify (per transaction) | 604 μs | [600, 608] μs |

Proof size:

| Encoding | Proof | Public inputs | Total auth data |
|---|---|---|---|
| Compressed Groth16 | 128 B | 160 B (5 × 32 B) | 288 B |

Compression vs. PQ signatures:

| Scheme | PQ sig+pk | ZK-ACE | Reduction |
|---|---|---|---|
| ML-DSA-44 (Level 2) | 3,732 B | 288 B | 13x (92.3%) |
| ML-DSA-65 (Level 3) | 5,261 B | 288 B | 18x (94.5%) |
| ML-DSA-87 (Level 5) | 7,219 B | 288 B | 25x (96.0%) |
| SLH-DSA-128f | 17,120 B | 288 B | 59x (98.3%) |
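The reduction factors follow directly from the byte counts quoted in this post. A short Python sketch of the arithmetic (the function and variable names are illustrative):

```python
# Compressed Groth16 proof plus five 32-byte public inputs.
ZK_ACE_BYTES = 128 + 5 * 32

# (signature bytes, public key bytes), as listed in the post.
PQ_SCHEMES = {
    "ML-DSA-44": (2420, 1312),
    "ML-DSA-65": (3309, 1952),
    "ML-DSA-87": (4627, 2592),
    "SLH-DSA-128f": (17088, 32),
}

def reduction(sig: int, pk: int, zk: int = ZK_ACE_BYTES) -> tuple[int, float, float]:
    """Return (total PQ bytes, size factor, percent saved) against zk bytes."""
    total = sig + pk
    return total, total / zk, 100 * (1 - zk / total)

for name, (sig, pk) in PQ_SCHEMES.items():
    total, factor, pct = reduction(sig, pk)
    print(f"{name:13s} {total:6d} B -> {ZK_ACE_BYTES} B  {factor:5.1f}x  ({pct:.1f}% smaller)")
```

Swapping in a different proof size (for example a PLONK proof) only changes `ZK_ACE_BYTES`; the public inputs stay at 160 B.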

Constraint comparison (the core result):

| Approach | R1CS constraints |
|---|---|
| ZK-ACE (this work) | 4,024 |
| In-circuit ML-DSA-44 verify | ≥ 2,000,000 |
| In-circuit ECDSA verify | ~1,500,000 |

That’s a ~500x constraint reduction — not from optimizing signature verification, but from not doing it at all.

Deployment: AA validator module

ZK-ACE is designed as an ERC-4337 validator module. In an AA wallet:

  • The account validation logic invokes ZK-ACE verification instead of checking a classical or PQ signature
  • Proof generation happens client-side (~52 ms)
  • The bundler transports proof + public inputs (untrusted, learns nothing)
  • Verification takes ~604 μs per proof in the native benchmark above (on-chain gas cost is not reported here)

This means no protocol-level changes are required. ZK-ACE can be deployed on existing infrastructure.
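As a rough model of what the account validation logic would check, here is a Python sketch of the flow with a nullifier-style replay set. It is not the ERC-4337 interface or the reference implementation: `Account`, `PublicInputs`, `validate_user_op`, and the field names are hypothetical, and `verify_proof` is a stub standing in for the on-chain proof verifier.

```python
from dataclasses import dataclass, field
from typing import Callable

@dataclass(frozen=True)
class PublicInputs:
    id_com: bytes      # identity commitment the proof speaks about
    tx_hash: bytes     # transaction the proof authorizes
    domain: bytes      # chain/domain tag
    target: bytes      # target-binding hash (hypothetical name)
    replay_tag: bytes  # nonce commitment or nullifier (hypothetical name)

@dataclass
class Account:
    id_com: bytes
    domain: bytes
    seen_tags: set = field(default_factory=set)

def validate_user_op(account: Account, tx_hash: bytes, proof: bytes,
                     pub: PublicInputs,
                     verify_proof: Callable[[bytes, PublicInputs], bool]) -> bool:
    """Accept only if the proof is bound to this account, tx, and domain, and is fresh."""
    if pub.id_com != account.id_com or pub.domain != account.domain:
        return False                       # wrong identity or wrong chain
    if pub.tx_hash != tx_hash:
        return False                       # proof was made for a different tx
    if pub.replay_tag in account.seen_tags:
        return False                       # replay
    if not verify_proof(proof, pub):
        return False                       # proof-system verification failed
    account.seen_tags.add(pub.replay_tag)  # record the tag only after success
    return True

acct = Account(id_com=b"\x01" * 32, domain=b"chain-1")
pub = PublicInputs(b"\x01" * 32, b"\xaa" * 32, b"chain-1", b"\x02" * 32, b"\x03" * 32)
always_ok = lambda proof, p: True          # stub verifier, for illustration only
print(validate_user_op(acct, b"\xaa" * 32, b"proof-bytes", pub, always_ok))
print(validate_user_op(acct, b"\xaa" * 32, b"proof-bytes", pub, always_ok))
```

The second call is rejected because the replay tag was recorded by the first. A nonce-registry mode would replace the set with a per-account counter check.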

Proof-system agnostic by design

An important design property: ZK-ACE is a protocol-level authorization model, not a proof-system-specific construction. The five constraints (C1)–(C5) are stated over abstract hash evaluations and equality checks. They can be instantiated with:

| Proof system | Setup | Proof size | Trade-off |
|---|---|---|---|
| Groth16 (reference impl.) | Trusted (per-circuit) | 128 B | Smallest proof, fastest verify |
| PLONK / KZG | Universal (one-time) | ~400–600 B | No per-circuit setup |
| STARKs / FRI | Transparent (none) | ~40–100 KB | No trusted setup, plausibly PQ-secure |
| Bulletproofs / IPA | Transparent | ~700 B | No setup, larger verify cost |

The benchmarks above use Groth16 because it gives the tightest numbers, but the protocol doesn’t depend on it. In particular, a STARK instantiation would make the entire authorization pipeline plausibly post-quantum at the proof layer as well — no trusted setup, no pairing assumptions, hash-based soundness only. The identity commitments are proof-system-agnostic (they’re just hash outputs), so migrating from one proof system to another does not require identity rotation or re-registration.

The security reductions in the paper are stated generically in terms of knowledge-soundness advantage $\text{Adv}^{ks}$ and are compatible with any backend satisfying completeness, knowledge soundness, zero-knowledge, and public-input binding.

Assumed primitive: DIDP

ZK-ACE assumes a Deterministic Identity Derivation Primitive (DIDP) as a black box — any framework providing:

  • Deterministic key derivation from a high-entropy root
  • Context isolation across derivation paths
  • Identity-root recovery hardness

This is not tied to any specific construction. A simple HKDF(root, context) satisfies the interface. We provide ACE-GF as an instantiation; any KDF with domain separation works.
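A minimal sketch of the HKDF(root, context) option, using HKDF-SHA256 as defined in RFC 5869 and built only from the Python standard library. The fixed salt string and the `derive_key` name are assumptions for illustration. This is not ACE-GF, and it is not meant as a vetted DIDP.

```python
import hashlib
import hmac

HASH_LEN = 32  # SHA-256 output size in bytes

def hkdf_extract(salt: bytes, ikm: bytes) -> bytes:
    """HKDF-Extract: PRK = HMAC(salt, input keying material)."""
    return hmac.new(salt or b"\x00" * HASH_LEN, ikm, hashlib.sha256).digest()

def hkdf_expand(prk: bytes, info: bytes, length: int = 32) -> bytes:
    """HKDF-Expand: T(i) = HMAC(PRK, T(i-1) || info || i), concatenated and truncated."""
    if length > 255 * HASH_LEN:
        raise ValueError("requested output too long for HKDF")
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]

def derive_key(root: bytes, context: str) -> bytes:
    """Deterministic, context-isolated 32-byte key from a high-entropy root."""
    prk = hkdf_extract(b"didp-sketch-v1", root)  # salt is a hypothetical fixed label
    return hkdf_expand(prk, context.encode("utf-8"))

root = bytes(range(32))  # placeholder root; a real one must be high-entropy
print(derive_key(root, "evm/account/0").hex())
print(derive_key(root, "evm/account/1").hex())
```

The same root and context always give the same key, while different contexts give unrelated keys. Identity-root recovery hardness then rests on the one-wayness of HMAC-SHA256 and on the entropy of the root.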

Security

Four game-based security definitions with reduction-based proofs under standard assumptions:

  • Authorization soundness → reduces to knowledge soundness + collision resistance + DIDP recovery hardness
  • Replay resistance → reduces to authorization soundness + verifier enforcement
  • Substitution resistance → reduces to public-input binding of the proof system
  • Cross-domain separation → reduces to collision resistance + public-input binding

Full proofs in the paper.

What this is NOT

To be explicit:

  • Not ZK-verification of PQ signatures (we don’t verify any signature inside the circuit)
  • Not signature compression (we eliminate signatures, not shrink them)
  • Not a new signature scheme
  • Not dependent on any specific identity framework

It’s a change in what we prove: from “this signature is valid” to “this identity authorized this transaction.”

Relation to existing discussions

This work connects to the ongoing PQC migration discourse:

These approaches all preserve the signature-centric model. ZK-ACE asks: what if we don’t?


Paper: ZK-ACE: Identity-Centric Zero-Knowledge Authorization for Post-Quantum Blockchain Systems

Reference implementation: github.com/ya-xyz/zk-ace

Brilliant write-up and a very elegant paradigm shift away from signature-centric verification. The 4,024 R1CS constraint reduction is massive for AA validator modules.

I want to specifically touch on your point regarding the STARKs / FRI instantiation being the ideal path for a fully transparent, plausibly PQ-secure authorization layer. Historically, the pushback against using STARKs for client-side or decentralized AA proving has been the severe hardware requirements and proving latency at scale.

I recently open-sourced the Qingming ZKP Engine, which directly attacks this FRI proving bottleneck using consumer-grade AMD GPUs (ROCm/HIP), and I believe it could make the STARK instantiation of ZK-ACE highly practical today without needing enterprise clusters.

By taking over the unsafe pointer lifecycle in Rust to bypass standard memory transfers and mapping arrays directly to the AMD 96MB Infinity Cache (Zero-Copy), alongside mathematically reducing the Fermat modular inversions in the fold loop into O(1) scalar multiplications (Zero-Inversions), we achieved the following on a single $999 RX 7900 XTX:

  • NTT ($2^{24}$ scale): 18.94 ms

  • Merkle Tree (L0): 763.3 ms

  • FRI Prove (End-to-End, 16.7M leaves): 2.56 s

If your ZK-ACE identity commitments and authorization bindings were instantiated over a Goldilocks field STARK, the proving time on consumer hardware would be virtually instantaneous with this engine.

Would love for you or anyone in the AA/PQC research space to check out the host-side benchmark logic. This kind of hardware-layer dimensionality reduction might perfectly complement the architectural dimensionality reduction you just presented.

GitHub: qingming-zkp

This is an amazing benchmark — phenomenal work on the Zero-Copy + Zero-Inversions approach. The 2.56s end-to-end FRI prove on a consumer RX 7900 XTX is a game-changer for decentralized proving.

Your Qingming ZKP engine and ZK-ACE form a strong complementary pair — your hardware-layer dimensionality reduction directly addresses the proving latency bottleneck we identified as the primary trade-off when choosing STARKs over Groth16.

Some context on where this could plug in: I’m building an MVP for a new L1 blockchain with an n-VM runtime architecture that natively executes EVM (revm Shanghai), SVM (Solana), BVM (Bitcoin Script), and TVM — all within a unified state tree. The chain runs dual-algorithm native cryptography: classical Ed25519 and post-quantum ML-DSA-44 (FIPS 204) in parallel at every protocol level.

Based on our architectural modeling, the n-VM runtime projects the following throughput estimates:

  • Theoretical EVM ceiling: 20,000–100,000 TPS (single-core to parallel scheduling)
  • Projected sustained (simple transfers): 3,000–5,000 TPS
  • Projected sustained (complex contracts, e.g. DEX swaps): 500–1,500 TPS
  • Projected sustained (mixed workload): 1,000–3,000 TPS

These figures are before ZK proving enters the critical path. If Qingming’s GPU-accelerated FRI could handle our per-block STARK proof generation at sub-second latency on consumer hardware, it would remove the last major bottleneck standing between our architecture and true sub-second cryptographic hard finality — without requiring enterprise GPU clusters.

Would love to explore integration. The combination of your hardware-layer optimization with our protocol-layer identity–authorization separation could be quite compelling.

P.S. — Actually we’ve already implemented the MVP: 3-node devnet with block production, on-chain ML-DSA-44 signed transfers, leader rotation, and a browser-based explorer. One observation from our architectural analysis: because ML-DSA-44 verification (~50μs) is actually faster than Ed25519 (~76μs), and our ZK-ACE attestation model eliminates per-transaction signature verification from the critical path entirely, our runtime can achieve throughput comparable to or exceeding current Solana mainnet TPS — even when every transaction is authorized with post-quantum credentials. To our knowledge, this would make it the first blockchain architecture where post-quantum cryptography imposes zero performance penalty relative to classical algorithms, potentially making it production-viable as a PQC-native L1 today rather than as a future migration target.

Beyond ZK proving, we’re facing a number of Rust-level performance optimization challenges across the runtime — state I/O, parallel scheduling, gossipsub propagation. Having looked through your Qingming codebase, your expertise in low-level Rust + GPU optimization is exactly the kind of skill set that could accelerate this work significantly. Would it be okay if I DM you?

Update: Dual-Backend Implementation with Circle STARK (Post-Quantum Secure)

Since the original post, ZK-ACE has been significantly rearchitected. The key updates:

1. Pluggable dual-backend architecture is now implemented and benchmarked.

The original post described a Groth16-only prototype. The reference implementation now ships two compile-time selectable backends:

| Aspect | Circle STARK (Stwo), default | Groth16/BN254 |
|---|---|---|
| Field | Mersenne-31 | BN254 Fr (~254-bit) |
| Hash | Poseidon2 (width=16) | Poseidon (width=3) |
| Constraints | ~240 AIR | ~1,200 R1CS |
| Prove | 21 ms | 44 ms |
| Verify | 1.1 ms | 1.5 ms |
| Proof size | ~105 KB | 128 B |
| PQ-secure | Yes | No |
| Setup | Transparent | Trusted |
Benchmarked on Apple Silicon, Criterion.rs medians, single-threaded.

2. Constraint count correction. The original post reported 4,024 R1CS constraints from an early prototype. The production Groth16 circuit is ~1,200 R1CS. The STARK backend compiles to ~240 AIR constraints. Both remain roughly three orders of magnitude smaller than in-circuit ML-DSA verification.

3. The STARK backend is faster than Groth16. This is counterintuitive but follows directly from field arithmetic: M31 operations (31-bit) are an order of magnitude cheaper than BN254 (254-bit), and STARK proving requires no elliptic curve MSMs. For small circuits, this advantage dominates.
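To illustrate why M31 arithmetic is cheap, here is a small Python sketch of multiplication modulo $2^{31} - 1$. Because $2^{31} \equiv 1$ modulo a Mersenne prime, reduction needs only shifts, masks, and additions, with no division. This is a generic illustration of Mersenne reduction, not code from Stwo or the ZK-ACE repository.

```python
M31 = (1 << 31) - 1  # the Mersenne prime 2^31 - 1

def m31_reduce(x: int) -> int:
    """Reduce 0 <= x < 2^62 modulo M31 using 2^31 = 1 (mod M31)."""
    x = (x & M31) + (x >> 31)  # fold high bits onto low bits; result < 2^32
    x = (x & M31) + (x >> 31)  # second fold; result <= M31
    return 0 if x == M31 else x

def m31_mul(a: int, b: int) -> int:
    """Multiply two field elements; the 62-bit product fits one 64-bit register."""
    return m31_reduce(a * b)

a, b = 123_456_789, 987_654_321
print(m31_mul(a, b) == (a * b) % M31)
```

A BN254 field multiplication, by contrast, works on 254-bit operands spread over several machine words and needs a multi-limb reduction such as Montgomery multiplication.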

4. On-chain cost with mandatory STARK aggregation:

Individual STARK proofs (~105 KB) never go on-chain. The block builder aggregates all per-transaction proofs into a single batch proof. Per-transaction on-chain data is only the public inputs:

| Model | Per-tx on-chain |
|---|---|
| ML-DSA-65 (direct) | ~5,261 B |
| ZK-ACE STARK (aggregated) | ~160 B |
| ZK-ACE Groth16 | ~288 B |

This is roughly a 33x compression vs ML-DSA-65 under the STARK model (5,261 B / 160 B), with full post-quantum security and transparent setup.

5. The entire authorization path is now PQ-secure. Under the STARK backend, identity commitment → proof generation → on-chain verification relies only on hash functions (Poseidon2 + Blake2s). No elliptic curve assumptions anywhere.

The Groth16 backend remains available for EVM-native deployments where proof compactness matters more than PQ security.

Code: github.com/acechain-io/zk-ace
Paper: arXiv:2603.07974

Basically STARKing an HMAC? Cool!