What if post-quantum Ethereum doesn’t need signatures at all?

TL;DR: Current PQC migration plans assume we must verify post-quantum signatures — either on-chain (kilobytes per tx) or inside ZK circuits (millions of constraints). We present an alternative: prove authorization semantics directly in ZK, without any signature object. Result: 4,024 R1CS constraints, 128-byte proofs, 52 ms proving time. The construction is proof-system agnostic (Groth16, PLONK, STARKs all work) and deployable today as an AA validator module — no protocol changes required.


Motivation: The PQC data wall

The community has been actively exploring PQC migration paths (13) and the core tension is well-known:

| Scheme | Sig size | Public key | Total per tx |
|---|---|---|---|
| ML-DSA-44 (Level 2) | 2,420 B | 1,312 B | 3,732 B |
| ML-DSA-65 (Level 3) | 3,309 B | 1,952 B | 5,261 B |
| ML-DSA-87 (Level 5) | 4,627 B | 2,592 B | 7,219 B |
| SLH-DSA-128f | 17,088 B | 32 B | 17,120 B |
| FN-DSA-512 | ~666 B | 897 B | 1,563 B |
| Ed25519 (classical) | 64 B | 32 B | 96 B |

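Relative to Ed25519’s 96 B, that’s roughly a 39-75x increase in authorization data per transaction for the ML-DSA parameter sets (about 16x for FN-DSA-512 and 178x for SLH-DSA-128f). In rollup architectures where calldata/blob space is explicitly priced, this is a first-order scalability problem.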
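The per-transaction totals and their ratio to the Ed25519 baseline can be recomputed directly from the table. This is a minimal sketch; the dictionary and function names are illustrative and not part of any library.

```python
# Sizes in bytes, copied from the table above: (signature, public key).
SCHEMES = {
    "ML-DSA-44": (2420, 1312),
    "ML-DSA-65": (3309, 1952),
    "ML-DSA-87": (4627, 2592),
    "SLH-DSA-128f": (17088, 32),
    "FN-DSA-512": (666, 897),  # the table gives the signature size as approximate
    "Ed25519": (64, 32),
}

def total_bytes(name: str) -> int:
    """Signature plus public key, i.e. the authorization data carried per transaction."""
    sig, pk = SCHEMES[name]
    return sig + pk

def ratio_vs_ed25519(name: str) -> float:
    """How many times larger the scheme's per-tx data is than Ed25519's."""
    return total_bytes(name) / total_bytes("Ed25519")

for name in SCHEMES:
    print(f"{name:13s} {total_bytes(name):6d} B  {ratio_vs_ed25519(name):6.1f}x")
```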

The “obvious” solution and why it’s expensive

The natural response is: verify PQ signatures inside ZK circuits, post only the succinct proof on-chain.

The problem: lattice-based signature verification requires NTTs over degree-256 polynomial rings in $\mathbb{Z}_q$ ($q = 2^{23} - 2^{13} + 1$), emulated over a ~254-bit proof-system field. Structural lower bounds:
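To make the arithmetic concrete, here is a small sketch of the two ingredients named above: the NTT-friendly structure of $q$ and the witness needed to emulate one multiplication mod $q$ inside a larger field. The helper names are hypothetical, and the range-check cost is only described in comments, not modeled.

```python
Q = 2**23 - 2**13 + 1   # ML-DSA modulus from the text (8380417)
N = 256                 # ring degree from the text

def find_primitive_2n_root(q: int, n: int) -> int:
    """Smallest z with z**n == -1 (mod q). For n a power of two this means
    z has order exactly 2n, which is what a negacyclic NTT of size n needs."""
    for z in range(2, q):
        if pow(z, n, q) == q - 1:
            return z
    raise ValueError("no primitive 2n-th root of unity mod q")

def mul_mod_q_witness(a: int, b: int) -> tuple[int, int]:
    """Witness for one emulated multiplication: a*b = k*Q + r with 0 <= r < Q.
    Over a ~254-bit native field the product (< 2**46) cannot wrap around,
    but k and r still have to be range-checked, which is where constraints go."""
    k, r = divmod(a * b, Q)
    assert a * b == k * Q + r and 0 <= r < Q
    return k, r

zeta = find_primitive_2n_root(Q, N)
print(Q, Q.bit_length(), zeta)
```

Every butterfly in every NTT needs a reduction of this kind, which is why the native-field constraint count grows so quickly.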

| In-circuit verification | R1CS constraints | Dominant cost |
|---|---|---|
| ML-DSA-44 verify | ≥ 2M | 4×4 NTTs + non-native mod. arith. |
| ML-DSA-65 verify | ≥ 4M | 6×5 NTTs + non-native mod. arith. |
| FN-DSA-512 verify | ≥ 1M | FFT + Gram-Schmidt + mod. arith. |
| SLH-DSA verify | ≥ 5M | WOTS+ chains + Merkle trees |
| ECDSA verify (classical) | ~1.5M | scalar mul. + mod. inverse |

Even with optimized gadgets, we’re looking at millions of constraints just to prove “this signature is valid.”

Key observation: authorization ≠ signatures

Here’s the thing: at the consensus layer, blockchains don’t actually require verification of a specific signature object. What consensus requires is assurance that a transaction was authorized by the correct entity.

Signatures are an implementation artifact for expressing authorization — not authorization itself. We’ve been conflating the two.

ZK-ACE: identity-centric authorization

We present ZK-ACE (Zero-Knowledge Authorization for Cryptographic Entities), which takes this observation to its logical conclusion:

Don’t verify signatures in ZK. Don’t compress signatures. Eliminate signature objects from the authorization path entirely.

Instead, the chain stores a compact identity commitment (32 bytes):

$$ID_{com} = H(REV | salt | domain)$$

where REV is a 256-bit identity root derived from a deterministic identity derivation primitive (DIDP). Each transaction carries a ZK proof attesting:

  1. (C1) Commitment consistency: Prover knows a preimage of $ID_{com}$
  2. (C2) Derivation correctness: A target-binding hash is consistent with deterministic key derivation under the identity root
  3. (C3) Authorization binding: The identity root has authorized this specific TxHash
  4. (C4) Anti-replay: Nonce commitment or nullifier is correctly derived
  5. (C5) Domain separation: All bindings use the declared chain/domain tag

The entire circuit is 5 Poseidon hash invocations + equality constraints. No lattice arithmetic. No signature verification logic. No non-native field emulation.
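The relation (C1)–(C5) can be sketched outside a circuit as five hash evaluations plus equality checks. This is only an illustration under stated assumptions: SHA-256 stands in for Poseidon, and the input layout of each hash (which values feed C2, C3, and C4) is hypothetical, not taken from the paper or the reference implementation.

```python
import hashlib

def H(*parts: bytes) -> bytes:
    """Stand-in for the circuit hash (Poseidon in the reference implementation).
    Each part is length-prefixed so the concatenation is unambiguous."""
    h = hashlib.sha256()
    for p in parts:
        h.update(len(p).to_bytes(4, "big"))
        h.update(p)
    return h.digest()

def public_inputs(rev, salt, ctx, nonce, tx_hash, domain):
    """Values an honest prover would publish. Hypothetical layout, 5 hash calls."""
    id_com = H(rev, salt, domain)                   # C1: commitment consistency
    target = H(H(rev, ctx, domain), domain)         # C2: derivation, then target binding
    auth = H(rev, target, tx_hash, nonce, domain)   # C3: binds the root to this TxHash
    replay = H(rev, nonce, domain)                  # C4: nonce commitment / nullifier
    return {"id_com": id_com, "target": target, "auth": auth,
            "replay": replay, "tx_hash": tx_hash, "domain": domain}

def relation_holds(pub, rev, salt, ctx, nonce) -> bool:
    """What the proof attests: a witness (rev, salt, ctx, nonce) exists that
    reproduces every public value. C5 holds because the same declared domain
    tag is fed into every hash."""
    return pub == public_inputs(rev, salt, ctx, nonce, pub["tx_hash"], pub["domain"])
```

In the real system the verifier never sees the witness; it checks a proof that `relation_holds` would return true for some witness known to the prover.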

Benchmarks (reference implementation)

Implementation: arkworks + Groth16 over BN254, Poseidon ($t=3$, $\alpha=17$, 8 full + 57 partial rounds).

Circuit size:

| Constraint | Inputs | Hash calls | R1CS |
|---|---|---|---|
| (C1) Commitment consistency | 3 | 1 | 805 |
| (C2) Derivation correctness | 4+1 | 2 | 1,200 |
| (C3) Authorization binding | 7 | 1 | 1,615 |
| (C4) Replay prevention | 2 | 1 | 400 |
| (C5) Domain sep. + enforce_equal | | | 4 |
| Total | | 5 | 4,024 |

Both replay modes (nonce-registry and nullifier-set) produce identical constraint counts.
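On the verifier side, the two replay modes differ only in bookkeeping. The sketch below is a simplified illustration with hypothetical class names; it tracks a plain nonce rather than the nonce commitment the circuit actually exposes.

```python
class NonceRegistry:
    """Mode 1: track the next expected nonce per identity commitment."""

    def __init__(self):
        self._next = {}

    def accept(self, id_com: bytes, nonce: int) -> bool:
        expected = self._next.get(id_com, 0)
        if nonce != expected:
            return False          # stale or out-of-order nonce: reject
        self._next[id_com] = expected + 1
        return True


class NullifierSet:
    """Mode 2: remember every spent nullifier and reject repeats."""

    def __init__(self):
        self._spent = set()

    def accept(self, nullifier: bytes) -> bool:
        if nullifier in self._spent:
            return False          # same nullifier seen before: replay
        self._spent.add(nullifier)
        return True
```

The registry keeps constant state per identity but forces sequential ordering; the nullifier set allows any order at the cost of state that grows with every transaction.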

Performance (single-threaded, Apple M3 Pro, Criterion.rs, 100 samples):

| Operation | Median | 95% CI |
|---|---|---|
| Trusted setup (one-time) | 45.6 ms | [45.4, 45.8] ms |
| Prove (per transaction) | 52.3 ms | [51.5, 53.4] ms |
| Verify (per transaction) | 604 μs | [600, 608] μs |

Proof size:

| Encoding | Proof | Public inputs | Total auth data |
|---|---|---|---|
| Compressed Groth16 | 128 B | 160 B (5 × 32 B) | 288 B |

Compression vs. PQ signatures:

| Scheme | PQ sig+pk | ZK-ACE | Reduction |
|---|---|---|---|
| ML-DSA-44 (Level 2) | 3,732 B | 288 B | 13x (92.3%) |
| ML-DSA-65 (Level 3) | 5,261 B | 288 B | 18x (94.5%) |
| ML-DSA-87 (Level 5) | 7,219 B | 288 B | 25x (96.0%) |
| SLH-DSA-128f | 17,120 B | 288 B | 59x (98.3%) |

Constraint comparison (the core result):

| Approach | R1CS constraints |
|---|---|
| ZK-ACE (this work) | 4,024 |
| In-circuit ML-DSA-44 verify | ≥ 2,000,000 |
| In-circuit ECDSA verify | ~1,500,000 |

That’s a ~500x constraint reduction — not from optimizing signature verification, but from not doing it at all.
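Both headline ratios follow from the figures already tabulated. A short sketch that re-derives them (the names are illustrative only):

```python
ZK_ACE_BYTES = 128 + 5 * 32   # compressed Groth16 proof + five 32-byte public inputs

# sig + pk totals in bytes, from the PQC table.
PQ_BYTES = {"ML-DSA-44": 3732, "ML-DSA-65": 5261, "ML-DSA-87": 7219, "SLH-DSA-128f": 17120}

def data_reduction(pq_bytes: int) -> tuple[float, float]:
    """Return (factor, percent saved) when sig+pk is replaced by proof+public inputs."""
    return pq_bytes / ZK_ACE_BYTES, 100.0 * (1 - ZK_ACE_BYTES / pq_bytes)

def constraint_reduction(baseline: int, zk_ace: int = 4024) -> float:
    """Ratio of a baseline constraint count to the ZK-ACE circuit size."""
    return baseline / zk_ace

for name, size in PQ_BYTES.items():
    factor, pct = data_reduction(size)
    print(f"{name:13s} {factor:5.1f}x  {pct:4.1f}%")
print(f"{constraint_reduction(2_000_000):.0f}x fewer constraints than the ML-DSA-44 lower bound")
```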

Deployment: AA validator module

ZK-ACE is designed as an ERC-4337 validator module. In an AA wallet:

  • The account validation logic invokes ZK-ACE verification instead of checking a classical or PQ signature
  • Proof generation happens client-side (~52 ms)
  • The bundler transports proof + public inputs (untrusted, learns nothing)
  • On-chain verification is the same check that takes ~604 μs per proof in the native benchmark above

This means no protocol-level changes are required. ZK-ACE can be deployed on existing infrastructure.
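The validation flow can be modeled in a few lines. This is a language-neutral sketch in Python, not contract code: `Account`, `PublicInputs`, `validate_user_op`, and the injected `verify_proof` callback are all hypothetical names, and replay bookkeeping is left out to keep the example short.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass(frozen=True)
class PublicInputs:
    id_com: bytes    # identity commitment the proof is bound to
    tx_hash: bytes   # hash of the operation being authorized
    domain: bytes    # chain/domain tag

@dataclass
class Account:
    id_com: bytes    # commitment stored on-chain for this wallet
    domain: bytes    # domain tag this wallet accepts

def validate_user_op(
    account: Account,
    user_op_hash: bytes,
    public: PublicInputs,
    proof: bytes,
    verify_proof: Callable[[PublicInputs, bytes], bool],
) -> bool:
    """Accept only if the public inputs match on-chain state and the proof verifies."""
    if public.id_com != account.id_com:
        return False                      # proof is for a different identity
    if public.domain != account.domain:
        return False                      # cross-domain proof
    if public.tx_hash != user_op_hash:
        return False                      # proof authorizes a different operation
    return verify_proof(public, proof)    # backend-specific check (Groth16, STARK, ...)
```

Passing the verifier in as a callback mirrors the proof-system-agnostic design: the account logic stays the same when the backend changes.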

Proof-system agnostic by design

An important design property: ZK-ACE is a protocol-level authorization model, not a proof-system-specific construction. The five constraints (C1)–(C5) are stated over abstract hash evaluations and equality checks. They can be instantiated with:

| Proof system | Setup | Proof size | Trade-off |
|---|---|---|---|
| Groth16 (reference impl.) | Trusted (per-circuit) | 128 B | Smallest proof, fastest verify |
| PLONK / KZG | Universal (one-time) | ~400–600 B | No per-circuit setup |
| STARKs / FRI | Transparent (none) | ~40–100 KB | No trusted setup, plausibly PQ-secure |
| Bulletproofs / IPA | Transparent | ~700 B | No setup, larger verify cost |

The benchmarks above use Groth16 because it gives the tightest numbers, but the protocol doesn’t depend on it. In particular, a STARK instantiation would make the entire authorization pipeline plausibly post-quantum at the proof layer as well — no trusted setup, no pairing assumptions, hash-based soundness only. The identity commitments are proof-system-agnostic (they’re just hash outputs), so migrating from one proof system to another does not require identity rotation or re-registration.

The security reductions in the paper are stated generically in terms of knowledge-soundness advantage $\text{Adv}^{ks}$ and are compatible with any backend satisfying completeness, knowledge soundness, zero-knowledge, and public-input binding.

Assumed primitive: DIDP

ZK-ACE assumes a Deterministic Identity Derivation Primitive (DIDP) as a black box — any framework providing:

  • Deterministic key derivation from a high-entropy root
  • Context isolation across derivation paths
  • Identity-root recovery hardness

This is not tied to any specific construction. A simple HKDF(root, context) satisfies the interface. We provide ACE-GF as an instantiation; any KDF with domain separation works.
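As an illustration of the interface, here is a minimal HKDF (RFC 5869) with SHA-256 built from the standard library. Treating the context string as HKDF's `info` parameter is an assumption made for this sketch; it is not how ACE-GF is specified.

```python
import hashlib
import hmac

HASH_LEN = 32  # SHA-256 output size in bytes

def hkdf_sha256(root: bytes, context: bytes, length: int = 32, salt: bytes = b"") -> bytes:
    """HKDF: extract a pseudorandom key from `root`, then expand it under `context`."""
    if length > 255 * HASH_LEN:
        raise ValueError("requested output too long for HKDF")
    # Extract: PRK = HMAC(salt, root); an absent salt defaults to HASH_LEN zero bytes.
    prk = hmac.new(salt or b"\x00" * HASH_LEN, root, hashlib.sha256).digest()
    # Expand: T(i) = HMAC(PRK, T(i-1) || context || i), concatenated until long enough.
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + context + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]

root = bytes(range(32))  # placeholder root; a real one must be high-entropy and secret
evm_key = hkdf_sha256(root, b"evm/account/0")
svm_key = hkdf_sha256(root, b"svm/account/0")
```

The same root and context always give the same key (determinism), while different contexts give unrelated keys (context isolation). Recovery hardness rests on the root's entropy and the one-wayness of HMAC.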

Security

Four game-based security definitions with reduction-based proofs under standard assumptions:

  • Authorization soundness → reduces to knowledge soundness + collision resistance + DIDP recovery hardness
  • Replay resistance → reduces to authorization soundness + verifier enforcement
  • Substitution resistance → reduces to public-input binding of the proof system
  • Cross-domain separation → reduces to collision resistance + public-input binding

Full proofs in the paper.

What this is NOT

To be explicit:

  • Not ZK-verification of PQ signatures (we don’t verify any signature inside the circuit)
  • Not signature compression (we eliminate signatures, not shrink them)
  • Not a new signature scheme
  • Not dependent on any specific identity framework

It’s a change in what we prove: from “this signature is valid” to “this identity authorized this transaction.”

Relation to existing discussions

This work connects to the ongoing PQC migration discourse:

These approaches all preserve the signature-centric model. ZK-ACE asks: what if we don’t?


Paper: ZK-ACE: Identity-Centric Zero-Knowledge Authorization for Post-Quantum Blockchain Systems

Reference implementation: github.com/ya-xyz/zk-ace

Brilliant write-up and a very elegant paradigm shift away from signature-centric verification. The 4,024 R1CS constraint reduction is massive for AA validator modules.

I want to specifically touch on your point regarding the STARKs / FRI instantiation being the ideal path for a fully transparent, plausibly PQ-secure authorization layer. Historically, the pushback against using STARKs for client-side or decentralized AA proving has been the severe hardware requirements and proving latency at scale.

I recently open-sourced the Qingming ZKP Engine, which directly attacks this FRI proving bottleneck using consumer-grade AMD GPUs (ROCm/HIP), and I believe it could make the STARK instantiation of ZK-ACE highly practical today without needing enterprise clusters.

By taking over the unsafe pointer lifecycle in Rust to bypass standard memory transfers and mapping arrays directly to the AMD 96MB Infinity Cache (Zero-Copy), alongside mathematically reducing the Fermat modular inversions in the fold loop into O(1) scalar multiplications (Zero-Inversions), we achieved the following on a single $999 RX 7900 XTX:

  • NTT ($2^{24}$ scale): 18.94 ms

  • Merkle Tree (L0): 763.3 ms

  • FRI Prove (End-to-End, 16.7M leaves): 2.56 s

If your ZK-ACE identity commitments and authorization bindings were instantiated over a Goldilocks field STARK, the proving time on consumer hardware would be virtually instantaneous with this engine.

Would love for you or anyone in the AA/PQC research space to check out the host-side benchmark logic. This kind of hardware-layer dimensionality reduction might perfectly complement the architectural dimensionality reduction you just presented.

GitHub: qingming-zkp

This is an amazing benchmark — phenomenal work on the Zero-Copy + Zero-Inversions approach. The 2.56s end-to-end FRI prove on a consumer RX 7900 XTX is a game-changer for decentralized proving.

Your Qingming ZKP engine and ZK-ACE form a strong complementary pair — your hardware-layer dimensionality reduction directly addresses the proving latency bottleneck we identified as the primary trade-off when choosing STARKs over Groth16.

Some context on where this could plug in: I’m building an MVP for a new L1 blockchain with an n-VM runtime architecture that natively executes EVM (revm Shanghai), SVM (Solana), BVM (Bitcoin Script), and TVM — all within a unified state tree. The chain runs dual-algorithm native cryptography: classical Ed25519 and post-quantum ML-DSA-44 (FIPS 204) in parallel at every protocol level.

Based on our architectural modeling, the n-VM runtime projects the following throughput estimates:

  • Theoretical EVM ceiling: 20,000–100,000 TPS (single-core to parallel scheduling)
  • Projected sustained (simple transfers): 3,000–5,000 TPS
  • Projected sustained (complex contracts, e.g. DEX swaps): 500–1,500 TPS
  • Projected sustained (mixed workload): 1,000–3,000 TPS

These figures are before ZK proving enters the critical path. If Qingming’s GPU-accelerated FRI could handle our per-block STARK proof generation at sub-second latency on consumer hardware, it would remove the last major bottleneck standing between our architecture and true sub-second cryptographic hard finality — without requiring enterprise GPU clusters.

Would love to explore integration. The combination of your hardware-layer optimization with our protocol-layer identity–authorization separation could be quite compelling.

P.S. — Actually we’ve already implemented the MVP: 3-node devnet with block production, on-chain ML-DSA-44 signed transfers, leader rotation, and a browser-based explorer. One observation from our architectural analysis: because ML-DSA-44 verification (~50μs) is actually faster than Ed25519 (~76μs), and our ZK-ACE attestation model eliminates per-transaction signature verification from the critical path entirely, our runtime can achieve throughput comparable to or exceeding current Solana mainnet TPS — even when every transaction is authorized with post-quantum credentials. To our knowledge, this would make it the first blockchain architecture where post-quantum cryptography imposes zero performance penalty relative to classical algorithms, potentially making it production-viable as a PQC-native L1 today rather than as a future migration target.

Beyond ZK proving, we’re facing a number of Rust-level performance optimization challenges across the runtime — state I/O, parallel scheduling, gossipsub propagation. Having looked through your Qingming codebase, your expertise in low-level Rust + GPU optimization is exactly the kind of skill set that could accelerate this work significantly. Would it be okay if I DM you?

Update: Dual-Backend Implementation with Circle STARK (Post-Quantum Secure)

Since the original post, ZK-ACE has been significantly rearchitected. The key updates:

1. Pluggable dual-backend architecture is now implemented and benchmarked.

The original post described a Groth16-only prototype. The reference implementation now ships two compile-time selectable backends:

| Aspect | Circle STARK (Stwo), default | Groth16/BN254 |
|---|---|---|
| Field | Mersenne-31 | BN254 Fr (~254-bit) |
| Hash | Poseidon2 (width=16) | Poseidon (width=3) |
| Constraints | ~240 AIR | ~1,200 R1CS |
| Prove | 21 ms | 44 ms |
| Verify | 1.1 ms | 1.5 ms |
| Proof size | ~105 KB | 128 B |
| PQ-secure | Yes | No |
| Setup | Transparent | Trusted |

Benchmarked on Apple Silicon, Criterion.rs medians, single-threaded.

2. Constraint count correction. The original post reported 4,024 R1CS constraints from an early prototype. The production Groth16 circuit is ~1,200 R1CS. The STARK backend compiles to ~240 AIR constraints. Both remain roughly three orders of magnitude smaller than in-circuit ML-DSA verification.

3. The STARK backend is faster than Groth16. This is counterintuitive but follows directly from field arithmetic: M31 operations (31-bit) are an order of magnitude cheaper than BN254 (254-bit), and STARK proving requires no elliptic curve MSMs. For small circuits, this advantage dominates.
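One reason M31 arithmetic is cheap is that reduction modulo $2^{31} - 1$ needs no division. The sketch below shows the structure of that reduction; Python integers do not reflect hardware cost, so it demonstrates the technique rather than the speedup.

```python
P = (1 << 31) - 1  # Mersenne-31 prime

def reduce_m31(x: int) -> int:
    """Reduce 0 <= x < 2**62 modulo P with shifts, masks, and adds only.
    Works because 2**31 is congruent to 1 mod P, so the high bits fold onto the low bits."""
    x = (x & P) + (x >> 31)   # first fold: result is at most 2**32 - 2
    x = (x & P) + (x >> 31)   # second fold: result is at most P
    return 0 if x == P else x

def mul_m31(a: int, b: int) -> int:
    """Multiply two field elements (each < P); the product fits in 62 bits."""
    return reduce_m31(a * b)
```

A BN254 multiplication, by contrast, works on four 64-bit limbs and needs a multi-limb Montgomery reduction.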

4. On-chain cost with mandatory STARK aggregation:

Individual STARK proofs (~105 KB) never go on-chain. The block builder aggregates all per-transaction proofs into a single batch proof. Per-transaction on-chain data is only the public inputs:

| Model | Per-tx on-chain |
|---|---|
| ML-DSA-65 (direct) | ~5,261 B |
| ZK-ACE STARK (aggregated) | ~160 B |
| ZK-ACE Groth16 | ~288 B |

This is a ~33x compression vs ML-DSA-65 (5,261 B / 160 B ≈ 32.9) under the STARK model, with full post-quantum security and transparent setup.

5. The entire authorization path is now PQ-secure. Under the STARK backend, identity commitment → proof generation → on-chain verification relies only on hash functions (Poseidon2 + Blake2s). No elliptic curve assumptions anywhere.

The Groth16 backend remains available for EVM-native deployments where proof compactness matters more than PQ security.

Code: github.com/acechain-io/zk-ace
Paper: arXiv:2603.07974

Basically STARKing an HMAC? Cool!

Yes. It’s super cool, actually. We eliminated the PQC performance penalty completely. We reached 570+ TPS with both Ed25519 and ML-DSA-44 on a local 3-node devnet running on a MacBook Pro M3 with 12 cores and 36 GB of RAM. I believe we are the first to eliminate the PQC performance penalty and reach such high TPS on a commodity device.

I think the biggest innovation coming from your proposal is that you’re connecting two different areas of cryptography that don’t seem to have managed to talk to each other so far. Up until now it was common knowledge that an HMAC scheme can replace signatures only in a symmetric setting because both the “signer” and the verifier need to know the secret key, but zkS{N,T}ARKs change that completely indeed! Now a verifier can reliably verify an HMAC without having to know the secret key, just like in asymmetric cryptography!

The resulting “signature” scheme will probably never be as efficient as an actual signature: Dilithium signatures are very large but zkSTARKs are even larger, unfortunately. But this idea has the advantage of being highly aggregatable: if all transaction signatures are “STARK’d HMACs” you get to prove a whole block with a single zkSTARK, easily. While verifying Dilithium in AIR… good luck with that.

Very cool idea, thanks for sharing it!

I’ll be honest, I seem to be somewhat of a competitor of yours. :slight_smile: I co-founded a brand new, post-quantum L1 project and after reading this thread we’ve totally decided to follow this pattern.

PS: in our project we’re gonna call this pattern “zkMAC”. The name fits much better than “zkACE”.

Hi 71004,

Glad to see you are interested in adopting this technology!

Since you are looking to integrate this pattern into an L1 project, instead of fragmenting the effort with a separate implementation, I’d like to invite you to contribute directly to the original ACE repository.

Collaborating on the upstream source is the best way to support the open-source community, and it ensures the tech stays standardized and robust for everyone. It would be much more efficient for the ecosystem to have a single, well-audited implementation than multiple divergent ones.

Let’s work together to make ACE the standard for zk-based MACs — I actually think “zkMAC” is a great technical descriptor for this construction! What do you think?

I’ve been re-reading this part, why are 5 hashes needed? It looks to me like 2 are sufficient: one to compute the identity commitment and another one to HMAC the message.

I’ve recently finished implementing this scheme in my project Libernet and that’s how I did it. This is the reusable chip: crypto/src/hmac.rs at d653000d3f3d21c671fac8cb6857e86afd677460 · libernet-mirror/crypto · GitHub

(I used a terminology that resembles actual signatures: I called your “REV” simply “private key” and your “identity commitment” simply “public key”.)
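For readers comparing the two designs, the two-hash relation described above can be sketched as follows. SHA-256 and HMAC-SHA256 are stand-ins chosen for this illustration; the linked chip uses its own hash, and the function names here are hypothetical.

```python
import hashlib
import hmac

def public_key(sk: bytes) -> bytes:
    """Hash 1: the identity commitment ("public key") derived from the secret."""
    return hashlib.sha256(b"zkmac-id" + sk).digest()

def tag(sk: bytes, msg: bytes) -> bytes:
    """Hash 2: a MAC over the message under the same secret."""
    return hmac.new(sk, msg, hashlib.sha256).digest()

def relation(pk: bytes, msg: bytes, t: bytes, sk: bytes) -> bool:
    """Statement a proof would attest, with sk as the private witness:
    the same sk opens pk and produces the tag t on msg."""
    return hmac.compare_digest(public_key(sk), pk) and hmac.compare_digest(tag(sk, msg), t)
```

The only cross-hash constraint is that both evaluations use the same `sk`, which is the single equality the proof system has to enforce.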

Deleted previous one by mistake.

Thanks for the rigorous engineering insights, Alberto! The 2-call variant utilizing Poseidon’s sequential Absorb state is a fantastic optimization and significantly reduces the R1CS constraints.

Regarding the terminology, I noticed you referred to REV as ‘private key’ and $ID_{com}$ as ‘public key’ in Libernet. While I understand the intuition behind using classical signature terms for familiarity, I strongly recommend adhering to the original ZK-ACE terminology (REV / Identity Commitment).

Introducing standard key-pair (PK/SK) terminology into an identity-centric authorization path can cause major conceptual confusion for developers. In ZK-ACE, the authorization semantics are explicitly decoupled from traditional signature verification objects.

To ensure we build a unified, interoperable standard for the broader post-quantum Ethereum ecosystem without fragmentation, let’s keep the codebase and implementation aligned under the ZK-ACE architecture. Looking forward to benchmarking your optimized pipeline further!

Hi Citrullin,
Can you give me any clue on how to apply this to social dynamics? I’m very interested in social science and am currently drafting some papers in the field. I’ll share them with you when they’re published.

I was looking at this table again. I’m not sure that Poseidon2 is still quantum-safe when it runs over such a small field as Mersenne-31, even if you’re using a state vector width of 16. For sure it would be utterly broken if the state width was 2 (even a classical computer can break that in a matter of seconds), so the question is whether or not the 16 elements of the state vector can be attacked individually, and whether or not Grover’s algorithm provides any advantages in that regard. I don’t know Poseidon2 well enough to speak for that, but for the best level of quantum-safety I think it’s best to use a 256-bit field.

Another odd choice you made is AIR arithmetization. AIR takes 4 extra columns for the memory argument, but for “zkMAC” the only equality constraint you really need is the one proving that the secret key used to derive the identity commitment is the same used to hash the HMAC. Memory is a complete overkill here, I think the permutation argument is much better in this case. You’d probably get better results with TurboPLONK.

Okay, I’ve just reviewed the original Poseidon2 paper. The proposed instantiations are on page 16, table 1, and the paper targets 128-bit classical security.

The classical security of Poseidon2 is generally $\frac{c}{2}$ bits, where $c$ is the total bit size of the capacity columns. Under Grover that becomes $\frac{c}{4}$.

Your instantiation is the one in the first row of the table, with $c = 8 \cdot 31 = 248$, making for ~124 bits of classical security and ~62 bits under Grover. That’s insufficient.

The instantiation at the second row of the table adds 8 more capacity columns and works much better for quantum safety: now we have $c = 16 \cdot 31 = 496$, so everything is doubled: ~248 bits of classical security and 124 bits under Grover. ✓
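The estimate above is simple enough to script. The sketch applies the rule of thumb exactly as stated in this reply ($c/2$ classical, $c/4$ under Grover); it is not an independent security analysis of Poseidon2.

```python
def capacity_security_bits(capacity_elems: int, field_bits: int = 31) -> tuple[int, int, int]:
    """Return (capacity bits c, classical estimate c/2, Grover estimate c/4)
    for a sponge with `capacity_elems` capacity columns over a `field_bits`-bit field."""
    c = capacity_elems * field_bits
    return c, c // 2, c // 4

for cols in (8, 16):
    c, classical, quantum = capacity_security_bits(cols)
    print(f"{cols:2d} capacity columns: c={c}, classical~{classical}, Grover~{quantum}")
```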

You may want to re-run your numbers for quantum safety. You need to add 8 columns and they aren’t gonna be free – it’s 8 more NTTs. The number of partial rounds also jumps to 22, so there will be more constraints and you’ll almost certainly exceed the next power of 2 and double your circuit size.