I’ve been doing some research on the Lean spec, specifically the Lean-Multisig; one thing that’s obvious is that post-quantum (PQ) security is being prioritised first, and that “a minimal zkVM” is being built for the sake of XMSS signature aggregation, rather than going for a full-on zkEVM spec at this stage.
The multisig-snarking has some interesting effects in my opinion:
Rather than cryptographic aggregation through pairings as in BLS, Lean actually verifies every signature computationally, then proves the validity of those verification computations as an aggregate; it is this proof that provides the succinctness of a snark.
Since aggregations are themselves aggregated recursively, signature verification is parallelised across the signing committee; this is paradigmatically different from pairing-based aggregation, which is pure cryptographic verification rather than computational proving.
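To make the shape of that concrete, here’s a rough sketch in Python of a recursive aggregation tree; the function names and `Proof` type are made up for illustration and are not the Lean-Multisig API. Leaves prove “this one signature verifies”, internal nodes prove “both child proofs verify”, and every pairing within a round is independent of the others, so each round can run in parallel.

```python
from dataclasses import dataclass

@dataclass
class Proof:
    claim: str  # human-readable stand-in for the statement being proven

def prove_signature(pk: str, msg: str) -> Proof:
    # Placeholder for a zkVM run of the XMSS verifier on one signature.
    return Proof(claim=f"verify({pk}, {msg})")

def prove_aggregate(left: Proof, right: Proof) -> Proof:
    # Placeholder for a recursive proof that both child proofs are valid.
    return Proof(claim=f"[{left.claim} AND {right.claim}]")

def aggregate(proofs: list[Proof]) -> Proof:
    # Each round roughly halves the proof count; every pairing within a
    # round is independent, so a round can be computed fully in parallel.
    while len(proofs) > 1:
        nxt = [prove_aggregate(proofs[i], proofs[i + 1])
               for i in range(0, len(proofs) - 1, 2)]
        if len(proofs) % 2 == 1:
            nxt.append(proofs[-1])  # odd proof carries up to the next round
        proofs = nxt
    return proofs[0]

leaves = [prove_signature(f"validator_{i}", "block_root") for i in range(5)]
print(aggregate(leaves).claim)
```

The key property is the log-depth: n signatures collapse into one proof in O(log n) aggregation rounds, with each round embarrassingly parallel across whoever holds the child proofs.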
This is of course usually brought up in the context of SSF, which becomes feasible at a level of efficiency that BLS simply doesn’t offer.
The thing I see nobody talking about is that this has, in effect, just parallelised the execution of the beacon-chain, and the implications of that being possible are spectacular; allow me to try and explain my thinking:
The conceptual separation of the execution and consensus chains is in part due to their disjoint nature: the beacon-chain computes the consensus-engine’s PoS and blob-verification, while the execution-chain’s STF computes the EVM and world-state; both are in effect separate state-machines, and the beacon machine encapsulates the Ethereum Virtual Machine.
Parallelising the consensus-engine is a big deal because of the scalability it brings; in a way, SSF is unbounded scalability on the consensus-layer. Now ask yourself: if theoretically optimal scalability through recursive aggregation of computational proofs could be implemented on the execution-layer, what would that look like? Theoretically, a network that leverages all of its available computational resources.
One reason this kind of parallelisation is possible on the beacon-chain is that validator signatures hold no context: the validity/existence of a signature is independent of the validity/existence of all other signatures, so they can be parallelised without bound; they never collide, and can thus be verified across disjoint processors. Perfect parallelisation in the execution-layer’s STF is challenging in contrast, as the world-state is a kind of shared context that can collide across different computations. But how far can we get?
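As a toy illustration of that context-freedom (the “signature” here is an arbitrary arithmetic stand-in, not real XMSS), each check depends only on its own inputs, so a pool of workers can verify in any order with no coordination:

```python
from concurrent.futures import ProcessPoolExecutor

def sign(pk: int, msg: int) -> int:
    # Toy "signature" so the example round-trips; not real cryptography.
    return (pk * 1_000_003 + msg) % (2**31 - 1)

def toy_verify(task) -> bool:
    # Depends only on its own (pk, msg, sig) tuple: no shared context.
    pk, msg, sig = task
    return sig == sign(pk, msg)

def verify_all(tasks) -> bool:
    # No state shared between tasks, so this scales with available cores.
    with ProcessPoolExecutor() as pool:
        return all(pool.map(toy_verify, tasks))

if __name__ == "__main__":
    tasks = [(pk, 42, sign(pk, 42)) for pk in range(1000)]
    print(verify_all(tasks))  # True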
The existence of BALs (block-level access lists) proves that parallelisation of the EVM is beneficial, but what’s the limit to the number of state-changes that can be parallelised? No matter how well you parallelise disjoint transactions, you eventually reach the point where the CPU runs out of threads; what if the disjoint execution then moves to other nodes? Here’s how it could work in practice:
Every slot an inclusion-list is established, perhaps similar to EIP-7805. The IL defines metadata denoting an ‘execution-column’ that selected txs fall into; each column contains txs that are disjoint in their side-effects from every other column’s txs, and thus columns can execute in parallel with one another. Think of it as partitioning txs into equivalence classes under a side-effect equivalence-relation. APS-style delegation can then be used to select ‘execution-committees’ to execute particular columns of the IL. As in the current Lean-Multisig, these computations are snarkified; the snarks are aggregated by a committee-leader (or any member) and sent to the block’s proposer, who recursively aggregates the committees’ snarks into a further snark that proves valid execution across the entire block.
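Here’s a hedged sketch of the column-partitioning step under one possible formalisation (the data shapes are illustrative, not from any spec): treat two txs as conflicting when their declared read/write sets overlap, BAL-style, and take the connected components of that conflict graph as the columns.

```python
from collections import defaultdict

def find(parent, x):
    # Path-halving find for the union-find structure.
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x

def columns(txs):
    """txs: list of (tx_id, keys), where keys is the set of state keys
    the tx reads or writes. Returns a partition of the txs such that
    txs in different columns touch pairwise-disjoint state."""
    parent = {tx_id: tx_id for tx_id, _ in txs}
    owner = {}  # state key -> some tx that already touched it
    for tx_id, keys in txs:
        for k in keys:
            if k in owner:
                # Two txs share a key: merge them into one column.
                ra, rb = find(parent, tx_id), find(parent, owner[k])
                if ra != rb:
                    parent[ra] = rb
            else:
                owner[k] = tx_id
    cols = defaultdict(list)
    for tx_id, _ in txs:
        cols[find(parent, tx_id)].append(tx_id)
    return list(cols.values())

txs = [("a", {"alice", "bob"}), ("b", {"bob", "carol"}), ("c", {"dave"})]
print(columns(txs))  # [['a', 'b'], ['c']]
```

In the example, ‘a’ and ‘b’ share the key ‘bob’ and so land in one column (serialised within it), while ‘c’ touches disjoint state and can be handed to a different execution-committee entirely.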
This in effect eliminates the upper-bound that exists in the network today where, for the sake of decentralisation, any home computer needs to be able to re-execute any block; that remains true even given full snarkification, as proposers still need to be able to fully execute their own block.
I think if you take this and combine it with EIP-4444 and EIP-7736 history and state expiry, full DAS, and something similar to a verkle-tree, you can potentially leverage close to the entire network’s resources on both the data and execution layers, thus achieving theoretically optimal scalability. You can optimise further still with new gas calculations that increase costs the less parallelisable a transaction is, so that users essentially pay a premium to access the more contended parts of the world-state. It’s worth noting that this multi-threaded gas could have prevented a situation like the DAO accumulating an outsized fraction of the total supply of ether, as users would have faced ever-larger gas costs as more and more of them mutated the same state in the world-computer. Gas costs that scale with how contended a tx’s state is thus incentivise account-abstraction and the use of one’s own code, which strengthens decentralisation.
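For what this multi-threaded gas could look like, here is one possible shape; the constants and window semantics are entirely made up. The cost of touching a state key scales with how many distinct actors touched that same key over a recent window, so hot state pays a premium while disjoint state stays at the base cost.

```python
BASE_COST = 100   # hypothetical flat cost per state-key access
ALPHA = 0.5       # hypothetical premium per recent contender

def access_cost(key, recent_accessors):
    """recent_accessors: state key -> set of senders that touched it
    within the last N slots (N being a protocol parameter)."""
    contenders = len(recent_accessors.get(key, set()))
    return int(BASE_COST * (1 + ALPHA * contenders))

def tx_gas(touched_keys, recent_accessors):
    # A tx's state-access gas is the sum over the keys it touches.
    return sum(access_cost(k, recent_accessors) for k in touched_keys)

recent = {"dao_balance": {f"user_{i}" for i in range(10)}}
print(tx_gas({"my_account"}, recent))   # 100: uncontended, base cost
print(tx_gas({"dao_balance"}, recent))  # 600: hot state pays 6x
```

Under something like this, a user operating on their own account pays the base cost, while piling into a single shared contract slot gets progressively more expensive, which is exactly the incentive gradient described above.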
In essence, the dilemma of a blockchain is that any one node must be capable of following the execution of the entire network in real time, since the only execution you can truly trust is that which happened on your own CPU. With zkEVM parallelism you can dissolve this bottleneck: succinct proofs attest to the validity of execution performed by untrusted nodes running the chain in parallel, achieving theoretically optimal scalability.
Thank you for reading.