Parallelizing the EVM through end-of-block virtual transactions

Currently the EVM processes transactions sequentially. While for mainnet this is not a big deal, since blocks arrive slowly anyway, it is a problem for us at SKALE.

One would think that if transaction X modifies the state of contract A and transaction Y modifies the state of contract B, then X and Y could be processed in parallel.

The problem is that both A and B may call contract C. This couples the states of A and B and prevents parallel processing.

Here is how this can be solved:

When A wants to call C, instead of calling it directly, it creates a virtual transaction Z at the end of the block. This transaction calls C.

When all the original transactions have been processed, there is a set S1 of virtual transactions at the end of the block. These transactions are ordered by hash and processed as a virtual block B1. This can produce another virtual block B2, which is in turn processed by the EVM.

Ultimately the processing converges, producing an empty virtual block.
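A minimal Python sketch of this fixed-point loop, under my own assumptions about the interface: `execute(tx)` is a hypothetical function that runs one transaction against the state and returns the list of cross-contract calls it deferred, which become the virtual transactions of the next virtual block.

```python
import hashlib

def process_block(transactions, execute):
    """Process a block in rounds: each round runs its transactions,
    collects the deferred cross-contract calls as virtual transactions,
    and repeats until a virtual block comes up empty.

    `execute(tx)` is an assumed interface: it applies tx to the state
    and returns the calls tx deferred instead of making directly.
    """
    block = transactions
    rounds = []
    while block:
        deferred = []
        for tx in block:                  # within a round there are no
            deferred.extend(execute(tx))  # subcontract calls, so these
                                          # could run in parallel
        # Deterministic canonical ordering: sort deferred calls by hash.
        block = sorted(
            deferred,
            key=lambda tx: hashlib.sha256(repr(tx).encode()).digest())
        rounds.append(block)
    return rounds  # the last entry is the empty virtual block
```

For example, a transaction that defers one call which itself defers another converges after three rounds, the last round being empty.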

Note that at every processing stage, figuring out parallelism is easy: since there are no subcontract calls, two transactions that call two different contracts and have two different senders can be processed in parallel.
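That conflict rule can be turned into a simple greedy scheduler. This is my own illustration, not part of the proposal's spec: within one virtual block, two transactions conflict only if they share a sender or a target contract, so we pack each transaction into the first batch where it conflicts with nothing.

```python
def parallel_batches(txs):
    """Greedy batching sketch: txs is a list of (sender, contract)
    pairs. Transactions in the same batch share no sender and no
    target contract, so each batch can be executed in parallel."""
    batches = []  # each entry: (senders seen, contracts seen, members)
    for sender, contract in txs:
        for senders, contracts, members in batches:
            if sender not in senders and contract not in contracts:
                senders.add(sender)
                contracts.add(contract)
                members.append((sender, contract))
                break
        else:  # conflicts with every existing batch: open a new one
            batches.append(({sender}, {contract}, [(sender, contract)]))
    return [members for _, _, members in batches]
```

So `[("a", "C1"), ("b", "C2"), ("a", "C3"), ("c", "C1")]` splits into two batches: the repeated sender `a` and the repeated contract `C1` each force their second transaction into a later batch.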


There is a ton of prior art on these designs.


It’s mostly a social-coordination problem of getting multiple users of the EVM on BFT platforms to agree on an extension to the EVM spec, rather than a research/design problem at this point.


I was under the impression that the limiting factor is not compute but disk reads and writes, so these approaches were not attractive: they do not address the root problem, which is loading the state so you can change it.



I am hoping that this kind of thing will become interesting if we manage to introduce ReGenesis and limit the active state to something that fits into RAM; then we will have some interesting EVM optimisation problems, and I am looking forward to it.
Having said that, we are already coming close to these bottlenecks in some special cases, like running transactions in bulk when performing turbo-geth sync on a machine with a decent amount of memory. Since our representation of the current state is about 50 GB, and the active part can actually fit into RAM on a 32 GB machine, we are already getting to the point where we see bottlenecks in golang’s garbage collector, in JUMPDEST analysis, etc.