On-chain scaling to potentially ~500 tx/sec through mass tx validation


How does the ZKSNARK handle incremental changes to the Merkle tree - what gets recomputed if the tree changes?

I wonder if using things like Merkle mountain ranges or some other more advanced crypto accumulators could improve the prover computation speed …

Also, how about letting miners compute the proofs and counting ZK-SNARKs towards the PoW? :wink: It also removes the problem of reserving the slots. Miners make a lot of money per block; they can just buy lots of machines on AWS to calculate the proof in several seconds.


How does the ZKSNARK handle incremental changes to the Merkle tree - what gets recomputed if the tree changes?

Suppose the transaction modifies the balances of accounts A[1]…A[n]. Basically, the ZK-SNARK should prove that there exists a witness W that proves the pre-state prev_A[1] … prev_A[n] is part of the Merkle root in the current state, and that if you recalculate the new Merkle root using the data in W and the given modifications, the result is whatever is specified as the new merkle root. All of this needs to be done inside the SNARK.
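A minimal sketch of that recomputation in plain Python, using SHA-256 as a stand-in for whatever hash the circuit actually uses: the witness W is the list of sibling hashes along the path from a modified leaf to the root, and both the pre-state check and the new-root computation walk the same path.

```python
import hashlib

def h(a: bytes, b: bytes) -> bytes:
    """Internal tree node: hash of the two children."""
    return hashlib.sha256(a + b).digest()

def root_from_path(leaf: bytes, index: int, siblings: list) -> bytes:
    """Walk from leaf to root using the sibling hashes (the witness W)."""
    node = leaf
    for sib in siblings:
        # the low bit of index says whether we are the left or right child
        node = h(node, sib) if index % 2 == 0 else h(sib, node)
        index //= 2
    return node

def apply_update(old_root, leaf_old, leaf_new, index, siblings):
    # 1. prove the pre-state leaf is under the current Merkle root
    assert root_from_path(leaf_old, index, siblings) == old_root
    # 2. recompute the root with the modified leaf, reusing the same siblings
    return root_from_path(leaf_new, index, siblings)
```

Only the O(log n) hashes on each modified leaf's path get recomputed; for n modified accounts the circuit repeats this per account (overlapping path segments can be shared). Inside the SNARK, both steps above become constraints rather than Python assertions.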


Nice, so the randomness needed to pick the points in the proof comes from the next block hash, a source the prover does not control or know in advance.

Why would you only publish the Merkle root in step (i)? If you published all the transactions in step (i), you would be able to include more transactions in a single block, because the proof comes in a separate block.

This can easily be incorporated in this scheme above, because the proofs are already coming at a later time.


Ah, I see. We could pass a bitmap of participants and add up public keys from storage to obtain the compound key pretty efficiently in EVM, leaving us with just one BLS verification. Right? For 1000 transactions that would indeed spare a lot of gas even without precompile. Cool!
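The bitmap-driven aggregation loop might look like the sketch below. Real BLS public keys are points on a pairing-friendly curve and "+" is elliptic-curve point addition; here an additive group of integers mod a placeholder prime stands in, purely to show the loop shape a contract would run.

```python
# Toy sketch of bitmap-driven key aggregation. In real BLS, public keys are
# curve points and "+" is point addition; integers mod P are a stand-in here.
P = 2**255 - 19  # placeholder modulus, NOT an actual BLS curve parameter

def aggregate_key(stored_pubkeys, bitmap: int):
    """Sum the public keys of all participants whose bitmap bit is set."""
    agg = 0
    for i, pk in enumerate(stored_pubkeys):
        if (bitmap >> i) & 1:
            agg = (agg + pk) % P  # point addition in the real scheme
    return agg
```

With the aggregate key in hand, a single signature verification covers the whole batch, which is where the gas saving over 1000 individual checks comes from.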

Then the last problem: if you always submit the full CT, who is going to pay gas for unexecuted transactions (68 gas per byte of tx)? Plus the static overhead of BLS signature verification (200-300K gas).


Ah sorry I meant to say publish all the transactions and the merkle root in step 1, then the STARK in step 2.



I am not familiar with zk-SNARKs and have some basic questions. In this case, the verifier is a smart contract on the Ethereum blockchain, right? As far as I understand, there is a random sampling seed lambda which should be generated by the verifier and should not be exposed to the prover (relayer). If the verifier code is on the blockchain, how do we prevent relayers from accessing lambda and generating a fake proof?



This sounds much simpler and does not require zk-SNARKs, or am I missing something here?


Hey, guys. Has anyone already implemented this? We wanted to take it as a hackathon task for ETHSanFrancisco.

Vitalik's scaling without SNARKs at all

Hey, guys. Has anyone already implemented this? We wanted to take it as a hackathon task for ETHSanFrancisco.

So https://github.com/barryWhiteHat/roll_up is pretty close to what you will need to build it. All you have to do is add the sha256 compression in the EVM and the SNARK. I am happy to help remotely; we can do a call about it. If you join our gitter we can coordinate :slight_smile: We also have a bunch of people there who may want to join your effort.

We are also building our own proposal to add data availability guarantees outside the EVM but the core of what we need to build will be the same.


I was asking my colleagues and one of them pointed me to the following article:

So, a manual one-time setup process will solve the problem, or I guess the existing Zcash public parameters can be reused…

Edit: I figured it’s not possible to reuse zcash parameters…I am eager to join if we do something similar!


Yes. The main bottleneck in STARKs is an FFT on a domain of size O(N), where N = number of txs. As is well known, the FFT algorithm is parallelizable.
Btw, in a SNARK, for large instances, the FFT will be the main bottleneck as well.
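The parallelizability comes from the recursive even/odd split in Cooley-Tukey: the two half-size sub-FFTs have no data dependency on each other. A minimal pure-Python sketch (power-of-two sizes only, complex arithmetic standing in for whatever field the prover actually uses):

```python
import cmath

def fft(x):
    """Recursive Cooley-Tukey FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2])  # these two recursive calls are independent,
    odd = fft(x[1::2])   # so they can be dispatched to separate workers
    out = [0] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out
```

Each level of the recursion doubles the number of independent sub-problems, which is why prover farms can throw many machines at one large FFT.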

In addition to our talk in Devcon4, we are going to have a workshop outside of it. It will be held on the morning of Nov 1st. More details will be published at a later date.

Yes. We are actively working on reducing the proof size to make it reasonable to send on chain. We are already at the point where a STARK proof, even for a large batch of transactions, will fit in a block (transmission cost well under 8M gas). Now that the proof size is within reasonable limits (and will go down even further as we keep working on it), we can start utilizing STARKs for scaling up different applications on Ethereum.
We shall include a brief description of the current state of research in our first quarterly report to the Ethereum Foundation.


I am pretty sure that people will design something like a customized SNARK working with a customized crypto accumulator … Merkle trees and SHA-3 hash

Ok - I will come


Cool! What’s the best way to prepare for the workshop? Trying out https://github.com/elibensasson/libSTARK ? Playing around with TinyRAM asm?


I don’t think it is necessary to send the nonce to the blockchain, since the correct nonce is public knowledge and can be found in the Merkle tree. The user should, however, still include the nonce in the transaction that is signed. The relayer should prove with the zk-SNARK/STARK that the correct nonce was in the signed transaction.

In a similar manner we could shield end users from the concept of address indices.
The end user could send to and from Ethereum-like addresses that are based on their public key. The relayer would look up or create the correct address indices and send those indices to the blockchain. An added advantage is that relayers can then freely switch around the address indices for efficiency purposes, without bothering the end users.


True! We can further optimize other parameters by using broader context information:

to: most users only make transfers to a handful of addresses. Operators can maintain personal address books and add new addresses to them as needed. This can reduce the to parameter from three bytes to one byte in most cases.

amount: I guess that some amounts appear in transfers way more often than others. For example, $10.00 is likely to be used much more frequently than $13.87. Someone could analyze public transfer records and create a Huffman coding database of the most frequently used amounts (lookup being triggered by a special prefix in the amount field). Not sure what the gains will be, but a gut feeling tells me that at least one byte can be saved on average.

Maybe from can somehow be optimized based on ‘most recently used’ pattern. Ideas welcome.

This brings us to: 3 (from) + 1 (to) + 2 (amount) = 6 bytes on average, or 408 gas / tx.
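The Huffman idea for amounts above can be sketched with a standard heap-based construction; the frequencies below are entirely hypothetical placeholders, not real transfer statistics.

```python
import heapq

def huffman_code(freqs: dict) -> dict:
    """Build a Huffman code: frequent symbols get shorter bit strings.
    Heap entries are (total_frequency, tiebreak, {symbol: code_so_far})."""
    heap = [(f, i, {s: ""}) for i, (s, f) in enumerate(freqs.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        f1, _, c1 = heapq.heappop(heap)  # two least frequent subtrees
        f2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + code for s, code in c1.items()}
        merged.update({s: "1" + code for s, code in c2.items()})
        heapq.heappush(heap, (f1 + f2, counter, merged))
        counter += 1
    return heap[0][2]

# Hypothetical amount frequencies: round amounts dominate
codes = huffman_code({"10.00": 50, "13.87": 1, "5.00": 20, "1.00": 29})
```

The common amount ("10.00" here) ends up with the shortest codeword, which is exactly the property that would let frequent amounts fit in fewer calldata bytes.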

The zk circuit is getting pretty complex, so let’s assume we use the new STARKs and post a proof which consumes the entire block gas limit once per 4 blocks of pure data. This way we can process well over 1000 TPS (and much more when sharding is implemented, achieving the VISA rate).


With the following simple trick we can significantly lower the gas needed per transaction.

  1. We sort the transactions in such a way that similar transactions follow each other.
  2. We send the byte-for-byte difference compared to the previous transaction.

This will result in a lot of zeros when the transactions are very similar, saving a lot of gas (zero bytes cost only 4 gas instead of 68).

Note that most transactions are independent of each other, so with thousands of transactions per block, the relayer has a lot of freedom to minimize gas costs.
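A quick sketch of the trick, assuming fixed-size transactions and XOR as the byte-for-byte "difference" (the post doesn't pin down the exact diff operator; XOR is one natural choice since identical bytes become zero):

```python
def delta_encode(txs):
    """XOR each transaction against the previous one; identical bytes -> 0."""
    prev = bytes(len(txs[0]))  # the first tx is diffed against all zeros
    out = []
    for tx in txs:
        out.append(bytes(a ^ b for a, b in zip(prev, tx)))
        prev = tx
    return out

def calldata_gas(chunks):
    """Ethereum calldata pricing: 4 gas per zero byte, 68 per non-zero byte."""
    return sum(4 if b == 0 else 68 for chunk in chunks for b in chunk)
```

For example, three similar 4-byte transactions cost 688 gas raw but only 368 gas after delta encoding, since most diff bytes are zero.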

Extra parameter tricks:

from: Keeping the addresses sorted by the total number of sent/received transactions will ensure the most-used addresses are more similar. This will probably add too much complexity.
Maybe it is more feasible to just keep a smaller list of the most-used addresses during the last day/week.

to: Reserving a few addresses in the address space for the most used addresses will make transactions using the address book more similar.

amount: It is more likely that the amounts are in rounded decimal format (1 ETH, 0.5 ETH, 3.2 ETH, etc.). Storing the amount in base 2 will shuffle the lower bits if you add a round decimal to it. Using a decimal (base-10) mantissa will keep the rounded decimal numbers more similar. I think using 3 bytes including a 3-bit mantissa will do the trick.
Sending the total amount (minus fee) in the account should also get a fixed code.

fee: Fixed code for last used fee.
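The decimal-mantissa idea for amounts can be sketched as splitting a wei value into (mantissa, decimal exponent) with amount = mantissa * 10^exp; the exact bit widths are a design choice left open above, so this sketch just shows the factorization.

```python
def encode_amount(wei: int) -> tuple:
    """Split a wei amount into (mantissa, exp) with amount = mantissa * 10**exp.
    Round decimal amounts get small mantissas, so similar amounts produce
    similar bytes under a delta encoding."""
    exp = 0
    while wei and wei % 10 == 0:
        wei //= 10
        exp += 1
    return wei, exp

def decode_amount(mantissa: int, exp: int) -> int:
    return mantissa * 10 ** exp
```

For example, 1 ETH (10^18 wei) becomes the tiny pair (1, 18), and 0.5 ETH becomes (5, 17), whereas their raw base-2 representations share few low bits.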


Could you give some more information on this?

How long was the proving time? How much gas would it cost for the verification?


If multiple people are paying you at the same time, it is easy to mix up the payments. Most companies use an invoice number to avoid that problem. So I think we need something like a variable-length comment field, maybe starting with one byte for the size. When there is no comment, the zero byte adds only 4 gas. Using a comment field will be cheaper than registering a new account for every invoice; besides that, it will save memory in the state tree when there are no throw-away accounts.

For the end user, the registration seems like the most expensive part. Being able to do it via the relayer is an obvious way to save a main-chain transaction. So for example, whenever the relayer receives a transfer to a previously unknown address, he puts an unused address index in the to field and puts the 20-byte address in the comment field. So for 21*68 = 1428 gas extra you can register a new account.