On-chain scaling to potentially ~500 tx/sec through mass tx validation

With the baby jubjub curve, it might actually not be that bad.

2 Likes

Checking is a signature is valid for eddsa (with baby jubjub curve) takes about 200k constaints in roll_up, which is maybe 30 seconds to build a proof on a decent laptop. This can most likely be brought down quite a bit still however. Using secp2561curve will be more computationally demanding

400k seems incredibly high. Why is that? If we use schnorr verifying a signature is just two multiplications, an addition and a hash, so on average ~769 additions and a hash, so it seems like the hash would be the largest part of it.

Sorry, I meant 200k, I just edited previous comment. Roll-up currently uses full-rounds of SHA256, so we could cut the number of contraints here almost by half for the hash function. Otherwise, there are a lot of checks done currently that might not be necessary.

Here’s where the contraints are built for the signature validation ;

FWIW I do agree this whole thing is expensive in terms of prover time, though given that I expect the relayers will be GPU farms so it’s less of an issue than it is in, say, zcash where regular nodes need to be able to make proofs in a few seconds.

In the case where you use a GPU farm to perform the proving step do you also need a GPU farm to perform the trusted setup? How do the two operations differ in complexity?

400k seems incredibly high. Why is that? If we use schnorr verifying a signature is just two multiplications, an addition and a hash, so on average ~769 additions and a hash, so it seems like the hash would be the largest part of it.

I use sha256 in a bunch of places because i was unsure what was safe to combine points or remove discard data. But when i optimize it we should be able to come down quite a lot. Its POC at the moment.

1 Like

The correct sum is slightly higher: 68 * 3 (from) + 68 * 3 (to) + 68 * 1 (fee) + 68 * 4 * 2 (amount) + 68 * 2 (nonce), or 1156 gas. A more realistic assumption for the foreseeable future is 1000 tx per SNARK (according to very optimistic calculations we had with the rollup team). So it adds roughly 650 overhead, bringing us to ~1.8k gas per tx.

A batched transfer of tokens with the same parameters (32 bit value, 24 bit address) will cost around 18k gas per transfer. So it’s 10x, not 50x.

10x gas reduction is an order of magintude improvement, of course, but it has a relatively strong cap, and the economic overhead imposed by the SNARK proof computation is also very signficant at the moment. Especially the need to compute hashes optimized for verification in EVM.

Solving the data availability problem through relying on the Ethereum root chain not as 100% data availability guarantor, but rather as the court of the final appeal seems a lot more promising direction, because: 1) it can scale almost indefintely, 2) it simplifies the SNARK circuit, making transfers cheaper and a working MVP a more attainable goal. See our discussion in the roll_up github.

Why? The amount is a single value.

A batched transfer of tokens with the same parameters (32 bit value, 24 bit address) will cost around 18k gas per transfer. So it’s 10x, not 50x.

10x relative to batched transfers. The status quo does not involve batched transfers in most cases; in the 50x I’m including efficiency gains both from SNARK validation and from compression and a batching relay protocol.

Solving the data availability problem through relying on the Ethereum root chain not as 100% data availability guarantor, but rather as the court of the final appeal seems a lot more promising direction, because: 1) it can scale almost indefintely, 2) it simplifies the SNARK circuit, making transfers cheaper and a working MVP a more attainable goal. See our discussion in the roll_up github

I think both directions are valuable; using the ethereum chain as a data availability guarantor makes many other things simpler, and in the long run scalable “sharded” ethereum is basically being designed as a data availability guarantor first and foremost so it’s a quite sustainable long-term direction (see layer 2 execution engines as described here; what I’m describing here is in some sense a candidate for the first layer 2 execution engine). My main short-term concern with Plasma-like constructions is that they depend on specific operators to stay online, and Plasma is inherently non-universal in some respects because of the extra game theoretic requirements.

2 Likes

What I did not understand is how is the correct order of transactions in a batch actually enforced?

What I did not understand is how is the correct order of transactions in a batch actually enforced?

The SNARK proves that each transaction is valid in its position in the batch.

True, 50x compared to the status quo. Now, batched token transfers are easy to implement, they will soon be widely used. So when deciding which direction to go first with data availability, it’s only fair to compare with them, because any sidechain tx is necesserily a batched tx.

Are there any rough estimates on when EWASM is going to be released? As guys pointed out above, 10x improvement is only possible when using sha3, which requires prohibitatively high number of constraints in the SNARK proof.

Are there any rough estimates on when EWASM is going to be released? As guys pointed out above, 10x improvement is only possible when using sha3, which requires prohibitatively high number of constraints in the SNARK proof.

Maybe a precompile for MIMC is in order?

The security analysis is definitely not complete, but it is simple enough that a wide parametrization could be specified.

1 Like

Nice! I know ZCash has been waiting for more in depth peer review as well for MiMC, any ETA? What about a precompile for Pedersen in the meantime :D?

1 Like

Maybe a precompile for MIMC is in order?

It’s cheap enough in EVM with a reduced number of rounds that it’s a non-issue at the moment. Although the recommended 160 rounds per input would be more taxing.

However, there may be other algorithms with better security properties which are just as cheap both in-circuit and in-EVM (addmod, mulmod and modexp with a small exponent are all relatively cheap, as long as there aren’t any big loops).

I’m still looking into other candidates, doing more research, and trying to determine specifics of security properties, but more people looking into this would definitely be good - as the ‘cheap in EVM and also cheap in-circuit with a ZKSNARK’ is a tough requirement.

2 Likes

This slightly confused me, did you want to say the main chain will be ensuring data availability for the L2 chains and systems (e.g. Plasma) in future? Can you please elaborate? Thanks!

Read the section on “layer 2 execution engines” here: https://vitalik.ca/general/2018/08/26/layer_1.html

1 Like

I would like to assist with this; is there a GitHub repo for this idea that includes the contract and off-chain code that maintains the branches of tx? Once there is a repo, we can try to run a little MVP and see how it does – probably totally separate from any eth network at first and then paired with ropsten.

2 Likes

I would like to assist with this; is there a GitHub repo for this idea that includes the contract and off-chain code that maintains the branches of tx?

We are working towards a PoC for non-fungible tokens at:

But, there is a lot of interest in making it work with accounts and balances, although that comes with many more edge cases and complexities compared to non-fungible tokens. At the moment the aim seems to be ‘keep it simple, and deliverable’.

Firstly, I want to say I fully support everything you wrote in the post , I’m almost convinced that is the best (if not the only) way for Ethereum to achieve its full potential. Btw, I see these future L2 execution engines as more or less the same thing as Plasma chains, am I right?

However, my understanding is that, in this future setup, L1/main chain should have two functions/purposes:

1. Process disputes/exits
2. Guarantee L2 checkpoints/roots/headers availability and ordering (not the availability and ordering of all the data)

I fail to understand how L1 can ensure availability of all the L2 data (like proposed here or in the roll_up’s current implementation)? One can imagine a future where we have thousands of L2 chains/engines with millions of Tx/s cumulatively, can this all be on the main chain?

Btw, I see these future L2 execution engines as more or less the same thing as Plasma chains, am I right?

No, they’re quite different. An L2 execution engine is basically an alternative way of interpreting data that is published to the main chain, with two-way convertibility between assets inside and outside this alternate universe-inside-the-same-universe enforced by a smart contract. This is not quite an L2 execution engine but it’s fundamentally similar.

Plasma relies on the great majority of data being outside the chain, and also relies on liveness assumptions, exit mechanisms, etc, to ensure some notion of safety in the event that whoever publishes the data commitments to chain starts publishing bad commitments.

There’s two very distinct classes of constructions here, with different properties; I think both are valuable.

2 Likes

Oh, so those will coexist. Interesting, thank for the clarification.

I believe there is a typo somewhere here, it’s not clear what are you explaining (L2 engines or Plasma chains), and it’s really funny because in my head this description fits both.

Oh, I see, the main difference is that L2 engine publishes all the data (all transactions) on the chain (just like in this proposal and in roll_up), while Plasma keeps most of the data off-chain, right?