A meta-execution environment for cross-shard ETH transfers

See Moving ETH between shards: the problem statement for the problem this is trying to solve.

We can create an execution environment, which we’ll call “the fee environment”, whose primary role will be storing ETH balances held by other execution environments. This execution environment will contain a minimal balance-holding and cross-shard receipt system, allowing ETH to be moved between shards quickly.

Prerequisites

We assume that a block contains (i) data, and (ii) a list of execution environments that access the data. The EEs in the list are called in sequence, and all have access to the entire data (before, we had each EE only have access to its own part of the data, but there’s no cost to allowing each EE to have access to the entire block data). EE execution also has access to “auxiliary outputs” generated by EEs that have already been processed.

We assume block producers all “understand” the fee environment, in the sense that they understand when they receive coins in it.

Expected block structure

Every block will contain one or more segments that are each dedicated to an execution environment, containing transactions inside of that execution environment. The execution environments themselves make an auxiliary output representing a summary of the cross-shard transfers they want to make; multiple transfers to the same (shard, EE) pair are batched.

At the end of a block, there is a segment dedicated to the fee environment, which includes a Merkle multiproof of all the EEs for which there existed segments (note that those prior EEs themselves would peek into this segment and verify its correctness and the sufficiency of balances [the sufficiency check would require EEs after the first to also peek into auxiliary outputs of previous EEs]). The fee environment would issue a state root that contains the Merkle tree with the updated balances within the shard, as well as a receipt root that contains all of the cross-shard transfers. The receipt root would be designed so that the path encodes a shard+EE combo, and the leaf encodes the amount to be transferred.

For example, if a block on shard A contains a single transaction, [Alice ---{50 ETH}---> Bob], where Bob is on another shard B, two things would be produced: (i) an EE-specific receipt that allows Bob to claim the ETH, and (ii) an auxiliary output from the EE, {B, EE_id, 50}. The fee environment would then reduce the balance of {A, EE_id} by 50, publish the updated root as its new state root, and make a receipt tree with a single element, {key=(B, EE_id), value=50}. If there are N transactions in the same EE, only N+1 branches would be produced; the overhead is only constant.
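The send-side bookkeeping in this example can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: plain dicts stand in for the Merkle trees behind the state root and receipt root, amounts are integers, and the names `process_send_side`, `balances` and `aux_outputs` are made up for the sketch rather than taken from any spec.

```python
from collections import defaultdict

def process_send_side(balances, shard, aux_outputs):
    """Apply the EEs' auxiliary outputs for one block on `shard`.

    balances:    dict mapping (shard, ee_id) -> balance (stand-in for the state tree)
    aux_outputs: dict mapping ee_id -> list of (dest_shard, amount) transfers
    Returns (new_balances, receipts); receipts maps (dest_shard, ee_id) -> amount
    and stands in for the receipt tree.
    """
    new_balances = dict(balances)
    receipts = defaultdict(int)
    for ee_id, transfers in aux_outputs.items():
        total = sum(amount for _, amount in transfers)
        # Sufficiency check: the EE must hold enough ETH on this shard.
        if new_balances.get((shard, ee_id), 0) < total:
            raise ValueError(f"EE {ee_id} has insufficient balance on shard {shard}")
        new_balances[(shard, ee_id)] -= total
        # Transfers to the same (shard, EE) pair are batched into one leaf.
        for dest_shard, amount in transfers:
            receipts[(dest_shard, ee_id)] += amount
    return new_balances, dict(receipts)
```

For instance, calling it with `{("A", 7): 120}` and `{7: [("B", 50), ("B", 20), ("C", 10)]}` leaves EE 7 with 40 on shard A and yields two receipt leaves, one per destination.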

Receiving receipts

Now, let us extend the above scheme to allow receipts to also be received. The fee environment’s state now contains a third root, a bitfield root, a Merkle root of a bitfield, storing for each shard and each slot whether or not a receipt from that slot has already been processed. For efficiency, we will likely want to put the bits corresponding to the same slot beside each other, so we only need one Merkle branch in the “happy case” where receipts are processed immediately.

The section in the block dedicated to the fee environment would also declare which branches of this bitfield it reveals. For every branch it reveals where the bit is a 0, the Merkle proof of the pre-balances would also need to reveal the corresponding EE balance, and the fee environment would also need to provide a Merkle proof of the corresponding receipt. The updated balance root would reflect the EE balance increased by the value in the receipt, and the bit would be flipped to a 1.

Meanwhile, the execution environments themselves would enforce a rule that accepting the receipt from Alice to Bob is only valid if the corresponding Merkle branch showing that the EE-level ETH transfer has completed is in the fee environment segment. This prevents Bob from receiving the ETH inside the EE without the execution environment on that shard receiving the ETH. Note that sometimes receipts with the same start and destination shard created in the same slot would be received in different slots. In this case, a proof to the fee EE bitfield showing inclusion of the receipt in slot n would have to be included again, and in all inclusions after the first the bitfield would already be set to 1 so the EE-level balance increment would not happen multiple times.
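A minimal sketch of the receive side, again with plain Python containers standing in for the Merkle-committed bitfield and balances. The layout `slot * NUM_SHARDS + source_shard` is one way to keep bits of the same slot adjacent; it, the single-EE simplification, and all names below are assumptions made for illustration.

```python
NUM_SHARDS = 64  # assumed shard count, matching the overhead analysis

def bit_index(slot, source_shard):
    # Bits for the same slot sit next to each other, so receipts processed
    # right away touch one contiguous stretch of the bitfield.
    return slot * NUM_SHARDS + source_shard

def process_receipt(balances, claimed_bits, dest_shard, ee_id, slot, source_shard, amount):
    """Credit the EE-level balance at most once per (slot, source_shard).

    claimed_bits is a set of bit indices that are already 1.
    Returns True if the balance was incremented, False if the bit was
    already set and the receipt is only being re-proven.
    """
    idx = bit_index(slot, source_shard)
    if idx in claimed_bits:
        return False  # already processed: no second increment
    claimed_bits.add(idx)  # flip the bit to 1
    key = (dest_shard, ee_id)
    balances[key] = balances.get(key, 0) + amount
    return True
```

Including the same receipt a second time returns `False` and leaves the balance untouched, which mirrors the "in all inclusions after the first" case above.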

Fees and cashing out

Execution environments would also be able to, in their auxiliary outputs, specify how much ETH they are paying as a transaction fee. Validators are also able to have accounts in the fee environment, and can collect this ETH. The fee environment additionally allows validators to withdraw their ETH.

Overhead analysis

Assuming a relatively full and active chain, with ~3 EEs being executed per block, and with every shard sending transactions to every other shard, we can expect the following overhead:

  • Balance proof: 3 branches, assuming 2^{16} EEs that’s ~15 * 3 = 45 hashes or 1440 bytes
  • Bitfields: assume that, on average, all receipts are claimed within 5 slots. This would require a proof of all shards in the last 5 slots; these bits are contiguous, so with 64 shards this would only be 64 * 5 = 320 bits (40 bytes) for the bitfield segment + a single 1024 byte branch = 1064 bytes. A single receipt N slots behind would add log(N) * 32 bytes.
  • Transfer value proofs: assuming 2^{16} EEs and 64 shards, that’s a length-22 Merkle branch to represent a (shard, EE) key. There are 64 of these, so we get 22 * 64 = 1408 hashes or 45056 bytes. Note that we can try to be clever and put commonly-used EEs higher up in the tree, cutting this by more than half (to under ~22.5 kB).

Hence, in total, ~10-25 kB of each shard block would be filled with proofs associated with the fee environment.
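For anyone who wants to vary the parameters, the branch-size arithmetic is easy to script. This is a rough sketch that assumes 32-byte hashes and ignores any sharing of nodes between branches; the function and variable names are made up for illustration.

```python
HASH_BYTES = 32  # assumed hash size

def branch_bytes(depth, branches):
    # Unoptimized cost: every branch carries `depth` sibling hashes.
    return depth * branches * HASH_BYTES

# Balance proof: ~15 hashes per branch, 3 EEs per block.
balance_proof_bytes = branch_bytes(15, 3)

# Depth of a tree keyed by (shard, EE) with 2^16 EEs and 64 shards.
key_depth = (2**16 * 64).bit_length() - 1

print(balance_proof_bytes, key_depth)
```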

Update

See comment for how to replace bitfields with nonces.

I was thinking about using a rollup for this. The rollup would also contain a simple state transition logic to manage different “remote” calls (success/failure). The user can prelock the maximum gas on the rollup (like in what we proposed in Trustless and secure cross zk-Rollup transfer protocol) and pay once the real cost of the call is known. Exploring this is in our todo list.

The problem with making it a rollup is that then everything that depends on it would also be dependent on the rollup’s optimistic state, and so whenever the rollup reverts due to a fraud proof, everything else would have to revert as well. So you would not be able to make non-optimistic applications.

Or do you mean a ZK rollup? In that case I would be concerned with making ZK-SNARKs and all their assumptions, in addition to ZK-SNARK proving infrastructure, essentially mandatory at the base layer…

Have you thought about using atomic swaps for this? I think it might be possible to prevent the griefing factor by relying on the beacon chain’s randomness. But the tradeoff would be increased communication costs.

Yes, I meant a zk rollup. Agreed, it can’t be layer 1. However, the intuition (yet to be explored…) is that it allows a slow or minimalist layer 1 if a layer 2 can do the job efficiently.

Have you thought about using atomic swaps for this?

Reliance on an atomic swap layer is a feature of the old crosslink-once-per-epoch model that we are trying to get away from :grinning:

These conversations mainly describe the portion of absorbing funds into the appropriate EE for cross shard transactions and are not user/account centric. On the user/account side, you’d still need to show your EE specific receipt and track its own bitfield.

Questions:

  • This model basically replaces the “operating system” model you stated in Eth 2 Shard Simplification Proposal. It skews to the same behavior, but instead does not enshrine it into the consensus layer, putting it into an EE that validators/block producers understand?
  • How does this affect the fee market for transactions? If I’m the first user to want to use the funds from one of the cross shard calls and absorb it into my account within the EE, then I’d be responsible for the fee/proof for the receipt/bit and also for my individual account and transaction. I’d have to use more gas than subsequent users who can take advantage of the already revealed bit/inclusion.
  • Should the claim segment be dedicated at the end or beginning of the block?
  • This requires all EEs to understand this meta-execution environment (and the claim receipts/send)?
  • How might this apply to other assets?
  • I still wonder if there might be a model in between that takes advantage of the VHEE/Generic Asset EE discussed? I see that they sort of fundamentally solve different things, however, a generic asset EE that is recognized and EEs standardize around could actually make it less necessary to have balances within a lot of the EEs and just have them communicate/utilize this. It feels like we’re starting to move in this direction, but not there fully. hmmm, curious your thoughts.
  • In this model, EEs would still have to understand cross-EE receipts. For example, EE1 would need to read the generated receipt from EE2 to absorb the balance into the user’s account. If we start veering more to a model in the point above, we could actually avoid this complexity. Thoughts?

Finally, I think the most interesting question is: what are the main benefits of putting this into an EE instead of making it part of the consensus layer? What are the pros and cons of this approach?

Also, what are the advantages of having shard-balances vs. global EE balances (this approach is new to the Eth 2 Shard Simplification Proposal)? If most of the cross-shard activity is within 1 EE, wouldn’t it be simpler to have global EE balances (vs. shard-specific ones)?

The main idea here seems to be a receipt bitfield. A bitfield to prevent double-spending of receipts was part of the plan in the original phase 2 proposal 2.

More recently, Implementing cross-shard transactions suggested using receipt nonces instead of a bitfield. The receipt nonce idea was also mentioned in the problem statement in solution #5 (#5 “removes the need for nontrivial amounts of state to store which receipts have been claimed and which have not; we just have a ‘next unclaimed receipt ID’ ticker”).

This seems to combine a receipt bitfield with solution #4 (“Enshrine one EE for ETH / asset holdings that everyone is required to understand”).

Was there an issue with using receipt nonces, or why did you go back to proposing a bitfield?


Nomenclature nitpick: “meta-execution environment” adds more confusion than clarity imo, I don’t see any way to distinguish it from the protocol (ditto for “enshrined EE”); e.g. everywhere it said “fee environment” I read “shard state” (as in, validator balances and EE balances are fields in the shard state). Maybe I see “enshrined meta-EE” as a fancy term for the protocol because I define EE’s as user-deployed wasm code (this definition of an EE versus the protocol is one way to arrive at an answer to “who can deploy EE’s?”).

The main challenge with nonces is that with nonces you can force other people to include your branches, which requires a more complicated gas model to deal with. This more complicated gas model must be part of the base protocol. With bitfields, you can keep the “sender pays for the receiver side” approach.

Maybe nonces can be made to work, as there’s a maximum of one receipt per shard per slot so there’s no risk of overload, but there would still be complexity…

To illustrate my fear more specifically, consider a case where transfers from shard A to shard B for some specific (A, B) pair almost never happen for whatever reason. Some bad actor makes 1000 transfers from shard A to shard B in 1000 different slots for 0.0000000000001 ETH each, and never bothers to claim the receipts. Shard B now has a backlog of 1000 proofs that no one bothers to include. Then, someone makes a legit transfer from shard A to shard B. They are now personally stuck paying gas for 1001 proofs.

OK, I think I cracked how to do it nonce-based.

Basically, every shard remembers a “last nonce sent” and “last nonce received” for every other shard. To send a receipt from shard A to shard B, if shard[A].last_nonce_sent_to[B] = n you need to provide a receipt from shard B proving that shard[B].last_nonce_received_from[A] >= n-2. So the sent and received branches have to be almost perfectly in sync, and if someone skips receiving then the next person to send has to cover for them.
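Here is a rough Python sketch of that rule. It assumes nonces start at 0 and increase by one per receipt, that receipts are received in order, and that the proof from shard B is reduced to a plain integer `proven_received`; the class and function names are hypothetical.

```python
MAX_LAG = 2  # slack taken from the ">= n-2" condition above

class ShardNonces:
    def __init__(self):
        self.last_nonce_sent_to = {}        # other shard -> last nonce sent
        self.last_nonce_received_from = {}  # other shard -> last nonce received

def try_send(shards, a, b, proven_received):
    """Send a receipt from shard a to shard b.

    proven_received stands in for a proven value of
    shards[b].last_nonce_received_from[a]. Returns the new nonce,
    or None if the receiving side is too far behind.
    """
    n = shards[a].last_nonce_sent_to.get(b, 0)
    if proven_received < n - MAX_LAG:
        return None  # someone has to catch up on receiving first
    shards[a].last_nonce_sent_to[b] = n + 1
    return n + 1

def receive(shards, a, b, nonce):
    """Receive on shard b the receipt with this nonce from shard a (in order only)."""
    expected = shards[b].last_nonce_received_from.get(a, 0) + 1
    if nonce != expected:
        return False
    shards[b].last_nonce_received_from[a] = nonce
    return True
```

With nothing received yet, three sends from A to B go through and the fourth is refused until shard B has received at least nonce 1, so the backlog a sender can be forced to cover stays bounded.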

I think this design is simpler and lets us get rid of bitfields.

Yeah I agree the terminology can be improved. Any of this stuff can just as easily be enshrined in-protocol; it’s just that doing it at EE level makes it more upgradeable (you don’t need a hard fork, you can just create a new thing and have users slowly migrate to the new thing over time).

Also, I came up with something even simpler today:

I guess if it is an EE, meta-EE would still not be a great name. To me it has the connotation that it is something like an EE that itself contains EEs. But rather, it is a very simple EE that everyone agrees is responsible for some basic payment infrastructure. “EE balance EE”? “Fee payment EE”?

Vitalik is describing a system that is transferring value between EEs (EE level shard balance). It still does not define how to absorb value into a user account or for a fee payment. That would require a separate bitfield. If it’s between two EEs, ee1 & ee2, ee1 would still need to understand ee2’s receipts. This piece is orthogonal.

This flexibility (what is proposed here) is solid, but also I am advocating for a VHEE or Fee Payment EE to reduce the use of this infrastructure. EEs can communicate with it synchronously and only need to understand it. This means ee1 & ee2 only need to understand the VHEE or fee payment EE. They don’t need to understand each other (which is great since otherwise you would need to define a way for ee1 and ee2 to communicate at the account/user level. This circumvents that). Without this, a new ee1 couldn’t communicate with a new ee that is deployed after it.

I will write a post on this in a few days to be clearer - but the proposal that is written here, does not define how to absorb value into a user account or fee payment - that is an orthogonal problem and I believe we can simplify all of this by defining a fee payment/VHEE.

does not define how to absorb value into a user account or fee payment

Ah yes, I forgot to define the fee payment part. The simplest thing to do would be to just have balances for every validator on every shard, and treat fee payments the same as cross-EE transfers within the same shard.

I think you can do cross-shard cross-EE transfers by just combining them into a cross-shard intra-EE transfer and an intra-shard cross-EE transfer. So on shard A you make a receipt in EE1, and then on shard B you claim the receipt in EE1 and do an atomic swap between EE1 and EE2 within the same transaction.