Phase 2 pre-spec: cross-shard mechanics

#1

THIS IS A WORK IN PROGRESS!

The goal of this post is to provide a rough outline of what phase 2 might look like, to help make the discussion about state and execution more concrete as well as to give us an idea of the level and types of complexity that would be involved in implementing it. This section focuses on withdrawals from the beacon chain and cross-shard calls (which use the same mechanism); rent is unspecified but note that a hibernation can be implemented as a forced cross-shard call to the same shard.

Topics covered:

  • Addresses
  • Cross-shard receipt creation and inclusion
  • Withdrawing from the beacon chain

Addresses

The Address type is a bytes32, with bytes used as follows:

[1 byte: version number] [2 bytes: shard] [29 bytes: address in shard]

There are many choices for how to encode an address when presenting it to users. One simple option is: version number - shard ID as number - address as mixed hex, e.g. 0-572-0DF5283B84D83637e3E6AAC675cE922d558b296e8B11c43881b3f91484. Note that implementations may choose to treat an address as a struct:

{
    "version": "uint8",
    "shard": "uint16",
    "address_in_shard": "bytes29"
}

Because SSZ encoding for basic tuples is just concatenation, this is equivalent to simply treating Address as a bytes32 with the interpretation given above.
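
To make the equivalence above concrete, here is a minimal Python sketch of the round trip between the struct view and the bytes32 view. The class and method names are hypothetical. It assumes the uint16 shard field is serialized little-endian (as SSZ does for integers), and its display helper prints plain lowercase hex rather than the mixed-case hex mentioned above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Address:
    version: int             # uint8
    shard: int               # uint16
    address_in_shard: bytes  # bytes29

    def to_bytes32(self) -> bytes:
        # Fixed-size fields are simply concatenated: 1 + 2 + 29 = 32 bytes.
        if not 0 <= self.version < 2**8:
            raise ValueError("version must fit in uint8")
        if not 0 <= self.shard < 2**16:
            raise ValueError("shard must fit in uint16")
        if len(self.address_in_shard) != 29:
            raise ValueError("address_in_shard must be 29 bytes")
        # Assumption: little-endian uint16, following SSZ integer encoding.
        return (bytes([self.version])
                + self.shard.to_bytes(2, "little")
                + self.address_in_shard)

    @classmethod
    def from_bytes32(cls, raw: bytes) -> "Address":
        if len(raw) != 32:
            raise ValueError("expected 32 bytes")
        return cls(raw[0], int.from_bytes(raw[1:3], "little"), raw[3:])

    def display(self) -> str:
        # version - shard as number - address as hex (lowercase here).
        return f"{self.version}-{self.shard}-{self.address_in_shard.hex()}"
```

For example, Address(0, 572, ...).to_bytes32() yields a 32-byte value whose first byte is the version and whose next two bytes are the shard ID.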

Cross shard receipts

A CrossShardReceipt object contains the following fields:

{
    "target": Address,
    "wei_amount": uint128,
    "index": uint64,
    "slot": SlotNumber,
    "calldata": bytes,
    "init_data": InitiationData
}

InitiationData is the following:

{
    "salt": bytes32,
    "code": bytes,
    "storage": bytes
}

Note that in each shard there are a few “special addresses” relevant at this point:

  • CROSS_SHARD_MESSAGE_SENDER: 0x10 - has two functions:
    • Regular send: accepts as argument (i) target: Address, (ii) calldata. Creates a CrossShardReceipt with arguments: target=target, wei_amount=msg.value, index=self.storage.next_indices[target.shard] (incrementing self.storage.next_indices[target.shard] += 1 after doing this), slot=current_slot, calldata=calldata, init_data=None.
    • Yank: accepts as argument target_shard: ShardNumber. Creates a CrossShardReceipt with target=Address(0, target_shard, msg.sender), wei_amount=get_balance(msg.sender), index=self.storage.next_indices[target_shard] (incrementing self.storage.next_indices[target_shard] += 1 after doing this), slot=current_slot, calldata='', init_data=InitiationData(0, get_code(msg.sender), get_storage(msg.sender)). Deletes the existing msg.sender account.
  • CROSS_SHARD_MESSAGE_RECEIVER: accepts as argument a CrossShardReceipt, a source_shard and a Merkle branch. Checks that the Merkle branch is valid and is rooted in a hash that the shard knows is legitimate for the source_shard, and checks that self.current_used_indices[source_shard][receipt.index] == 0. If the slot is too old, requires additional proofs to check that the proof was not already spent (see Cross-shard receipt and hibernation/waking anti-double-spending for details). If checks pass, then executes the call specified; if init_data is nonempty and the target does not exist, instantiates it with the given code and storage.
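
The index bookkeeping and replay check described above can be sketched roughly as follows. This is a simplified model, not the actual contracts: SHA-256 stands in for the unspecified hash function, the receipt is reduced to three fields, consumed indices are kept in a set rather than a bitfield, and the extra proofs for old slots are left out. All class and function names are hypothetical.

```python
import hashlib
from collections import defaultdict
from dataclasses import dataclass


def h(data: bytes) -> bytes:
    # Placeholder hash; the post does not fix a hash function.
    return hashlib.sha256(data).digest()


def verify_branch(leaf: bytes, branch: list, position: int, root: bytes) -> bool:
    # Standard binary Merkle check: fold the leaf upward with its siblings.
    node = leaf
    for sibling in branch:
        node = h(sibling + node) if position & 1 else h(node + sibling)
        position >>= 1
    return node == root


@dataclass(frozen=True)
class Receipt:
    target_shard: int
    index: int
    payload: bytes

    def leaf(self) -> bytes:
        return h(self.target_shard.to_bytes(2, "little")
                 + self.index.to_bytes(8, "little")
                 + self.payload)


class Sender:
    def __init__(self):
        self.next_indices = defaultdict(int)  # target shard -> next index

    def send(self, target_shard: int, payload: bytes) -> Receipt:
        index = self.next_indices[target_shard]
        self.next_indices[target_shard] += 1  # increment after use
        return Receipt(target_shard, index, payload)


class Receiver:
    def __init__(self):
        self.known_roots = defaultdict(set)   # source shard -> trusted roots
        self.used_indices = defaultdict(set)  # source shard -> spent indices

    def receive(self, receipt, source_shard, branch, position, root) -> bool:
        if root not in self.known_roots[source_shard]:
            return False  # root is not known to be legitimate
        if not verify_branch(receipt.leaf(), branch, position, root):
            return False  # invalid Merkle branch
        if receipt.index in self.used_indices[source_shard]:
            return False  # already processed
        self.used_indices[source_shard].add(receipt.index)
        return True       # the specified call would be executed here
```

Because each (source shard, index) pair can be consumed only once, submitting the same receipt a second time is rejected even if the Merkle branch is valid.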

Withdrawal from the beacon chain

A validator that is in the withdrawable state has the ability to withdraw. The block has a withdrawals field that contains a list of all withdrawals that happen in that block, and each withdrawal is a standardized CrossShardReceipt object.

A CrossShardReceipt created by the beacon chain shard will always have the following arguments:

{
    "address_to": Address(0,
                          dest_shard,
                          hash(salt + hash(init_storage) + hash(code))[3:]),
    "wei_amount": deposit_value,
    "index": state.next_indices[dest_shard],
    "slot": state.slot,
    "calldata": "",
    "init_data": InitiationData(
        "salt": 0,
        "code": init_code,
        "storage": init_storage
    )
}

Where dest_shard, salt, init_storage, init_code are all chosen by the withdrawing validator. state.next_indices[dest_shard] is then incremented by 1. This receipt can then be processed by the CROSS_SHARD_MESSAGE_RECEIVER contract just like any other cross-shard receipt.
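
As an illustration of the address formula in the receipt above, here is a small Python sketch that derives the 32-byte target. It assumes SHA-256 as the hash function (the post does not specify one), reads "+" as byte concatenation, and assumes a little-endian shard field; withdrawal_address is a hypothetical helper name.

```python
import hashlib


def h(data: bytes) -> bytes:
    # Placeholder hash; the post does not fix a hash function.
    return hashlib.sha256(data).digest()


def withdrawal_address(dest_shard: int, salt: bytes,
                       init_storage: bytes, code: bytes) -> bytes:
    if len(salt) != 32:
        raise ValueError("salt must be 32 bytes")
    # hash(salt + hash(init_storage) + hash(code))[3:] keeps the last 29 bytes.
    address_in_shard = h(salt + h(init_storage) + h(code))[3:]
    # Address(0, dest_shard, address_in_shard) laid out as bytes32.
    return bytes([0]) + dest_shard.to_bytes(2, "little") + address_in_shard
```

Dropping the first 3 bytes of the 32-byte digest leaves exactly the 29 bytes available for the address-in-shard part.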

Transactions

A transaction object is as follows:

{
    "version": "uint8",
    "gas": "uint64",
    "gas_max_basefee": "uint64",
    "gas_tip": "uint64",
    "call": CrossShardReceipt
}

Executing a transaction is simply processing the call, except a transaction (or generally, any call that is not directly an execution of an actual cross-shard receipt) can only create an account at some address if it has the salt such that target = hash(salt, hash(code) + hash(storage)). Note that a transaction simply specifies a call to an account; it’s up to the account to implement all account security logic.
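
The account-creation rule above could be checked along these lines. This is only a sketch under assumptions: SHA-256 stands in for the hash, hash(salt, hash(code) + hash(storage)) is read as hashing the concatenation of the three parts, and the result is truncated to 29 bytes to fit the address-in-shard field. The function names are hypothetical.

```python
import hashlib


def h(data: bytes) -> bytes:
    # Placeholder hash; the post does not fix a hash function.
    return hashlib.sha256(data).digest()


def expected_address(salt: bytes, code: bytes, storage: bytes) -> bytes:
    # Assumed reading of hash(salt, hash(code) + hash(storage)), cut to 29 bytes.
    return h(salt + h(code) + h(storage))[3:]


def may_create(target_in_shard: bytes, salt: bytes, code: bytes,
               storage: bytes, from_receipt: bool) -> bool:
    if from_receipt:
        # Actual cross-shard receipts (e.g. yanks) may instantiate the
        # target at an arbitrary address.
        return True
    # Transactions and ordinary calls must match the salted hash.
    return target_in_shard == expected_address(salt, code, storage)
```

The from_receipt flag corresponds to knowing whether the call is directly an execution of a cross-shard receipt, which is the only case where the address is not constrained by the salt.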

Not covered:


#2
{
    ...
    "init_data": InitiationData
}

Is there an important reason to keep contract creation data as its own field? In the spirit of abstraction, the other option would be having a contract on each shard that accepts calldata with the encoded input (salt, code, storage) - and not have an "init_data" field at all. This also allows contracts to create contracts on other shards using the regular send function in the CROSS_SHARD_MESSAGE_SENDER contract.

On the other hand, if there are benefits to keeping these separate, then it might make sense to type the CrossShardReceipt object more strongly. It seems like "calldata" and "init_data" are mutually exclusive in the current spec - so these might be better as different datatypes.

CrossShardReceipt should probably be renamed - it’s not cross-shard or a receipt. What about BaseCallData?

So, to confirm, the salt is concatenated with the hash of the code and the storage, and this is all hashed? Just wondering about the differing notations.


#3

Yeah, this is possible too. The only challenge is that the cross-shard receipt mechanism needs to be able to create contracts at arbitrary addresses, but transactions should not be able to do that, and so if we create from inside a contract, we would have to pass the information about whether the “ultimate source” of the instruction is a transaction or a cross-shard receipt.

CrossShardReceipt should probably be renamed - it’s not cross-shard or a receipt. What about BaseCallData?

Sounds good to me!

So, to confirm, the salt is concatenated with the hash of the code and the storage, and this is all hashed? Just wondering about the differing notations.

Yes. Or more precisely, to create a new contract the salt is concatenated with the code and the hash of the initial storage.

Note that this removes the ability to dynamically run init code; if we want we can add that back in, but the reason I don’t have it in at the moment is that it would create further divisions between “creating” via yanking (no init code should be run) and creating an actually new contract.


#4

To clarify, yanking a contract keeps the [1 byte: version number] and [29 bytes: address in shard] the same, but changes the [2 bytes: shard].

Note that this removes the ability to dynamically run init code

Question for any EVM experts out there (I’m not one) - what do we lose by not having dynamically run init code? Are there applications today that rely on having this, that aren’t pathological examples?

As a side note, having the version field represent which VM actually runs this contract’s code seems like a nice way of versioning the VM.


#5

To clarify, yanking a contract keeps the [1 byte: version number] and [29 bytes: address in shard] the same, but changes the [2 bytes: shard].

Correct!

Are there applications today that rely on having this, that aren’t pathological examples?

The easiest actually useful example I can think of is self.contract_creation_time = block.timestamp. So there are definitely some things that are lost.


#6

How would you handle the cross-shard calls that need to update the states of both sender and receiver from different shards?

For general calls, one has to lock both states before finishing the two-phase commit for the call. Otherwise, an attacker can exploit it by never finishing the two-phase commit. However, locking both states across two different shards is not very practical.


#7

The answer at this point is: that is impossible to do directly, so what you would have to do is yank the sender into the receiving shard, perform the atomic operation, then yank the sender back.

See Cross-shard contract yanking for some more discussion on this (“train and hotel problem” being common local jargon for this sort of thing).
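
A toy model of this yank, operate, yank-back sequence is sketched below. It ignores receipts, Merkle proofs and fees entirely and just moves account state between in-memory shards; the Shard and Account classes and the book_both function are hypothetical names for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Account:
    balance: int
    storage: dict = field(default_factory=dict)


class Shard:
    def __init__(self, shard_id: int):
        self.shard_id = shard_id
        self.accounts = {}

    def yank(self, name: str, target_shard: int) -> dict:
        # Delete the account locally and emit a receipt carrying its state.
        account = self.accounts.pop(name)
        return {"target_shard": target_shard, "name": name, "account": account}

    def receive(self, receipt: dict) -> None:
        # Instantiate the yanked account if it does not exist here yet.
        if receipt["target_shard"] != self.shard_id:
            raise ValueError("receipt is for another shard")
        if receipt["name"] in self.accounts:
            raise ValueError("target already exists")
        self.accounts[receipt["name"]] = receipt["account"]


def book_both(shard: Shard, train: str, hotel: str) -> bool:
    # Atomic within one shard: both contracts are local, so either both
    # reservations happen or neither does.
    t, hotel_acct = shard.accounts[train], shard.accounts[hotel]
    if t.storage["seats"] > 0 and hotel_acct.storage["rooms"] > 0:
        t.storage["seats"] -= 1
        hotel_acct.storage["rooms"] -= 1
        return True
    return False
```

In use, the train contract would be yanked from its home shard into the hotel's shard, book_both would run there, and the train contract would then be yanked back with its updated storage.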


#8

Just read a bit about yanking. It might have security issues besides the performance challenge.

Let’s consider the hotel-and-train problem. One person yanks a reserve contract and wants to use it in another shard. An attacker could yank the reserve contract to a third shard before the yankee could use it.

In the general case, there has to be a way to “unyank”, as the person who yanked might disappear after yanking. An attacker could use “unyank” to attack services.


#9

Let’s consider the hotel-and-train problem. One person yanks a reserve contract and wants to use it in another shard. An attacker could yank the reserve contract to a third shard before the yankee could use it.

That could be solved at application level by having the contract’s yank functionality also reserve the contract for some users for some given amount of time.


#10

That could be solved at application level by having the contract’s yank functionality also reserve the contract for some users for some given amount of time.

That’s a practical solution, but it is still attackable. An attacker could pre-reserve all rooms using yank without actually reserving them in the end. It’s hard to tell whether this is an attack or just a network issue that prevented the transaction reserving both hotel and train from being committed in time.


#11

Agree!

Though if the reservation period is limited (e.g. to 1 epoch) then attacks would not be that big a deal. In general, making a cross-shard transaction system that doesn’t leave any room for wasted half-transactions feels like an NP-hard problem, though one with lots of reasonable approximations. Hence my long-term philosophy around all of this is to set up a maximally simple base layer, and allow layer-2 mechanisms to emerge on top of it that implement models with stronger properties.


#12

I tend to think that making safe cross-shard transactions is impossible in general (rather than NP-hard), a bit like how designing a lock-free algorithm is impossible in general using only locks.