Commit to pre-state instead of post-state on the executable beacon chain

dankrad · March 3, 2021, 12:56pm

TLDR: By changing the state-root included in the ExecutableData in the Executable beacon chain proposal to the pre-state rather than the post-state, we can get rid of the execution bottleneck in block verification/propagation. This means it is ok for the eth1 payload execution to take several seconds on average, which would degrade the beacon chain in the current (post-state) proposal. This means we can likely increase the block gas limit to 50-100M almost immediately post merge, without any security compromises.

Introduction

In theory, a Proof of Stake based system should be able to accommodate a longer block execution time than a Proof of Work based system, because it does not need to account for “block jitter”: All blocks are exactly equally spaced in time, so it is ok to exploit that time for execution; whereas the jitter makes a Proof of Work system degrade somewhat already when block execution only requires a fraction of the average block time, because some blocks will arrive earlier.

However, the current Executable beacon chain proposal does not really allow us to make use of this freedom: It requires full execution of the Eth1 payload before it can be decided whether a block is valid. Since validity is a precondition for attesting to a block, and attestations are supposed to be created 1/3 block time (4s) after a block is published, this really does not leave much time for Eth1 payload execution; much more than 0.5s-1s on average will be difficult because it will interfere with block propagation. At 20M gas per second target, this does not leave much room for a gas limit increase post-merge.

Committing to pre-state roots rather than post-state roots has been suggested in the past to improve scalability (e.g. see Near Sharding Design, section 3.5). It means that Eth1 payload validation will not be necessary at all to validate the current block; it is only necessary to validate the pre-state of the next block, which comes 12s later and thus leaves a lot of time for execution.

I argue that in the current context, we should max out what we can do on the single-sharded EVM and this proposal can, in my prediction, allow us to increase the gas limit 5-10x within weeks of the merge, so in around 12 months. Having this could easily make or break Ethereum, since it will probably be another year or two from then for the effect of full sharding to be felt in gas prices. The proposal also does not depend on any other scaling solutions being deployed and thus benefits even those applications that cannot make use of rollups in the near future.

Proposal

The executable beacon chain proposal suggests adding the ExecutableData data structure to the beacon state:

class ExecutableData(Container):
    coinbase: bytes20  # Eth1 address that collects txs fees
    state_root: bytes32
    gas_limit: uint64
    gas_used: uint64
    transactions: [Transaction, MAX_TRANSACTIONS]
    receipts_root: bytes32
    logs_bloom: ByteList[LOGS_BLOOM_SIZE]

We need to make two changes to this data structure. First, we remove the coinbase variable and use the proposer's validator balance directly instead. This is necessary because we need to be able to charge the proposer. It would be possible to charge an Eth1 address as well, but that would require all validators to maintain a well-funded Eth1 address in order to be able to propose blocks, which is very capital inefficient if you only get to propose a block every few weeks; we would also need to add a signature proving that the validator owns the Eth1 address in question. Second, we change all variables except for transactions to refer to the state at the end of the execution of the previous ExecutableData block, which we make explicit by adding the pre_ prefix:

class NewExecutableData(Container):
    pre_state_root: bytes32
    pre_gas_limit: uint64
    pre_gas_used: uint64
    pre_receipts_root: bytes32
    pre_logs_bloom: ByteList[LOGS_BLOOM_SIZE]
    transactions: [Transaction, MAX_TRANSACTIONS]

Validating the NewExecutableData means checking that all pre_ variables have the proposed value after the previous block’s NewExecutableData.transactions have been executed.
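A minimal sketch of this validation rule, assuming a hypothetical ExecutionResult record that holds whatever executing the previous block's transactions produced (the record and the function name are illustrative, not part of the proposal):

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ExecutionResult:
    # Hypothetical: values obtained after executing the PREVIOUS
    # block's transactions on top of its pre-state.
    state_root: bytes
    gas_limit: int
    gas_used: int
    receipts_root: bytes
    logs_bloom: bytes


@dataclass
class NewExecutableData:
    pre_state_root: bytes
    pre_gas_limit: int
    pre_gas_used: int
    pre_receipts_root: bytes
    pre_logs_bloom: bytes
    transactions: List[bytes]


def is_valid_executable_data(block: NewExecutableData,
                             parent_result: ExecutionResult) -> bool:
    """Check every pre_ field against the result of executing the parent.

    Note that block.transactions is never inspected here: the current
    block's own transactions play no role in its validity.
    """
    return (
        block.pre_state_root == parent_result.state_root
        and block.pre_gas_limit == parent_result.gas_limit
        and block.pre_gas_used == parent_result.gas_used
        and block.pre_receipts_root == parent_result.receipts_root
        and block.pre_logs_bloom == parent_result.logs_bloom
    )
```

The point of the sketch is that the check only needs the parent's execution result, which a node has a full slot to compute.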

All tips (non-basefee part of gas) are sent directly to the validator’s (Eth2) balance.

Handling invalid transactions

There are a couple of reasons why a transaction can be invalid in Eth1, and this list may not be exhaustive:

Invalid signature
Invalid Nonce
Not enough balance to pay for gas
Block gas limit exceeded

Note that these are different from runtime errors, e.g. running out of gas: the latter do not make a transaction invalid, they just revert all state changes except for the gas charged. For invalid transactions, however, we cannot even necessarily charge the sender, since they may not have intended for the transaction to be included or may not be responsible for its failure. So these transactions need to be charged to the proposer instead; the charge is the EIP-1559 BASEFEE times the basic transaction cost (21k gas plus the charge per byte used).

This means that there is nothing in transactions that can make a block invalid, and thus there is no need to check it before propagating/attesting to a block. Only when building/checking the next block do we need to have executed all transactions.

Note that, except for the block gas limit, all of these can be checked by the proposer quite cheaply without actually executing the transactions. This suggests that some proposers might get away with composing blocks without executing transactions; just staying safely below the block gas limit will make sure they won't be out of pocket.
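As a rough illustration of the proposer charge for an invalid transaction described above, here is a sketch that assumes a flat 16 gas per byte (the figure quoted later in this thread; Eth1 actually prices zero bytes lower) and a hypothetical function name:

```python
BASE_TX_GAS = 21_000   # basic transaction cost in gas
GAS_PER_BYTE = 16      # assumption: flat per-byte charge, ignoring cheaper zero bytes


def invalid_tx_charge(base_fee_wei: int, tx_size_bytes: int) -> int:
    """Wei charged to the proposer for including one invalid transaction."""
    return base_fee_wei * (BASE_TX_GAS + GAS_PER_BYTE * tx_size_bytes)


# Example: a 100-byte invalid transaction at a base fee of 1 gwei (assumed value).
charge = invalid_tx_charge(base_fee_wei=10**9, tx_size_bytes=100)
```

Because the charge depends only on the transaction's size and the base fee, it can be computed without executing anything.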

Maximum gas limit

There are three limits on how much execution time we can allow with this proposal:

1. The next proposer needs to be able to assemble their own block – they do need enough time for full execution even if they don’t commit to the post-state, as they don’t want to include any invalid transactions
2. We cannot use 100% of the available time between two blocks for execution, otherwise it is literally impossible to ever sync with a chain
3. DOS attacks. But it turns out they are probably more benign overall post-merge, because long execution time will only lead to orphaned beacon blocks, but can still have the same attestation rates; so only throughput will be decreased, but not security.

1 and 2 both lead to a (somewhat aggressive) maximum of targeting a little less than 50% of the slot time for Eth1 execution. For example, if we did target 5s at 20M gas/s, we could increase the gas limit to 100M gas (50M EIP1559-target), which is a lot more than is possible now. A major downside is that it will obviously make syncing much harder, so it would be essential that good sync protocols are implemented that can yield a state quite close to the tip.
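The arithmetic behind this example can be written out as a small sketch. The 12s slot time, the 20M gas/s execution rate and the 5s target come from the post; the assumption that the EIP-1559 target is half the gas limit, and the function name, are added here for illustration:

```python
SLOT_SECONDS = 12
GAS_PER_SECOND = 20_000_000  # execution speed target quoted in the post


def gas_limit_for(exec_seconds: int, gas_per_second: int = GAS_PER_SECOND) -> int:
    """Gas limit whose worst-case execution takes exec_seconds."""
    return exec_seconds * gas_per_second


gas_limit = gas_limit_for(5)            # 5s of execution per slot
eip1559_target = gas_limit // 2         # assumption: target is half the limit
slot_fraction = 5 / SLOT_SECONDS        # share of the slot spent executing
```

With these inputs the limit comes out at 100M gas, the target at 50M gas, and execution occupies 5/12 of the slot, which stays under the 50% bound.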

adlerjohn · March 3, 2021, 3:01pm

Doesn’t seem new to me, Tendermint makes use of deferred execution from day 1. For exactly the reason of being able to pipeline block execution during the voting phase. There’s been a lot of discussion from the Tendermint / Cosmos community on this front and the ramifications (i.e. downsides) of deferred execution:

Notably, deferred execution prevents gas refunds as they exist on Ethereum today.

On the topic of increasing the block gas limit, block validation rate must be much higher than production rate not for Nakamoto Consensus, but so that users can full sync. Therefore, changing the consensus protocol does not allow us to decrease this multiplier.

dankrad · March 3, 2021, 3:51pm

Did I claim it was new? Quote: “Committing to pre-state roots rather than post-state roots has been suggested in the past to improve scalability (e.g. see Near Sharding Design, section 3.5).” I haven’t tracked down where this idea first came from.

This is not true in the model I suggested, where the proposer does execute the transactions (they just don’t add the post state root).

Much higher? Why? If it is 50%, as suggested, then you can still sync. I think we should abandon syncing from genesis for most users. Let’s say you can get the current state and it takes 10h to download it, then you will need another 5h to catch up to the head for a total of 15h sync time. Seems fine.

In the stateless model, of course, syncing can be parallelized, so this is even less of a problem.
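To make the sync estimate concrete, here is a back-of-the-envelope sketch. It assumes execution takes a fixed fraction of real time (0.5 here) and that the chain keeps producing blocks while the node catches up; both function names are illustrative. The 15h figure corresponds to a single catch-up pass, while accounting for blocks that arrive during catch-up gives a somewhat larger bound:

```python
def single_pass_sync_hours(download_hours: float, exec_ratio: float) -> float:
    """Download the state, then execute the blocks produced during the download."""
    return download_hours + exec_ratio * download_hours


def full_catch_up_hours(download_hours: float, exec_ratio: float) -> float:
    """Also execute blocks that arrive while catching up.

    Total time T satisfies T = download + exec_ratio * T,
    so T = download / (1 - exec_ratio), valid for exec_ratio < 1.
    """
    if not 0 <= exec_ratio < 1:
        raise ValueError("a node can only catch up if exec_ratio < 1")
    return download_hours / (1 - exec_ratio)


single = single_pass_sync_hours(10, 0.5)
full = full_catch_up_hours(10, 0.5)
```

Either way the sync terminates as long as the execution ratio stays below 1, which is the property the 50% target is meant to preserve.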

Anyway, thanks for the link – most of the arguments are irrelevant IMO (off by one errors can be minimized by very explicit naming, as I suggested above); but one that remains is that, since you don’t commit to the post-state, it won’t be available to other beacon blocks. I think that doesn’t matter in our case, actually:

It is (as far as I can think) of minimal or no consequence in the data shard model, where there is no execution on shards
When we have execution on (some) shards, we will most likely remove execution from the beacon chain in its own shard. In this case, we will most likely reconsider the execution model, and from what you say, it may be that post-state is the better model then.

adlerjohn · March 3, 2021, 10:53pm

I’m just shitposting for the lulz.

You still have the issue of how to meter the block gas limit. If the miner doesn’t make a claim on the amount of gas used per transaction, then you can only allow 12.5M-gas blocks based on the sum of the transactions’ gas limits, not the gas used as it is currently. Of course, if you include extra metadata per tx on the amount of gas used…then that’s equivalent to just having a state root.

dankrad · March 3, 2021, 11:20pm

Actually we can do it as follows:

1. We will first count all the bytes of all transactions in the block, which consume gas according to the number of bytes (currently 16/byte). We store this in GAS_CONSUMED.
2. Then we will start executing transactions from the top, counting each toward the GAS_CONSUMED. If at any point (also in the middle of executing a transaction), GAS_CONSUMED > BLOCK_GAS_LIMIT, then:
- The last transaction is reverted, and the gas it has consumed before it reached the BLOCK_GAS_LIMIT is charged to the proposer
- All remaining transactions are not executed, and their gas according to bytes is charged to the proposer
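The metering steps above could be sketched roughly as follows. The Tx record is hypothetical: execution_gas stands for the gas a transaction would burn if run to completion, excluding its per-byte gas, which is pre-counted. How the reverted transaction's own byte gas is attributed is not specified in the thread; this sketch does not charge it to the proposer separately.

```python
from dataclasses import dataclass
from typing import List, Tuple

GAS_PER_BYTE = 16


@dataclass
class Tx:
    size_bytes: int
    execution_gas: int  # hypothetical: gas used if run to completion, excluding byte gas


def meter_block(txs: List[Tx], block_gas_limit: int) -> Tuple[int, int]:
    """Return (number of fully executed txs, gas charged to the proposer)."""
    # Step 1: pre-count the byte gas of every transaction in the block.
    gas_consumed = sum(GAS_PER_BYTE * tx.size_bytes for tx in txs)
    proposer_gas = 0
    executed = 0
    # Step 2: execute from the top until the block gas limit would be exceeded.
    for i, tx in enumerate(txs):
        remaining = block_gas_limit - gas_consumed
        if tx.execution_gas > remaining:
            # This tx hits the limit mid-execution: it is reverted and the gas
            # it burned before reaching the limit is charged to the proposer.
            proposer_gas += max(remaining, 0)
            # Later txs are not executed; their byte gas goes to the proposer.
            proposer_gas += sum(GAS_PER_BYTE * t.size_bytes for t in txs[i + 1:])
            break
        gas_consumed += tx.execution_gas
        executed += 1
    return executed, proposer_gas
```

For instance, three 100-byte transactions that each need 50,000 gas against a 100,000 gas limit leave only the first fully executed; the proposer pays for the partial run of the second and the byte gas of the third.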

adlerjohn · March 4, 2021, 3:19pm

Interesting workaround, but it still has an issue: the check for Ethereum data validity must be proposer_balance >= sum(tx_gas_limit) * base_fee. This allows miners to prevent the inclusion of any Ethereum data into Serenity by simply making a block where the sum of the transaction gas limits times the base fee is more than 32 ETH (or whatever). Which is trivial to do by making a bunch of txs with a gas limit of 12.5M but that revert immediately.
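A small sketch of that check, to show how few transactions it takes to trip it. The 32 ETH balance and the 12.5M gas limit per transaction come from the post; the 100 gwei base fee and the function name are assumptions chosen only for illustration:

```python
WEI_PER_ETH = 10**18
GWEI = 10**9


def passes_balance_check(proposer_balance_wei: int,
                         tx_gas_limits: list,
                         base_fee_wei: int) -> bool:
    """proposer_balance >= sum(tx_gas_limit) * base_fee"""
    return proposer_balance_wei >= sum(tx_gas_limits) * base_fee_wei


balance = 32 * WEI_PER_ETH
base_fee = 100 * GWEI          # assumed base fee
tx_limit = 12_500_000          # gas limit declared by each transaction

ok_with_25 = passes_balance_check(balance, [tx_limit] * 25, base_fee)
ok_with_26 = passes_balance_check(balance, [tx_limit] * 26, base_fee)
```

Under these assumed numbers, 25 such transactions still pass the check and the 26th pushes the worst-case charge past the proposer's balance, even though each transaction reverts immediately and uses almost no gas.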

djrtwo · March 5, 2021, 5:49pm

I think it’s still simplest to specify and use eth1 coinbase for the collection of such TX fees and not mix layers here. Even if the validator balance is used to pay for invalid TX payloads.

dankrad · March 5, 2021, 8:10pm

You also mix it by giving gains to coinbase and charging losses to the validator balance.

But yeah, there are different options. We can also charge coinbase, but then we have to make sure it’s funded and authenticated.

alonmuroch · March 9, 2021, 9:58am

Prysm (and I’m sure others) rolled out timely attestation where validators get notified immediately when a block is received.
It will be interesting to measure the actual average time that passes from slot start until X% of the committee has received the block. That could make the max gas more accurate, e.g. blocks might be received in the 1st/2nd second of the slot rather than the 3rd/4th.

mkalinin · September 2, 2021, 11:08am

The proposer/builder separation, namely Idea 1 outlined in this post, introduces a ternary fork choice rule which has an additional status: “Block proposal present but bundle body absent”. IMO, this new status can be perfectly combined with this proposal, as it helps to get rid of the complexity related to transaction verification. A malformed block body may be deemed absent.

Moreover, gossiping the block body after the beacon block has been gossiped and received by the builder introduces an additional delay which may affect attesters’ votes. Having additional time for executing the payload will help to mitigate this issue.