Beacon chain Casper FFG RPJ mini-spec


#1

The purpose of this document is to give a “mini-spec” for the beacon chain mechanism for the purpose of security analysis, safety proofs and other academic reasoning, separate from relatively irrelevant implementation details.

Beacon chain stage 1 (no justification, no dynasty changes)

Suppose there is a validator set V = {V_1 ... V_n} (we assume for simplicity that all validators have an equal amount of “stake”), with subsets S_1 .... S_{64} (no guarantee these subsets are disjoint, but we can guarantee |S_i| \ge floor(\frac{|V|}{64})), where |x| refers to set size (ie. the number of validators, or whatever other kind of object, in x). Suppose also that the system generates a random permutation of validator indices, {p_1 ... p_N}.

Note: if an attacker controls less than \frac{1}{3} of the stake, then if |S_i| \ge 892 there is a less than 2^{-80} chance that the attacker controls more than \frac{1}{2} of S_i, and there is a less than 2^{-100} chance that an attacker controls all 64 indices in a given span i_k .... i_{k+63}. We can assume that it is certain that neither of these things will happen (that is, we can assume there exists a substring of validator indices p_{i_1}, p_{i_2} ... with p_{i_{k+1}} - p_{i_k} < 64 and that every S_i is majority honest).

We divide time into slots; if the genesis timestamp of the system is T_0, then slot i consists of the time period [T_0 + 8i, T_0 + 8(i+1)). When slot i begins, validator V_{p_{i\ mod\ N}} is expected to create (“propose”) a block, which contains a pointer to some parent block that they perceive as the “head of the chain”, and includes all of the attestations that they know about that have not yet been included into that chain. After 4 seconds, validators in S_{i\ mod\ 64} are expected to take the newly published block (if it has actually been published) into account, determine what they think is the new “head of the chain” (if all is well, this will generally be the newly published block), and publish a (signed) attestation, [current\_slot, h_1, h_2 .... h_{64}], where h_1 ... h_{64} are the hashes of the ancestors of the head up to 64 slots (if a chain has missing slots between heights a and b, then use the hash of the block at height a for heights a+1 .... b-1), and current\_slot is the current slot number.

The fork choice used is “latest message driven GHOST”. The mechanism is as follows:

  1. Set H to equal the genesis block.
  2. Let M = [M_1 ... M_n] be the most-recent messages (ie. highest slot number messages) of each validator.
  3. Choose the child of H such that the subset of M that attests to either that child or one of its descendants is largest; set H to this child.
  4. Repeat (2) until H is a block with no descendants.

Claims:

  • Safety: assuming the attacker controls less than \frac{1}{3} of V, and selected the portion of V to control before the validators were randomly sorted, the chain will never revert (ie. once a block is part of the canonical chain, it will be part of the canonical chain forever).
  • Incentive-compatibility: assume that there is a reward for including attestations, and for one’s attestation being included in the chain (and this reward is higher if the attestation is included earlier). Proposing blocks and attesting to blocks correctly is incentive-compatible.
  • Randomness fairness: in the long run, the attacker cannot gain by manipulating the randomness

Beacon chain stage 2 (add justification and finalization)

As the chain receives attestations, it keeps track of the total set of validators that attest to each block. The chain keeps track of a variable, last\_justified\_slot, which starts at 0. If, for some block B in the chain, a set of validators V_B attest to it, with |V_B| \ge |V| * \frac{2}{3}, then last\_justified\_slot is increased to the maximum of its previous value and that block’s slot number. Attestations are required to state the last\_justified\_slot in the chain they are attesting to to get included in the chain.

If a span of blocks (in the same chain) with slots s, s+1s+64 (65 slots altogether) all get justified, then the block at slot s is finalized.

We change the fork choice rule above so that instead of starting H from the genesis block, it starts from the justified block with the highest slot number.

We then add two slashing conditions:

  • A validator cannot make two distinct attestations in the same slot
  • A validator cannot make two attestations with slot numbers t1, t2 and last justified slots s1, s2 such that s1 < s2 < t2 < t1

Claims:

  • Safety: once a block becomes finalized, it will always be part of the canonical chain as seen by any node that has downloaded the chain up to the block and the evidence finalizing the block, unless at least a set of validators V_A with |V_A| \ge |V| * \frac{1}{3} violated one of the two slashing conditions (possibly a combination of the two).
  • Plausible liveness: given an “honest” validator set V_H with |V_H| \ge |V| * \frac{2}{3}, V_H by itself can always finalize a new block without violating slashing conditions.

Beacon chain stage 3: adding dynamic validator sets

Every block B comes with a subset of validators S_B, with the following restrictions:

  • Define the dynasty of a block recursively: dynasty(genesis) = 0, generally dynasty(B) = dynasty(parent(B)) except when (i) B's 128th ancestor was finalized (and this fact is known based on what is included in the chain before B) and (ii) a dynasty transition has not taken place within the last 256 ancestors of B, in which case dynasty(B) = dynasty(parent(B)) + 1.
  • Define the local validator set of B as LVS(B) = S_B \cup S_{parent(B)}\ \cup\ ... \ \cup\ S_{parent^{63}(B)}
  • Suppose for two blocks in the chain, B_1 and B_2, dynasty(B_2) - dynasty(B_1) = k. Then, |LVS(B_1)\ \cap\ LVS(B_2)| \ge LVS(B_1) * (1 - \frac{k}{60}) (and likewise wrt LVS(B_2)). That is, at most \frac{1}{60} of the local validator set changes with each dynasty.

We modify the fork choice rule so that the seeking process reaches a block that includes a higher dynasty, it switches to using the latest messages from that dynasty.

Claims:

  • All of the above claims hold, with appropriate replacements of V with LVS(...), except with fault tolerance possibly reduced from \frac{1}{3} to 30%.

Epoch-less Casper FFG liveness/safety argument
Convenience link to Casper+Sharding chain v2.1 spec
Formal Barriers to Longest-Chain Proof-of-Stake Protocols
#2

Sincere thanks for all the work and for posting this.

Sorry for taking up your time with questions, I know you’re incredibly busy right now.

What value does |V| >= 892 refer to exactly? The number of ETH?

Second, say a large economic force, such as EOS or a hostile political entity, bought enough ETH to violate the 1/3 stake assumption - are there any safeguards to protect against this unlikely but not impossible event?


#3

Second, say a large economic force, such as EOS or a hostile political entity, bought enough ETH to violate the 1/3 stake assumption - are there any safeguards to protect against this unlikely but not impossible event?

If they get between 1/3 and 1/2, then we can still have some safety conditional on a synchrony assumption. Beyond 1/2, you would need subjective resolution (ie. an honest minority soft-forks the chain and the market chooses that fork).


#4

I’m assuming the hostile staker’s eth would be burned on the new chain


#5

Yes, exactly correct.


#6

I see what you mean subjective as it could prove to be a (possibly difficult?) social solution to such an attacker spread among multiple small stakes

edit: nvm thought it through more. it would be straightforward. thanks for your responses.


#7

Thanks for your continuous work and postings :slight_smile:

Can I get some explanation on That is, the local validator set only changes by at most 1/60 per dynasty. or the ethresear.ch post link relevant to it?

Less than other, I could hardly understand where the (1-k/60) and 1/60 comes from…


#8

The other spec currently has validator sets changing by at most 1/15 per dynasty (1/30 added, 1/30 removed); this spec changes that fraction to 60, but it’s ultimately just a number that can be adjusted.


#9

Choose the descendant of H such that the highest number of validators attests to H

Do validators break ties with the lowest block hash?


#10

I’d say break ties via client-side randomness. Seems safest in the existing cases where it’s been studied.


#11

Would Vpi mod n always resolve to a single “proposer” for each slot i? If this proposed block is rejected by Si, for whatever reason, what would be the recovery mechanism?


#12

Wait until the next slot?


#13

I have ported the network simulator (https://github.com/ethereum/research/blob/master/clock_disparity/ghost_node.py) to Nim (https://github.com/status-im/nim-beacon-chain/tree/master/beacon_chain/fork_choice_rule)

Port is almost 1:1 but with added types which should make it much easier for static languages as a reference implementation.

I have the following comment on the research implementation:

  • The parentqueue and processed fields require respectively a heterogeneous list that can contains Blocks or Signatures and a heterogeneous dictionary indexed by either block or signature hashes and containing blocks or signatures.
    This requires using type erasure in static languages (either inheritance or tagged unions). I think it would be better to have different fields that separate cleanly blocks and signatures. A dictionary/hashtable that supports indexing hashes of different lengths (one would be a BLS12-384 hash the other a Sha3 hash for example) could be tricky.

Next step is to debug it and then to port it to the v2.1 spec and integrate it in the beacon chain. I would be interested in other implementers thoughts.


#14

Lets say you the block number 1 (number 0 is genesis). Lets say it is signed by 60 validators out of 64.

If 1000 years from now 64 validator keys get compromised, they can sign an alternative block 1 and totally destroy the chain. It is impossible to do the same with PoW. Whatever the math is PoS chains are a bit less secure than PoW.

From this perspective, I think ETH strategy should be different. Calling PoW chain “legacy” is not optimal. What needs to be said, is that there is PoW main chain, which is very secure but very slow, then there is PoS beacon chain which is ALMOST as secure as PoW but faster (say 100 tps) and then there are shards, Plasma and other things yet faster but a bit less secure.

The same with Stage 2 and Stage 3 specs. They are very smart but also complex, so compared to PoW, PoW is more secure just based on simplicity. When I read the original Bitcoin paper, it is relatively easy for me to see security. When I read Stage 2 and Stage 3 specs the analysis becomes very hard, especially having that not all things are discussed such as interplay of network splits and denial of service attacks.

I do not want to say that Stage 2 and Stage 3 specs are bad. One needs to make a confession that PoS is (a little) less secure then PoW, so based on this consideration the PoW main chain needs to survive, it is wrong to call it legacy. And then one can have the great beacon POS chain, and shards and Plasma and you name it. Is is a chain of tradeoffs speed vs secuirty.

I understand that the initial idea of Ethereum was to replace PoW with PoS at some point, but this was several years ago, and all the experience learnt since points out that PoS chains are a bit less secure. IMHO ETH project needs to tweak PoW vs PoS messaging a little bit, otherwise it is heading to an ugly moment where a total switch to PoS will lead to a fork where overwhelmingly all miners will stay with PoW, so after the fork the PoW chain will have more value than the PoS chain.

Another reason which makes a total switch to PoS unfeasible is the brand value. People hold money in Swiss banks because of time-proven security. Similarly, people hold money on PoW because of time-proven security. 10 years of now, the bulk of money will be on PoW. So Ethereum is really shooting itself in a foot by saying it plans to kill the PoW chain.


#15

If 1000 years from now 64 validator keys get compromised, they can sign an alternative block 1 and totally destroy the chain. It is impossible to do the same with PoW. Whatever the math is PoS chains are a bit less secure than PoW.

How would the current validator set in the proposed Beacon chain running GHOST accept a new genesis state at n block height?

One needs to make a confession that PoS is (a little) less secure then PoW

Can you please explain in what context it is “a little less secure” and provide a link to research?


#16

This is why revert limits, and weak subjectivity, are needed: https://ethereum.stackexchange.com/questions/15659/what-is-weak-subjectivity


#17

I’ll also mention that I’ve been lately thinking that pure IMD GHOST is not the best idea because of a different kind of attack, and either switching to LMD or some hybrid may actually work better.

(Credit to Alistair Stewart for pointing out the scenario)

Specifically suppose that this scenario happens:

An attacker can now arbitrarily flip the fork choice between left and right by publishing a single validator signature to yellow and then green and so on back and forth.

The basic problem here is that there is another stability desideratum that earlier analysis missed: if we define the “extent” by which a fork choice rule favors A over B as the minimum number of validators that would need to collude to make it favor B, then we want the property that if the fork choice favors A over B at one point in time, the extent by which it favors A should naturally increase over time, and start increasing immediately. PoW longest chain, PoW GHOST, and PoS longest chain all have this property; PoS LMD GHOST does also but POS IMD GHOST does not, which is what leads to this issue. Switching to LMD (plus justification-favoring) does solve this.


#18

> What are the known drawbacks of LMD in the context of FFG?

Basically, it’s “unstable” under exceptional cases due to the boundary condition.

Suppose that you have chains A and B, where A wins under LMD, but B got 65 signatures in some previous epoch, then there’s some validator with 2% that has withheld signatures for B; then that validator can cause a reorg to B at any time.

Reasons why this is not a big deal:

  • Such a situation itself can only happen in an exceptional circumstance, probably requiring both high network latency and active coordination among many malicious validators; otherwise, how would the fork choice have switched from B to A?
  • Even if it does happen, the fork choice should justify something in the A chain fairly quickly; the only case where it doesn’t is the case where >33% of validators are malicious or go offline.

So the range of situations in which this causes problems but outright 51% attacks are not possible is not that wide.

Also note that the worst case is a single reorg, whereas IMD’s failure mode is a large number of reorgs, potentially one per validator.


#19

It can be a bit worse that that. If we get into an unlikely (but asynchronously reachable) situation, then there is an attack with less than 1/3 of votes that potentially stops anything from being finalised indefinitely. Suppose that in epochful Casper FFG with LMD on the last justified checkpoint and 10% malicious stake that votes late, we somehow have something like this:

A - B
30%- 60%
51% - 20%

Here the bad guys are the only missing votes on the first checkpoint. Now the A chain is winning under LMD with 51% of the vote so it will keep getting votes. But the bad guys can justify B and make that the fork choice at any time. What they can do is time things so that A gets around 60% of the vote at the time of the switch. If they pull this off, then B has the last justified checkpoint and is winning, but they have the power to justify A at any time. So they wait for the next checkpoint on B to have 60% of the vote and repeat.

Now actually any fork choice rule that builds on the last justified checkpoint is vulnerable to this kind of attack. If our situation is:

A -B
30%-70%
60%-30%
30%-60%

Then the bad guys who haven’t voted on the last two checkpoints can force a reorg to A and then later one to B by justifying these checkpoints.

We need to argue that either we never get into this kind of situation or that the timing is hard to pull off repeatedly. Now for epochless FFG, the timing of these kind of attacks can be much easier. If the honest guys in the 1/64 who vote on each block vote faster than a block, then the bad guys can reliably time to within a resolution of 1/64 of the votes. There are a few complications in replicating this attack on epochless FFG, such as the blocks that replace the checkpoints above actually being closer than 64 blocks apart but it should be doable with less than 1/3 of the vote.

So we are stuck for hoping that the set up never happens. I certainly have no idea of a credible attack that would give it.


#20

For either of those setups to be possible, high latency is basically already required. Furthermore, if you can force a (30, 70) (70, 30) situation to happen repeatedly, then that already means that you can prevent justification from happening in two consecutive blocks, which means you can prevent finality indefinitely. FFG never claimed to give guarantees of ability to finalize under such conditions.

That said, it is a very good point that the three-round setup shows that no fork choice rule can prevent scenarios where the attacker can justify the non-canonical chain at any time, as one can construct scenarios where the attacker can justify either chain at any time.

It’s certainly an argument in favor of CBC being the right thing long term!