Sticking to 8192 signatures per slot post-SSF: how and why

I’m seeing a lot of support for Approaches 1 and 2, but both of those options are pretty significant pivots in Ethereum philosophy. Approach 3 seems to preserve all of the values that we like around low barriers for solo stakers, while economically taking advantage of large staker participation.

There is also a sort of elegance to decoupling staking weight for incentives from staking weight for consensus. This approach will mean the chain is incentivized to grow to as many nodes as possible without creating consensus bottlenecks, but still allows for the simplifications that prompted this discussion.

I think for the most part Ethereum has made the right trade-offs to get to where it is now, and its philosophy regarding staking decentralization is well placed. Straying too far from that may result in unpalatable changes to consensus that the community may not yet be vocal about but feels strongly towards.

24 Likes

Though I am as biased as @OisinKyne when speaking about staking and DVT, I am more inclined toward Approach 3 than any other. Let me elaborate:


Comments on approach 1

I feel there are a few issues with this approach. The main one concerns the philosophical principle of:

There should be one-- and preferably only one --obvious way to do it.

We all know Ethereum has more than one execution and consensus client, not because there isn’t an objectively better programming language for building Ethereum (I am not opening this topic) but for resiliency and tolerance to human failures. Above all, philosophically speaking, Ethereum should try to be the most resilient distributed system that humans (and human mistakes) can build. Therefore, anything that is “only one” should be analyzed under heavy scrutiny because it might be the result of convincing ourselves of how good we are.

On the technical side, focusing entirely on staking pools does not solve the accountability problem; it transfers it to the DVT layer. As we (and others) have encountered and discussed with @vbuterin, individual attribution of DVT key-share operations creates yet another data availability problem. At that point, accountability will indeed be gone at the “Ethereum layer,” but it will become a problem for the DVT-powered staking pools, which will nevertheless affect both pool participants and Ethereum in general.


Comments on approach 2

Correct me if I am wrong, but the main issue with this approach is that the “heavy” layer would virtually stay in control of the protocol, wouldn’t it? I.e., the heavy layer could unilaterally make decisions over the protocol and force the “light” layer to follow, or else the disagreement of the light layer will halt the network (note: not the disagreement of the heavy chain, because from Ethereum’s point of view, the heavy chain is “cooperating”). Coercion, by itself, holds a lot of power, and it is effectively the same power that validators have today (but at least it is much more distributed and not in the hands of a few significantly-sized clusters).

To me, this approach resembles what Polkadot does with the relay chain (or Tendermint with replicated security). Today we could advocate for 1 light layer, but why not 2, or 3? In the end, effectively, the chain is controlled by the relay/heavy chain because it is the only single point of failure, so how much can it handle?


Comments on approach 3

I don’t have many comments on this one. I like the approach because, in addition to keeping solo staking as it is (given the dragon’s size, this is a big plus, touching as little as possible), it also promotes the inclusion of solo stakers with less than 32 ETH. It will be interesting to see how this approach plays out regarding incentives to become smaller or bigger depending on the staking concentration.


TL;DR

I would suggest focusing on approach 3 for the “core” Ethereum layer and decentralizing further with staking pools and DVT in the application layer (or in L2s) if needed. Ethereum should be robust and straightforward but always work regardless of the dependencies (light chains or staking pools) that are built on top of it.

15 Likes

Echo this sentiment. To me the OP reads like it is leading to #3 (a Goldilocks-parable kind of structure). By preserving the one-class system and keeping true solo staking as an option, this option seems to have the most desirable properties.

The only downside seems to be protocol complexity. On this point, it’s true that #1 and #2 could add major benefits in protocol simplification, but #3 isn’t significantly more complex than the consensus protocol as it stands today (AFAIU). To me protocol simplicity is a “perk” but not the motivation, so as long as it’s not unreasonably complex, #3 seems like the most obvious approach to pursue.

Finally I’d like to highlight this part

I guess just to say that cryptographers have been incredibly effective over the last few years at providing these “magic bullets”, and maybe a few more breakthroughs in signature aggregation make this a non-issue? How sure are we of the infeasibility of sticking with the current approach and flexing our cryptographic muscles to solve the bottlenecks it introduces? I’d like to see the rationale/need for this philosophical pivot expanded upon further.

Thanks!

13 Likes

Clarification of Premises

To keep the discussion fair and fundamental, it’s important to clarify the difference in circumstances between when we started discussing PoS (Casper FFG) and committee-based finality, and the present.

  1. Due to the convenience of liquidity for restaking/LSD tokens, pools have become incentivized towards a winner-takes-all tendency.
  2. The emergence of temporary and inexpensive block space due to Proto-danksharding.

In these changing circumstances, the trade-offs of the proposed philosophical pivot approaches (1), (2), and (3) in VB’s post are:

(1) The protocol becomes very simple and efficient. The number of nodes itself decreases, and the pool management methods may become complicated depending on the situation.

(2) The protocol itself maintains its complexity. The efficiency of block space increases, and censorship resistance is maintained as it is.

(3) The complexity of the protocol itself increases compared to before. The barrier to entry for staking is reduced.

Overall, the main challenges seem to be (a) improving the situation where the 32 ETH upper (not lower) limit is no longer meaningful and is hindering block-space and bandwidth efficiency, and (b) addressing the current concentration in pools.

Clarification and Resolutions of the Problems

Problem 1: DVT should be made simple, excluding cryptography.

Fundamentally, the ‘go all-in on decentralized staking pools’ approach of philosophical pivot option (1) aims to balance solo staking with large-scale pools. While this may sacrifice liveness, it ensures that safety is always maintained. This characteristic aligns with switchable PoW mining pools. It could be said that this is a direction already proven to work.

One question arises regarding the link mentioned: Are technologies like MPC and SSS really necessary for DVT? In my view, what’s essential is the ability to switch pools, and for that, only the following two components are necessary:

  1. The ability to invalidate a signing key within two blocks.
  2. The inability of pools to discern whether a solo staker is online.

As has been discussed a lot in the Ethereum PoS space, PoW is safe because if someone attempts to control 51%, miners can notice that a fork chain is being mined by a pool and switch pools accordingly. In the current PoS system, since the signing key is delegated to the pool, one might become an attacker against one’s will and simply watch oneself being slashed. To change this, the holder of the withdrawal key should be able to immediately cancel their signing key if they discover it is being used for double voting. In implementing this principle, a finality of about 2 slots, rather than SSF, seems preferable.

Once this principle is introduced, as long as pools cannot determine whether solo stakers are online, they would be too fearful to attempt double voting. This would allow solo stakers to go offline. Of course, there’s a possibility that pools might speculatively attempt double voting, but as with the current PoS system, attackers would not recover after being slashed, making it not worthwhile.
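
A minimal sketch of what such a watcher could look like on the withdrawal-key holder’s side, assuming a hypothetical invalidation message and an attestation feed (none of these types or helpers exist in current clients or the consensus spec):

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass
class Attestation:
    slot: int
    signer: str      # signing pubkey that produced the vote
    vote_root: str   # block root voted for

@dataclass
class Invalidation:
    signing_pubkey: str
    slot: int
    reason: str

def watch_delegated_key(signing_pubkey, attestation_stream, submit_invalidation):
    """Revoke the delegated signing key as soon as it is seen double-voting."""
    votes_by_slot = defaultdict(set)  # slot -> distinct vote roots signed by our key
    for att in attestation_stream:
        if att.signer != signing_pubkey:
            continue
        votes_by_slot[att.slot].add(att.vote_root)
        if len(votes_by_slot[att.slot]) > 1:
            # Two conflicting votes in the same slot: invalidate before the
            # ~2-slot finality window closes, so the pool cannot use this
            # stake to finalize a conflicting chain.
            submit_invalidation(Invalidation(signing_pubkey, att.slot, "double_vote"))
            return True
    return False
```

Because the pool cannot tell whether such a watcher is actually running, the mere possibility of it is what deters speculative double voting.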

Problem 2: Few Solo Stakers

The reason for the lack of solo stakers is their lack of confidence in their network environment, even before considering the risk of slashing. Moreover, an increase in solo stakers on AWS is essentially meaningless from the perspective of decentralization. These issues can be resolved by the specific measures mentioned in the DVT (Problem 1) section above, namely creating a state where solo stakers delegate to pools and it is indistinguishable whether they are online or offline.

An important point is that solo stakers going offline does not guarantee absolute safety from being slashed; it only significantly reduces the likelihood of being slashed. This characteristic is a major difference from the current solo staking situation, where going offline often leads to a high probability of being slashed.

Problem 3: Block Size

One reason for the large block size in PoS is the presence of a bit array indicating whether each signer has signed. (The signatures themselves can be discarded some time after block approval.)

The reasons for this array include:

  1. To prevent rogue key attacks.
  2. For use in slashing.
  3. For reward calculations.

For 1) and 2), like the signatures themselves, it should be no problem to discard the array once the block is sufficiently confirmed. For 3), especially now with Proto-danksharding, it seems possible to balance reward calculation and block-space reduction by publishing the Merkle tree for a certain period and then discarding it.

Specific steps:

  1. Divide the bit array, which flags the signers, into several parts and make them leaves of a Merkle Tree.
  2. Place all the leaves of the Merkle Tree in a blob.
  3. Include only the Merkle Root in the block.

Each person can use a Merkle proof to prove their rewards, and with recursive ZKPs, all of them can be combined into a single withdrawal transaction. This could potentially reduce the size from 8192 bits to about 256 bits. If the root is wrong, the majority of validators can always ignore the block.
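
A minimal sketch of steps 1–3 and the reward proof, assuming 256-bit (32-byte) leaves and SHA-256 purely for illustration:

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def build_tree(leaves):
    """Binary Merkle tree over the leaves; returns all levels, root is levels[-1][0]."""
    level = [h(leaf) for leaf in leaves]
    levels = [level]
    while len(level) > 1:
        if len(level) % 2:
            level = level + [level[-1]]  # duplicate last node on odd-sized levels
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels

def proof_for(levels, index):
    """Sibling hashes needed to prove one leaf against the root."""
    proof = []
    for level in levels[:-1]:
        sibling = index ^ 1
        proof.append(level[sibling] if sibling < len(level) else level[index])
        index //= 2
    return proof

def verify(leaf, index, proof, root):
    node = h(leaf)
    for sibling in proof:
        node = h(node + sibling) if index % 2 == 0 else h(sibling + node)
        index //= 2
    return node == root

# 8192 participation bits split into 32 leaves of 256 bits (32 bytes) each.
# Only the 32-byte root would go into the block; the leaves go into a blob.
bits = bytearray(8192 // 8)   # toy bitfield: which of the 8192 signers signed
bits[0] = 0b10000001          # set a couple of participation bits for the example
leaves = [bytes(bits[i:i + 32]) for i in range(0, len(bits), 32)]
levels = build_tree(leaves)
root = levels[-1][0]
assert verify(leaves[3], 3, proof_for(levels, 3), root)
```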

Problem 4: Censorship Resistance

It is discussed in this thread that solo stakers, who produce blocks without going through pools, are key to censorship resistance in the core protocol.

Personally, I believe this is a drawback of stakers not being able to switch pools. Once they can switch as per the above procedure, it simply becomes a matter of stakers not choosing pools that have implemented censorship programs. Generally, addresses that are censored are likely to pay higher fees to get through, and market principles should ensure higher profits for pool operators who do not censor. Therefore, approach (1) can maintain a degree of censorship resistance.

Problem 5: MEV

This is a problem that wouldn’t be discussed so lightly if it were easy to solve. However, as long as the Rollup Centric Roadmap continues, it seems appropriate to support Layer 2 solutions that aim for Based Rollup in the protocol, if there is an opportunity to do so.

TL;DR of my personal opinion

Adopting Approach 1 (DVT) with switchable pools is the easiest way to support solo stakers and to minimize block size, and it leaves almost no problems. Pool managers cannot use the stake to perform a 51% attack if and only if the withdrawal keys can stop it and the pool cannot tell whether the withdrawal-key holders are online and watching the pool’s behavior. The other approaches are also worth considering, but Approach (1) is what we can call the simplification of PoS.
The remaining question is how to build a switchable staking pool with the shortest possible finality. I guess it takes 2 slots.

4 Likes

Great! This has been a major concern of mine with the change.

I think it’s okay for staking to cost a modest amount of consumer hardware and internet. With the Ethereum on ARM effort, you can build a staking machine for under $1k. With verkle trees, I understand we’ll be able to reduce state size by a meaningful amount again as well. If we can keep the delegated staking ecosystem favouring these social principles of decentralisation and being conscious of where and to whom they allocate their capital, we can cause the LSPs to compete on this axis of decentralisation, rather than solely on yield or, worse, rehypothecation.

Yes, I agree with this take, and would echo what @pradavc and @tbrannt have said below around not going ‘all in’ on DV-based staking at the expense of motivated individuals being able to participate independently, to allow the most permissionless long tail to survive (these can still be distributed validators, but ‘solo/indie DVs’, not ones from a big club with firm rules because you’re collectively managing millions of dollars of other people’s money instead of your own). At Obol we have designed heavily toward these DV clusters being independent of one another and of different flavours and variants, along with different governance and stewardship, eschewing homogeneity in favour of heterogeneity on as many axes as possible.

To reiterate my view from my OP, I’m generally aligned with @kassandra.eth and @pradavc in the below, with the extra specific suggestion of bringing an SSF “duty” in as a non-canonical source of finality at first; if we’re happy with it post-implementation and it needs no tweaking, we then deprecate Casper. All the while, we should favour designs that allow the long tail to participate in the core protocol as much as feasible (consumer hardware and internet, one-to-two-digit ETH), and push the ‘app layer’ of delegated staking to leverage DVT to bring a wider audience of participating node operators into their products.

2 Likes

In considering either approach one or two, it’s important to contemplate the migration strategy for the current beacon chain. A simple hard fork might no longer be viable due to the extensive changes. This could necessitate another merge-like event, such as an in-flight transfer or the creation of a new proof-of-stake chain. The challenge, however, lies in ensuring that these changes do not disrupt the execution side.

1 Like

I actually feel like a series of hard forks might work here!

Most of SSF can be implemented as (i) reducing the epoch length to 3, (ii) changing the rules for get_active_validator_indices, and a few further tweaks to attestation inclusion rules and incentives.
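
The post above does not spell out the new selection rule, so the following is a purely illustrative toy sketch of what a capped, rotating active set could look like; MAX_ACTIVE, the heavy/rotating split, and the rotation policy are assumptions for the example, not part of the actual proposal or the consensus spec:

```python
# Illustrative only: one hypothetical way to cap the signing set per epoch.
MAX_ACTIVE = 8192

def toy_active_validator_indices(balances: list[int], epoch: int) -> list[int]:
    """Return at most MAX_ACTIVE validator indices for the given epoch."""
    indices = sorted(range(len(balances)), key=lambda i: balances[i], reverse=True)
    if len(indices) <= MAX_ACTIVE:
        return indices
    # Keep the heaviest half of the seats by stake, and rotate the remaining
    # seats through everyone else so smaller validators still participate
    # periodically (accountability for the heavy set, inclusion for the rest).
    fixed = indices[: MAX_ACTIVE // 2]
    rest = indices[MAX_ACTIVE // 2:]
    start = (epoch * (MAX_ACTIVE // 2)) % len(rest)
    rotating = [rest[(start + i) % len(rest)] for i in range(MAX_ACTIVE // 2)]
    return fixed + rotating
```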

4 Likes

The operators of those large validators will be much more susceptible to government censorship and regulatory pressure. The homestakers today are the ones that can single-handedly uphold Ethereum values no matter what. Ten thousand distributed homestakers spread over the entire globe are unstoppable. It would be a gigantic loss for Ethereum, and a massive hit to the Ethereum narrative: it would be just another PoS chain. Also, there are plenty of people who would actively fight against that; entire communities/dapps exist around homestaking, and on top of that it’s a philosophical issue. This is a landmine that I wouldn’t touch.

4 Likes

I’m a solo validator with an index number below 10000 and I’m proud of it. Withdrawing to stake with some liquid staking protocol always haunts my mind, and losing my validator index number is the only thing that stops me from doing it.

1 Like

Btw, this should be used as one metric for reputation. Why can’t reputation be a criterion for participating in the committee?

3 Likes

It’s DPoS. The problem of DPoS is already well discussed.
That’s why this part should remain somehow.

Approach 3 sounds interesting and might be cool to explore more.
I made some charts showing the impact of the parameters M and k, setting the largest validator balance to 2^{18} and plugging in the numbers:

  • The higher M, the closer it gets to approach 2.
  • A higher M might increase economic security
    • 2/3 of the validators would be above M

Approach 1 sounds a bit scary - we’d basically depend a lot on tradfi. Reputation-gated sounds scary too. I think this could harm censorship resistance (even with ILs). We saw in the Kraken example how threatening a company led to all of its validators now engaging in censorship. The same applies to 60% of the relay market. Too fragile to go all in (yet). We’d first need ILs and/or encrypted mempools to make sure it doesn’t backfire.
Approach 2 is very similar to 1 while allowing solo-stakers to engage as the last line of defense. Since solo stakers could easily fork away, this sounds plausible.
Approach 3 sounds like the biggest improvement from the status quo, despite the increased in-protocol complexity.

5 Likes

This sounds like a debate on a performance vs security trade-off for a finality gadget.

A finality gadget that uses 10k signatures gives you good performance (latency) for SSF, but you are worried about the reduced security (and accountability).

Note that with random sampling (say a VRF) or a (secret) fair sampler, the security of k consecutive agreements should roughly accumulate stake - this is a latency trade-off that gives more accountability as the block is buried deeper over time.

If that is not enough, you can have multiple finality gadgets: a fast one with 10k signatures and a slower one with, say, 100k (or any other number of) signatures. The 100k one could run every 20 slots and take 20 slots to complete, etc. This is somewhat similar to approach 2.

Unlike approach 2, having two gadgets (or three, etc.) that only differ in the VRF sampling probability makes their relationship rather egalitarian and simple (same code). Moreover, different clients (or different transactions) can wait for confirmations from different gadgets.

For example, a transaction that moves more than 1M ETH might prefer to wait for multiple 10k confirmations or for a 100k-signature confirmation (or confirmation from all stake…), while a block with a total value of 10k ETH might be okay with a single 10k consensus confirmation, etc.
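
As a back-of-the-envelope illustration of how sampling plus burial depth accumulates security (the adversary share and committee sizes below are made-up figures for the example, not numbers from this thread):

```python
from math import lgamma, log, exp, ceil

def log10_tail(n: int, p: float, threshold: float = 2 / 3) -> float:
    """log10 of P(>= threshold of n independently sampled seats are adversarial)."""
    def log_pmf(k):
        return (lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
                + k * log(p) + (n - k) * log(1 - p))
    k_min = ceil(n * threshold)
    # Deep in the tail the terms decay geometrically, so a few hundred suffice.
    terms = [log_pmf(k) for k in range(k_min, min(k_min + 500, n) + 1)]
    m = max(terms)
    return (m + log(sum(exp(t - m) for t in terms))) / log(10)

adversary_share = 0.33  # made-up fraction of total stake held by the attacker
for n in (10_000, 100_000):
    print(f"n={n}: P(one sampled committee is >=2/3 adversarial) ~ 10^{log10_tail(n, adversary_share):.0f}")

# Requiring k consecutive sampled committees to confirm a block multiplies these
# already tiny probabilities, so buried blocks rapidly approach the security of
# the full stake (ignoring adaptive corruption and sampling correlations).
```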

2 Likes

Let’s hope for the best. Also, we cannot switch pools; this was a major issue.

@vbuterin

Would any of these approaches affect prevrandao insofar as it remains accessible? It is already used as a source of entropy by some dapps.

This post urged me to write down what participating in Ethereum means for me (and maybe others); this might help aim for the right solution:

  • Influence/participate - Individuals and small groups being able to participate in and secure Ethereum is a powerful idea.
  • Decentralization - Ethereum can’t be “taken down” or manipulated by a single entity.
  • Economics - No one expects to get rich, but economics is a major decision driver for many stakers (pools, institutions, and individuals).

I like approach 1, for obvious reasons, but it needs to consider the above points to make it work.

DVT can be run by home stakers easily, but it does have limitations in terms of the number of consensus participants. Theoretically, 1,000 operators on a cluster can be made to work (similar to committee-based approaches and BFT limitations).
This means 1,000 * 4,096 = up to ~4M “operators”, which seems pretty good.

The above requires better key management and secret sharing, potentially the ability to change validation keys to facilitate cluster set changes without compromising security.

Another DVT benefit is that the individual validator is much harder to compromise. Considering that “very large” operators will exist, just by the nature of how staking works, it’s better if they are part of a DVT cluster than running a few 4K ETH validators on their own.

Thanks @vbuterin & the community for this amazing thread, as always! :pray:

In my humble opinion, +1 to approach-3 with further decentralization through enshrined staking pools, DVT & light clients over time! This would enhance the decentralization ethos that we all love about Ethereum!

1 Like

I feel like I’m missing something obvious, but how does solution 2 solve the problem?

Isn’t the light layer equivalent to what we have today (and in fact worse because there is no minimum)?

See Signature Merging for Large-Scale Consensus to answer your first question.

I’m confused about how the incentive structure would work for the light layer. If there’s no slashing vulnerability, but one does exist for the “heavy” nodes, what would be the incentive for anyone to run heavy? Maybe the staking rewards on the light layer would be lower (or zero), but if they’re non-zero, light staking effectively becomes a risk-free rate on ETH.