A local-node-favoring delta to the scaling roadmap

I would love to see state expiry implemented! I spent something like 6 months trying to solve it with some Ethereum researchers a while back but we were never able to come up with a solution that we thought would be satisfactory enough to actually get implemented. If someone has figured out a solution I would love to hear it!

2 Likes

What if you pay validators a fee per MB to encourage investment in storage? The more storage a validator has, the more rewards it gets.

Why? There’s no requirement for FOCIL validators to accept transactions according to any specific formula; making a non-empty FOCIL list is a purely volunteer task. So they can use whatever rules they want, and then keep the state that is necessary for them to be able to enforce those rules.

I increasingly think the solution to state expiry is to just move closer and closer to a world where every node is a partial-state-storing node. Users who want to send txs that use state that other people don’t store would have to locally store the associated witnesses and attach those witnesses to their transactions.
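
As a toy illustration of what "attaching witnesses" could mean (not a concrete spec: the key/value encoding, tree shape, and field names below are made up, and a binary Merkle tree stands in for the real state trie), a transaction carries the state it reads plus the sibling hashes needed to check that state against the root, so a node holding only the root can still validate it:

```python
import hashlib

def h(*parts: bytes) -> bytes:
    return hashlib.sha256(b"".join(parts)).digest()

# Toy state: 4 accounts -> balances, committed in a binary Merkle tree.
leaves = [h(bytes([i]), bytes([bal])) for i, bal in enumerate([10, 0, 7, 99])]
l01, l23 = h(leaves[0], leaves[1]), h(leaves[2], leaves[3])
root = h(l01, l23)

# Witness for account 2: the sibling hashes on the path from its leaf to the root.
witness = [(leaves[3], "right"), (l01, "left")]

# A "stateless-friendly" transaction carries the data it reads plus its witness.
tx = {"account": 2, "balance": 7, "witness": witness}

def verify(tx, root) -> bool:
    acc = h(bytes([tx["account"]]), bytes([tx["balance"]]))
    for sibling, side in tx["witness"]:
        acc = h(acc, sibling) if side == "right" else h(sibling, acc)
    return acc == root

assert verify(tx, root)  # a node that stores none of the state can still validate the read
```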

3 Likes

I appreciate the effort towards increased resilience here again. But I’m not sure if what is proposed is actually realistic.

What I would like to see play out is end-users are able to run their own RPC servers that follow head and store the vast majority of all interesting state on Ethereum.

Yes, that aligns with censorship resistance, but I doubt it’s the realistic outcome. What would that look like in practice? Ideally, every wallet would run a local reduced-state node - think “Mist, but scalable.” But that shifts an even heavier burden onto full nodes, which would then have to:

  1. Look up light-client proof data → increased storage (read) load
  2. Generate the light-client proof → increased compute load
  3. Relay the light-client proof over the network → increased bandwidth and connection load

I’d love to see performance estimates, but depending on rollout, you could easily end up with partial-state nodes outnumbering full nodes by more than 1,000x (assuming one wallet per reduced-state node). That alone could bottleneck full nodes. And network load isn’t just about bandwidth - it’s also about how many concurrent connections a node can handle.

With all that extra work, why would full-node operators participate? We’ve managed so far with only a few light nodes freeloading on existing full nodes, but that dynamic would break once most Ethereum users run light clients.

Maybe I’m off base here - it all depends on how this is rolled out and adopted. I’m curious what @vbuterin envisions in practice, since that will determine what additional systems - like a dedicated light-client proof-data layer - we might need to design, build, and incentivize.

1 Like

If FOCIL includers don’t store (or query, or verify a proof of) the validity-only state (basically balance and nonce), they can’t know which transactions should be kept or pruned in the mempool, which means an attacker can easily spam the mempool with a lot of invalid transactions, which would then end up in ILs and crowd out valid ones.

In practice they can definitely choose not to hold state if they don’t want to, but then they wouldn’t be able to participate in FOCIL and propose “good”, relevant ILs. If we don’t assume all nodes will want to participate in FOCIL (which is fair, it’s just a departure from the model we have today where all participating nodes running an EL client maintain the public mempool according to the exact same rules), then FOCIL should probably be opt-in, and FOCIL nodes should be cleanly separated from other nodes performing other duties (e.g., Attester-Includer Separation) in the future. But then they might need incentives, with either fees or issuance. This is a topic we’re actively exploring, but it’s very tricky to find a simple, incentive-compatible solution there.
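
For concreteness, here is a minimal sketch of the kind of validity-only filtering described above, assuming an includer keeps only per-account nonce and balance (the account addresses, field names, and numbers are illustrative, not a spec):

```python
from dataclasses import dataclass

@dataclass
class Tx:
    sender: str
    nonce: int
    value: int
    max_fee: int
    gas_limit: int

# Validity-only state: just nonce and balance per account.
validity_state = {"0xaa": {"nonce": 5, "balance": 10**18}}

def is_includable(tx: Tx) -> bool:
    acct = validity_state.get(tx.sender)
    if acct is None:
        return False                        # unknown sender: can't vouch for the tx
    if tx.nonce != acct["nonce"]:
        return False                        # stale or future nonce
    max_cost = tx.value + tx.max_fee * tx.gas_limit
    return acct["balance"] >= max_cost      # can the sender pay for it?

# An includer would build its IL only from transactions passing this check,
# which is what keeps spammed invalid transactions out of ILs.
tx = Tx(sender="0xaa", nonce=5, value=10**17, max_fee=10**9, gas_limit=21_000)
assert is_includable(tx)
```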

1 Like

For persistent storage to survive in a decentralized system, the erasure code needs to be composable and decentralized, i.e., RLNC. One-time-use codes like RS will lead to loss of the ability to recover the data, because nodes come and go.

Fitzek, F.H.P., Toth, T., Szabados, A., Pedersen, M.V., Lucani, D.E., Sipos, M., Charaf, H., and Médard, M., “Implementation and Performance Evaluation of Distributed Cloud Storage Solutions using Random Linear Network Coding,” IEEE CoCoNet 2014.

Abdrashitov, V. and Médard, M., “Durable Network Coded Distributed Storage,” Allerton 2015.

Abdrashitov, V. and Médard, M., “Staying Alive - Network Coding for Data Persistence in Volatile Networks,” invited paper, Asilomar 2016.
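
To illustrate the "composable" property the RLNC references are about (a toy sketch only: it uses a prime field instead of the GF(2^8) arithmetic real implementations use, and made-up parameters): coded chunks are random linear combinations of the original chunks, and already-coded chunks can be recombined into new coded chunks without ever decoding, which is what lets redundancy be replenished as nodes churn.

```python
import random

P = 2**61 - 1      # prime field for the toy example (real systems use GF(2^8))
K = 4              # number of original chunks needed to reconstruct the data

def encode(chunks):
    """One coded chunk: a fresh random linear combination of the K originals.
    Returns (coefficient_vector, coded_symbols)."""
    coeffs = [random.randrange(P) for _ in range(K)]
    coded = [sum(c * chunk[i] for c, chunk in zip(coeffs, chunks)) % P
             for i in range(len(chunks[0]))]
    return coeffs, coded

def recode(coded_chunks):
    """Combine already-coded chunks into a new coded chunk *without decoding*.
    This re-spreads redundancy as storage nodes come and go."""
    weights = [random.randrange(P) for _ in coded_chunks]
    coeffs = [sum(w * c[j] for w, (c, _) in zip(weights, coded_chunks)) % P
              for j in range(K)]
    symbols = [sum(w * s[i] for w, (_, s) in zip(weights, coded_chunks)) % P
               for i in range(len(coded_chunks[0][1]))]
    return coeffs, symbols

# Any K coded chunks with linearly independent coefficient vectors (true with
# overwhelming probability) can be decoded by Gaussian elimination.
data = [[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]]   # K = 4 chunks, 3 symbols each
stored = [encode(data) for _ in range(6)]                # over-provisioned coded chunks
fresh = recode(stored[:3])                               # a node repairs redundancy in place
```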

While I’m very excited to see state expiry and encapsulated Ethereum history storage finally being solved, as a local node runner myself I don’t see how storage is the problem. A 4TB SSD is a fairly cheap investment and the return on investment comes within a month or two.

What I do see as a challenge is that the APY is low, and running a full execution client for just a single validator is not that capital efficient.

What I anticipate is needed are incentives to run local nodes. They should be heavily driven by an “unfair” APY advantage over cloud-hosted solutions, emphasising that cloud solutions are a single point of failure and hence less valuable to the network.
What do you think of [2403.13230] BFT-PoLoc: A Byzantine Fortified Trigonometric Proof of Location Protocol using Internet Delays? It seems like the most realistic approach toward making this delta really impactful.

I’m assuming that Ethereum blocks would come with state deltas and proofs by default, and so you do not need any extra infrastructure in order to service partial-state nodes. In fact, a partial-state-node heavy world is lighter on infra than fully relying on helios + TEE/ORAM/PIR based RPC nodes.
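
A rough sketch of what "blocks come with state deltas and proofs" could look like from a partial-state node’s point of view (the key encoding, the `verify_diff_proof` stub, and the tracked keys below are all hypothetical; the real check would be a Verkle/Merkle multiproof or a validity proof against the post-state root):

```python
# Hypothetical shapes: a block's `state_diff` is {key: new_value}, and `proof`
# commits that diff to the block's post-state root (e.g. a multiproof or SNARK).
tracked_keys = {"0xaa:balance", "0xaa:nonce", "0xdead:storage:0x01"}
local_state = {"0xaa:balance": 10**18, "0xaa:nonce": 5, "0xdead:storage:0x01": 0}

def verify_diff_proof(state_diff, proof, post_state_root) -> bool:
    # Stand-in for the real check that the diff is exactly the state change
    # committed to by the block; always True here to keep the sketch runnable.
    return True

def apply_block(state_diff: dict, proof, post_state_root: bytes) -> None:
    if not verify_diff_proof(state_diff, proof, post_state_root):
        raise ValueError("bad state diff")
    # A partial-state node only keeps the keys it cares about up to date;
    # the rest of the (verified) diff is simply discarded.
    for key, value in state_diff.items():
        if key in tracked_keys:
            local_state[key] = value

apply_block({"0xaa:balance": 9 * 10**17, "0xbb:nonce": 1},
            proof=None, post_state_root=b"\x00" * 32)
```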

But on top of this, ultimately I think that we eventually would benefit a lot from getting comfortable with paying nodes for services via a standardized (and anonymized) market.

3 Likes

My view is basically:

  1. FOCIL is de-facto opt-in already
  2. With 7701, the idea is that each node will be able to choose its own validity rules, and connect to its own mempool. There may be a mechanism by which we economically abstract this, and allow anyone to create a mempool by staking some amount of ETH (and if txs from that mempool end up invalid too often, nodes blacklist it by default)
  3. Hence, we should just embrace the idea that FOCIL nodes can choose which mempools they are helping to enforce

The only constraint on mempools would be the FOCIL validity rule, which would ensure that a tx can be thrown out if it does not pay fees.
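
A minimal sketch of how a node might track per-mempool quality and apply the "blacklist by default" idea (the threshold, sample size, and mempool identifiers are made-up illustrations, not a proposed parameterization):

```python
from collections import defaultdict

INVALID_RATE_THRESHOLD = 0.05   # hypothetical: default-ignore above 5% invalid
MIN_SAMPLE = 100                # don't judge a mempool on a handful of txs

stats = defaultdict(lambda: {"seen": 0, "invalid": 0})
blacklisted = set()

def record(mempool_id: str, tx_was_valid: bool) -> None:
    s = stats[mempool_id]
    s["seen"] += 1
    if not tx_was_valid:
        s["invalid"] += 1
    if s["seen"] >= MIN_SAMPLE and s["invalid"] / s["seen"] > INVALID_RATE_THRESHOLD:
        blacklisted.add(mempool_id)     # default-ignore this mempool going forward

def should_consider(mempool_id: str) -> bool:
    return mempool_id not in blacklisted
```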

Isn’t there a situation where you may touch some data that is sibling to data you don’t have, and then you aren’t able to calculate the MPT root anymore? Or perhaps you are imagining a move away from our current structure in order to achieve this state expiry mechanism?

1 Like

These reduced-state nodes would only need to fetch extra data from somewhere when they find they are missing state they need (hopefully a rare occurrence). If they have all of the state they need then the only extra proving is by the block builder once per block, where the block builder would generate and distribute a proof of the state diff for their block. This is different from light clients like Helios who route all of their queries through third parties.
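
A sketch of that hopefully-rare fallback path, under the assumption that a reduced-state node can ask an untrusted peer for a missing value plus a proof against the state root it already follows (all of the helper functions here are stand-ins, not real client APIs):

```python
local_state = {"0xaa:balance": 10**18}
latest_state_root = b"\x00" * 32

def fetch_from_peer(key):
    # Stand-in for a network request to a peer that still holds this state.
    return 0, b"proof-bytes"

def verify_against_root(key, value, proof, root) -> bool:
    # Stand-in for checking the returned value against the state root we
    # already follow, so the serving peer does not need to be trusted.
    return True

def get_state(key):
    """Read path of a reduced-state node: local first, network only on a miss."""
    if key in local_state:
        return local_state[key]
    value, proof = fetch_from_peer(key)          # the hopefully-rare path
    if not verify_against_root(key, value, proof, latest_state_root):
        raise ValueError("peer returned an unprovable value")
    return value
```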

What is being discussed here is RPC-serving nodes, not staking nodes. You are correct that stakers can afford hardware, and they are service providers being paid to do work, so it is reasonable to require this of them. RPC-serving nodes are end-users who may be poor, may have low-end hardware, and likely aren’t going out of their way to build out Ethereum-only PCs/servers.

1 Like

The idea would be that:

  • For data you want to read, you store and keep up-to-date only the state
  • For data you want to use in the validation section of a tx, you store and keep up-to-date the state and the sister nodes up to the root
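
To make the second bullet concrete (a toy example only: a binary Merkle tree stands in for the actual state trie, and the values are arbitrary), keeping the sister nodes along a leaf’s path is exactly what lets you recompute the root after updating that leaf without holding the rest of the tree:

```python
import hashlib

def h(*parts: bytes) -> bytes:
    return hashlib.sha256(b"".join(parts)).digest()

# Toy binary tree over 4 leaves; leaf 2 is the slot used in tx validation.
leaves = [h(bytes([v])) for v in (1, 2, 3, 4)]
l01, l23 = h(leaves[0], leaves[1]), h(leaves[2], leaves[3])
root = h(l01, l23)

# "Sister nodes up to the root" for leaf 2: its sibling leaf plus the sibling
# of its parent. With just these cached, the root can be recomputed after an
# update without holding any other part of the tree.
sisters = {"leaf_sibling": leaves[3], "parent_sibling": l01}

def new_root_after_update(new_leaf_value: int) -> bytes:
    new_leaf = h(bytes([new_leaf_value]))
    new_l23 = h(new_leaf, sisters["leaf_sibling"])
    return h(sisters["parent_sibling"], new_l23)

assert new_root_after_update(3) == root   # writing back the same value -> same root
```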

Probably not the most interesting thing, but just to make sure we’re on the same page: FOCIL is definitely not opt-in in its current form. You’d have to modify client code if you didn’t want to participate in building ILs according to the inclusion rules chosen by clients.

Ok I see, so only basic `nonce` and `balance` checks. And if other, more complicated interactions (e.g., an IL transaction being invalidated by a “balance emptying” transaction included earlier in the payload by the block producer) invalidate transactions, they can still be included in ILs, but if specific mempools are responsible for too many invalid txns they become blacklisted. Got it!

A few thoughts:

  • I like the idea because of the flexibility it would give, but asking each node to keep track of many different mempools and blacklist them sounds a bit messy and might open some attack vectors (wouldn’t an attacker be able to just spin up new mempools whenever one gets blacklisted?)

  • One thing that can help is to play around with mempool rules, like enforcing at most N pending full AA transactions per address (this is how we deal with 7702 txns atm) so we limit IL flooding (see the sketch after this list).

  • Maybe one thing to consider is how much of a spectrum we would really see between nodes. To me FOCIL nodes can:

    • Store validity-only state and either:
      • Optimistically include full AA txns in ILs
      • Decide to include full AA txns only if they come with witnesses attached, or not to include full AA txns in their ILs at all
    • Store the full state and include all txns valid against the pre-state
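
As referenced in the second bullet above, here is a minimal sketch of a per-address pending cap of the kind used for 7702 txns today (the cap value and function names are illustrative, not the actual rule):

```python
from collections import defaultdict

MAX_PENDING_PER_ADDRESS = 1     # hypothetical N, not the actual parameter

pending_count = defaultdict(int)

def admit_to_mempool(sender: str) -> bool:
    """Admit a full-AA transaction only if the sender is under its pending cap."""
    if pending_count[sender] >= MAX_PENDING_PER_ADDRESS:
        return False
    pending_count[sender] += 1
    return True

def on_included_or_dropped(sender: str) -> None:
    pending_count[sender] = max(0, pending_count[sender] - 1)
```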

This leads to two small qs:

  • In case FOCIL nodes only store validity-only state and optimistically include full AA txns, how do we deal with attackers spamming all mempools at the same time (at no cost, because these txns will be invalidated anyway)? What are examples of custom validation rules that could help with this? Maybe more of a q for @yoavw here
  • Do you think in practice we would see a lot of FOCIL nodes store validity-only state + some state related to specific applications they care about? Or is it more of a binary (validation-only or full state) thing?

Note: If we use BALs with post-values txns + nonce and balance checks, attesters can perform static validation and vote for a block with full AA txns without having to wait for the full execution (h/t Francesco and Toni) so I think everything works out for attesters in any case, which is nice.
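
A hedged sketch of the static validation idea in the note (the BAL shape, field names, and the exact IL-satisfaction condition are my own simplification, not the spec): given post-values for touched accounts, an attester can check that each IL tx was either included or made invalid by the block, without re-executing it.

```python
# `bal_post`: post-block nonce/balance for accounts the block touched, as a
# block access list with post-values might expose them (field names invented).
bal_post = {"0xaa": {"nonce": 7, "balance": 3 * 10**17}}

def il_tx_satisfied(tx: dict, included_tx_hashes: set) -> bool:
    if tx["hash"] in included_tx_hashes:
        return True                      # the IL tx made it into the block
    acct = bal_post.get(tx["sender"])
    if acct is None:
        # Sender untouched by the block, so the tx is as valid as when it
        # entered the IL; the block should have included it.
        return False
    # Otherwise the tx may be skipped only if the block made it invalid.
    nonce_mismatch = tx["nonce"] != acct["nonce"]
    cannot_pay = acct["balance"] < tx["value"] + tx["max_fee"] * tx["gas_limit"]
    return nonce_mismatch or cannot_pay
```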

1 Like

Isn’t this what Portal is doing? Except Portal gets robustness through replication rather than erasure coding. Both have trade-offs, but I think robustness through replication should be sufficiently durable, as the number of replicas would grow with the set of participating nodes and with the total storage being contributed.

Is this bullet point suggesting a different storage solution should be built rather than Portal? What are the priorities and trade-offs desired in the target solution?

This ensures the property that “a blockchain is forever” without depending on centralized providers or putting heavy burdens on node operators

^ Portal should be sufficient for these goals, which is why I ask.

This somewhat echoes what Kolby just stated.

I believe that Portal solves all if not the vast majority of the stated goals and functionality outlined in this post. If this is a priority for Ethereum, we’re already on it and have solutions ready to go.

Portal History network is already storing this information. We are generating ERA-style files for archival purposes of this data. Portal is ready as soon as EL clients are ready to drop this.

Portal History network already provides a robust solution for this. It’s live and working.

Portal State network is already on its way to going live and is designed for lightweight personal nodes that have full RPC capabilities. The ability of these clients to verify blocks improves if the protocol is willing to upgrade to include commitments to things like witnesses or state diffs.

State trie unification, like what is proposed in Verkle, also improves things.

I might be totally off topic here, but as far as I understand, the motivation of this ongoing work is to enable more localised, decentralised Ethereum network infrastructure, which is a general challenge the ecosystem faces.

What often trips me up in these EF discussions is how much abstract, mathematical thinking there is, as opposed to thinking about the physical, real-world telecommunications layer.

TCP networks do not look like “client ↔ server”. They look (very simplified) like a hierarchy: end users hang off last-mile access networks, which are aggregated by regional ISPs into backbone links.

In the initially linked articles, network throughput is put under concern, but there is no analysis of the physical topology demands for such throughput.

It’s also claimed that

However, this problem statement leaves out the detail that it is not technical difficulty that is the bottleneck. It is the financial-incentive and return-on-investment difficulty of justifying entirely tangible and feasible technical requirements. Ethereum’s hardware requirements are not TON-like, where you need a datacenter rack to launch a node.

The challenges are in (i) the financial incentive of doing so (low APY plus increased broadband internet and static IP costs), (ii) the difficulty (cost) of obtaining a static IP address from my ISP, and (iii) the lack of a competitive market for Ethereum-node-oriented hardware (a dappNode with a $2k price tag is a lol).

Further, in the local RPC-serving-nodes proposal here, the assumption is that clients may want to store partial data. While I agree with that, we should take into account how the network topology of ISPs looks today.

We are then still missing an analysis of throughput demand. Is it really uniform (i.e., transactions happen with equal probability between two randomly picked end-clients)?
I doubt it. I’d rather bet that the majority of settlements requiring high throughput actually happen between local access points (end-users) doing their settlements.

If the majority of traffic is localised, then the candidates to run such “partial” clients are ISPs, last-mile carriers, and telecom companies, who should be able to provide near-instant settlement for transactions between network clients & contracts that are a subset of their underlying client base, and then slowly settle them to the main ledger (and how does this play with the rollup-centric stuff?).

And that’s my point - they have no problem running a 4TB SSD. They already have static IP addresses (!). They have specialised hardware companies and expertise that can enter the market and create competitive hardware pricing.
In many aspects, last-mile carriers are much better candidates to reinforce network decentralisation and localisation.

The problem is that they don’t see incentives in doing this.

If I’m running a local node, there must be a way to earn a high yield for being able to settle end-user transactions quicker than others, or for relaying requested data to end users (and here partial state can be thought of as a distribution network’s caching layer).

That gets me back to the BFT-PoLoc paper I linked. If my local node can prove that it is able to do the fastest settlement between two network clients, it should be able to take that opportunity, at the risk of the stake it already has.

Then you get a win-win: the network scales like crazy, which is what users want (instant, secure, and cheap settlement), and spatial decentralisation is achieved (data centers located far from user locations naturally cannot win settlement profits from ISPs or last-mile carriers if the market is latency-based).

1 Like

In the context of EIP-4444, I suggest also adding:

  • Event logs chosen to be stored by the operator, plus an option to store them as indexed, fast-retrieval objects (potentially removing the need to host separate indexer clients)
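
As a rough illustration of what "indexed, fast-retrieval objects" could mean locally (the index shape and field names are made up; a real implementation would persist this to disk): keep a small map from (contract address, topic0) to log locations as receipts are ingested, so historical log queries don’t need a separate indexer.

```python
from collections import defaultdict

# Hypothetical local index: (contract address, topic0) -> locations of matching
# logs, so historical log queries don't need a separate indexer service.
log_index = defaultdict(list)

def ingest_receipt(block_number: int, tx_index: int, logs: list) -> None:
    for log in logs:
        topic0 = log["topics"][0] if log["topics"] else ""
        log_index[(log["address"], topic0)].append((block_number, tx_index))

def find_logs(address: str, topic0: str) -> list:
    return log_index[(address, topic0)]

# Example: index a (truncated) Transfer topic and look it up again.
ingest_receipt(19_000_000, 3, [{"address": "0xtoken", "topics": ["0xddf252ad..."]}])
assert find_logs("0xtoken", "0xddf252ad...") == [(19_000_000, 3)]
```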

“Settlement” on Ethereum is between users and block builders. For each block, there are users all over the world submitting transactions, and a single builder somewhere in the world that is building the block. Once that block is built, it needs to be propagated to every node on the network.

1 Like