Execution & Consensus Client Bootnodes

At the core of Ethereum should be a censorship-resistant design. I think we all agree on this. Thus, it’s also important to think about the unimaginable (even though we’ve already seen such censoring events in certain countries, or similar ones like GitHub with Tornado Cash). Such an extreme event would be very difficult to execute, but it remains a possibility (coordination across a couple of countries is not unusual…). That’s why I think it’s important to think about a game plan and document it properly, so that operators and participants know how to act if such extreme (or even less severe) events take place. Where could you find new uncensored bootnodes? How do you configure them? And so on.

That’s why I’m convinced that it’s so important to have globally diversified DCs for the bootnodes to circumvent supranational but still local censoring attacks. Peers can always be censored via ISPs so it’s important to have globally distributed bootnodes available to preserve the censorship-resistance core value and resilience of Ethereum as a whole.

Sounds like we need better docs and better visibility on the hosting options available. What else do you think we need to do here?

Ideally we don’t need a DC, but for bootnodes I understand why putting these on someone’s laptop in a basement is probably not the most robust long-term solution (well, unless a lot of people do that, but that’s not reasonable to assume as likely).

The most important part IMHO is to create awareness about the issue. We can’t fully resolve this challenge but we have to make the network participants conscious of this.

The problem with cloud solutions is the following: most of them are running under US law, so even if we diversify across cloud providers for the bootnodes, a single point of failure remains: the possibility of US enforcement. One option is for the EF to set up independent bootnodes in at least 20 countries around the globe, each running in a country-specific (i.e. local) DC. An additional option could be that the EF maintains an official bootnode list (including the hosting details) from which each of the clients pulls the information, and to which individuals can also be added after being carefully vetted, i.e. trying to include the broader community. We would need to think about an incentive scheme (and a slashing possibility) there, of course. Maybe others have other ideas…

Obviously it would be nice to have support, but in the absence of that I don’t see why we can’t do some of the stuff ourselves? “Ourselves” being community members, sysadmin types, devops people, Core devs who have time to comment, and folks enlisted from the various home staker and solo staker communities. It would involve some degree of cat herding, yes.

Fully agreed: how can we best create more awareness about this important discussion, @timbeiko, @MicahZoltu? Cc: @vbuterin

Hey Guys,

@randomishwalk sent this thread to me over twitter and I am jumping here as I can potentially help with distributed global infrastructure

latitude.sh, the bare metal company I operate, runs in 15 locations (9 of them outside of the US): Global regions to deploy dedicated servers and custom projects - Latitude.sh

Happy to chat more

Bootnodes are nice; they are something of a UX helper, making it easier for a brand new node to find peers.
If they are beefy enough, they can also help serve eth data: state and blocks. Because of the amount of traffic (both egress and ingress), they are also pretty costly.

Are they terribly important to the network? In my opinion, no. Geth works fine without them, since more stable peers can be found via the DNS discovery. The DNS discovery is also centralized, as in “it’s collected and published by EF”, but the information that is published is self-signed by the nodes themselves (in the form of ENR records).

Also, nodes (at least geth) remember information about peers from previous runs.

In general though: if EF-controlled bootnodes are seen as ‘critical infrastructure’, then we should remove them, because the network needs to get by without central points of failure.

If everything works as intended, I would agree. But they can become pivotal in an extreme censoring event. And my threat model tells me that in the current state of the world we should think about this possibility.

That’s exactly the point: the overall preferred solution should be a situation where we could completely remove the current kind of centralised initial trusted setup. However, since DNS discovery is also centralised (for anyone interested, see here for the list) and can be affected by an extreme censoring event, I think having globally distributed, EF-independent bootnodes serving as a last-resort rescue is the best solution.

UPDATE (14 April 2023)

TL;DR: 4 EF Azure bootnodes got removed since my original post. Now mostly dependent on AWS and Hetzner (silently screaming inside!).

Overview Execution Clients

Go-Ethereum

Nethermind

Erigon

Besu

I love that this discussion is coming back around. I am not convinced that it is as much of an issue as it used to be (because of devp2p improvements such as better DNS discovery, bootnodes aren’t even necessarily needed). At the same time, I’d love to see a decentralized way to have bootnodes that doesn’t also increase the risk of bad actors compromising the alternative bootnodes.

I co-led the DevOps team at the EF from 2016-2021 (and had a 3-year stint in the middle of that as orgsec lead after Martin Swende). I won’t explain the entire security setup for the bootnodes, for security reasons, but I would find it super unlikely that the bootnodes could be compromised via a hack (say, a hacker swaps a geth bootnode for a bad geth node that creates a split chain) or that a sustained DoS attack could succeed against the nodes (because the EF would be able to respond and mitigate any attacks, or at worst rebuild the whole thing in under half an hour, not counting sync time).

I’m definitely open to hearing solutions, but I’m not convinced adding more entities besides the EF is the right way considering the low risk and the potential to open more possibilities for exploit of the bootnodes.

My issue here is that I don’t have transparency about why, for example, the EF is able to act as you claim. Security through opacity doesn’t work well, and I understand that you can’t disclose all information for security reasons either. The current situation is like: please trust the EF that we’re doing our job properly. I’m not saying this is not the case, but the required information is (at least publicly) not available. Also, peers can always be censored via ISPs so it’s important to have globally distributed bootnodes available to preserve the censorship-resistance core value and resilience of Ethereum as a whole.

I agree with everything you are saying, but I don’t think the risk is high enough for the ones responsible for the bootnodes to act, compared to other pressing issues that would affect Ethereum more at the protocol/safety level. Note: I’m not really that deeply involved anymore, so it’s up to them; I’m just relaying how I suspect they will react.

At the same time that shouldn’t mean we ditch your ideas because they do help.

One idea: start asking individual client teams to set up bootnodes that are geographically diverse and with a common set of standards that include what you propose (geographically diverse, bare metal, etc.). It shouldn’t be the responsibility of the EF to do this entirely and adding bootnodes once the other client teams create them is as simple as a PR on each client. I think other EL client teams would be very open to this and may already have testing infra that can be converted to supporting their own bootnodes as well.

I think that’s a great idea and somewhat of a natural choice given existing devops expertise.

Yet another option, which seems to be more common in the MEV-Boost relay space, is independent infrastructure operators that are not affiliated with any client team (Agnostic and USM being two examples on the MEV-Boost relay side). Experienced folks from the ethstaker community, for example, might be a natural fit for something like this.

I think that’s a good idea, and similar to what I have been thinking since the beginning of this thread. The reason why I haven’t reached out so far is twofold:

  • I first wanted to gather various ideas in this thread and decide on the action plan,
  • Understand how such an action plan can be efficiently coordinated, since I don’t want to end up with a situation where Geth implements a couple of bootnodes, Erigon & co. don’t and just re-use the Geth bootnodes (as done currently). Any ideas on how to approach this best?

100%.

Great point. Does anyone have a direct line to @superphiz?

I think you need to explain what “extreme censoring event” means concretely. My Erigon client has a 35 MB database of previously seen peers. I find it hard to conceive of a situation where it cannot reconnect to the network using at least one peer in that file after a reboot and the network isn’t already broken anyway.

For example, a supranationally coordinated censorship attack via ISPs. Let’s say your 35MB database consists of European and US peers and the attack is launched via Europe and the US, then you have a problem. Or there is a massive DDoS attack that prevents your peers from helping you resync. I think it’s important to emphasise that we need to build a censorship-resistant infrastructure that is future-proof, which means we also need to be prepared for the unimaginable. The argument “If we are in such an extreme situation, the world has bigger problems anyway” is not satisfactory. It is not only about the current situation but about all possible future scenarios (even if they are unlikely). I’d rather spend some time thinking about an appropriate solution now than regret in a decade that we did nothing.
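
To make the scenario above a bit more concrete, here is a minimal sketch of how one could estimate how much of a stored peer database stays reachable when certain regions are cut off. Everything in it is an assumption for illustration: the peer names, the region labels, and the idea that each stored peer carries a region tag at all (clients do not necessarily store this).

```python
from typing import Dict, Set


def reachable_fraction(peer_regions: Dict[str, str], blocked: Set[str]) -> float:
    """Return the share of known peers that sit outside the blocked regions.

    peer_regions maps a (hypothetical) peer identifier to a region label.
    An empty peer database yields 0.0, since nothing is reachable.
    """
    if not peer_regions:
        return 0.0
    reachable = sum(1 for region in peer_regions.values() if region not in blocked)
    return reachable / len(peer_regions)


# Hypothetical peer database: three of four peers are in Europe or the US.
peers = {"peer-a": "EU", "peer-b": "US", "peer-c": "EU", "peer-d": "SA"}
print(reachable_fraction(peers, {"EU", "US"}))  # 0.25
```

The point is only that a large peer database is not the same as a diverse one: if every stored peer sits in the affected regions, the fraction drops to zero regardless of the file size.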

I would argue this isn’t really a problem. A large portion of the discovery process relies on EIP-1459, which is Node Discovery via DNS. What happens is that a crawler crawls through discoverable nodes on the network and, at a regular cadence, updates the records under the domain ethdisco.net. The data can then be used to find a “dynamic” list of peers via DNS records. The raw data dump can be found in the GitHub repo ethereum/discv4-dns-lists, as mentioned in some of your older posts.
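
For anyone who has not looked at what these DNS records contain, here is a rough sketch of how the EIP-1459 TXT entries can be told apart and parsed. It deliberately leaves out the DNS lookup and the signature check, and the sample values used with it are made-up placeholders, not real records from ethdisco.net. The function and class names are my own, not from any client.

```python
import re
from dataclasses import dataclass
from typing import Tuple

# Record shapes as I understand them from EIP-1459:
#   enrtree-root:v1 e=<enr-root> l=<link-root> seq=<number> sig=<signature>
#   enrtree-branch:<hash>,<hash>,...
#   enrtree://<base32 public key>@<domain>
#   enr:<base64url node record>
ROOT_RE = re.compile(r"^enrtree-root:v1 e=(\S+) l=(\S+) seq=(\d+) sig=(\S+)$")
LINK_RE = re.compile(r"^enrtree://([A-Z2-7]+)@([A-Za-z0-9.-]+)$")


@dataclass(frozen=True)
class TreeRoot:
    enr_root: str    # hash of the subtree holding node records
    link_root: str   # hash of the subtree holding links to other trees
    seq: int         # update sequence number of the tree
    signature: str   # signature over the root, NOT verified in this sketch


def parse_root(txt: str) -> TreeRoot:
    match = ROOT_RE.match(txt.strip())
    if match is None:
        raise ValueError("not an enrtree-root record")
    enr_root, link_root, seq, signature = match.groups()
    return TreeRoot(enr_root, link_root, int(seq), signature)


def parse_link(url: str) -> Tuple[str, str]:
    """Split an enrtree:// URL into (public key, domain)."""
    match = LINK_RE.match(url.strip())
    if match is None:
        raise ValueError("not an enrtree link")
    return match.group(1), match.group(2)


def classify(txt: str) -> str:
    """Name the kind of tree entry a TXT record holds."""
    if txt.startswith("enrtree-root:v1"):
        return "root"
    if txt.startswith("enrtree-branch:"):
        return "branch"
    if txt.startswith("enrtree://"):
        return "link"
    if txt.startswith("enr:"):
        return "leaf"
    return "unknown"
```

A client walks from the root through the branch entries down to the `enr:` leaves; since the root is signed by the list publisher and each leaf is a node record signed by the node itself, the DNS server only has to be reachable, not trusted.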

So in a scenario in which all nodes listed above on centralized providers are taken out, the network will still be up and will function via the DNS-based discovery. Assuming that Ethereum is actively being censored everywhere and DNS discovery isn’t enough, every user is more than welcome to add peers through the --bootnodes or similar flag present in every client; such peers can be shared publicly with end users via various forums or other methods. Assuming DNS is only being censored for specific domains such as ethdisco, users can opt to switch DNS providers or run their own recursive resolvers. More entities are also welcome to set up their DNS records in a similar manner to ethdisco; docs on the topic can be found here: DNS Discovery Setup Guide | go-ethereum
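
As a small illustration of the manual route, the sketch below checks that peer addresses collected from a forum look like well-formed enode URLs and joins them into a comma-separated `--bootnodes` argument, which is the form geth accepts. Other clients may expect a different flag name or format, so treat that part as an assumption. The helper names and the sample address are hypothetical, and a syntactic check says nothing about whether a node is honest or even online.

```python
import re
from typing import Iterable, List

# enode://<128 hex chars of node public key>@<host>:<tcp port>[?discport=<udp port>]
ENODE_RE = re.compile(
    r"^enode://[0-9a-fA-F]{128}@"
    r"(?:[^:@/\s]+|\[[0-9a-fA-F:]+\])"   # hostname, IPv4, or bracketed IPv6
    r":(\d{1,5})"
    r"(?:\?discport=(\d{1,5}))?$"
)


def is_valid_enode(url: str) -> bool:
    """Syntactic check only: shape of the URL and port ranges."""
    match = ENODE_RE.match(url)
    if match is None:
        return False
    ports = [p for p in match.groups() if p is not None]
    return all(1 <= int(p) <= 65535 for p in ports)


def bootnodes_flag(urls: Iterable[str]) -> str:
    """Build a --bootnodes argument from the valid, de-duplicated URLs."""
    valid: List[str] = []
    for url in urls:
        url = url.strip()
        if is_valid_enode(url) and url not in valid:
            valid.append(url)
    if not valid:
        raise ValueError("no valid enode URLs supplied")
    return "--bootnodes=" + ",".join(valid)


# Hypothetical address using a documentation-only IP range.
sample = "enode://" + "ab" * 64 + "@203.0.113.7:30303"
print(bootnodes_flag([sample, "not-an-enode", sample]))
```

Dropping malformed entries up front is just a convenience; the harder problem, deciding which shared addresses to trust, is exactly what the rest of this thread is about.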

Additionally, any sort of active attack on the discovery/bootnode layer will not break the network immediately. It will only break new nodes wanting to join the network and restarted nodes; the network will continue to function as expected for already peered nodes. This also implies we would have some time to react in such a scenario.

So in a scenario in which all nodes listed above on centralized providers are taken out, the network will still be up and function via the DNS based discovery.

I think our threat models are different here. I feel like any attacker capable of compromising all of the bootnodes would also be capable of compromising a single DNS address (likely as many as they want). I suppose this becomes less true as we reduce down to a single service provider (e.g., AWS), but even compromising both Hetzner and AWS seems harder than compromising DNS?

EthStaker would be happy to become a bootnode.

We are already running a public checkpoint sync endpoint and we could easily add the configuration to use that node as another bootnode.

I can be a contact point for this. Simply contact me on Discord (Remy Roy#1837) or Twitter (remy_roy) for private DMs.

Very happy to hear this! May I ask exactly how this node is currently hosted (i.e. bare metal, location, etc.)?

I think we should have a page similar to Ethereum Beacon Chain checkpoint sync endpoints for the bootnodes used by the clients, with the possibility of adding further community bootnodes (in a separate section) that folks can use via the --bootnodes or similar flag. This would require some people to maintain the GitHub repo for this page and approve or dismiss PRs. I would volunteer for that for sure, but at least 2-3 approvals should be needed for each PR that adds an additional community-based bootnode link.

I was wondering whether someone here could help me add at least one person from each of the Geth, Nethermind, Erigon, Besu, Lighthouse, Lodestar, Nimbus, Prysm and Teku teams to this thread, so that hopefully each of the EL/CL client teams could run at least one bootnode itself. Maybe @Souptacular?
