Torrents and EIP-4444
Introduction
EIP-4444 aims to limit the historical data that Ethereum nodes need to store. This EIP poses two main problems that require solutions: a format for archiving history, and a method to reliably retrieve that history. The client teams have agreed on a common era file format, which solves the first half of the problem. The second half, i.e. a method to reliably retrieve history, will likely not rely on a single solution. Some client teams may rely on the Portal network, some on torrents, and others on some form of snapshot storage.
Torrents for EIP-4444
Torrents offer a unique way to distribute this history. As a technology, torrents have existed since 2001 and have withstood the test of time. Some client teams, such as Erigon, already include a method to sync via torrents that has run in production systems.
To make progress on the torrent approach to history retrieval, the files themselves are needed first, so an era file export was made on a geth node running version v1.14.3. To explore the initial idea, pre-merge data was chosen as the target. The merge occurred at block height 15537393, meaning all pre-merge data could be archived by choosing a range of 0 to block 15537393. The era files were then created using the command: geth --datadir=/data export-history /data/erafiles 0 15537393
Once the era files were created, they were verified using the command era verify roots.txt, with the roots.txt file taken from this source. The entire process has been outlined in this PR comment. The verification produced this log message: Verifying Era1 files verified=1896, elapsed=5h21m49.184s
The output era files were then uploaded to a server and a torrent was created using the mktorrent tool. An up-to-date list of trackers was taken from the GitHub repo trackerslist. The trackers chosen were a mix of http/https/udp in order to allow for maximal compatibility. The chunk size of the torrent was set to 64MB, which was the maximum allowed and the recommended value for a torrent of this size.
The result of this process is a torrent of size 427GB. It can be imported with this magnet link, and a torrent client can then pull the entire pre-merge history as era files.
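As a rough sanity check on the 64MB chunk size, the number of pieces and the size of the piece-hash table inside the .torrent file can be estimated. The sketch below assumes that 427GB means 427 × 10^9 bytes, that 64MB means 2^26 bytes, and that each piece contributes one 20-byte SHA-1 digest as in BitTorrent v1. These unit interpretations are assumptions, not figures taken from the export.

```python
SHA1_DIGEST_BYTES = 20  # BitTorrent v1 stores one SHA-1 digest per piece


def piece_stats(total_bytes: int, piece_bytes: int) -> tuple[int, int]:
    """Return (piece count, bytes of piece hashes) for a v1 torrent."""
    # Integer ceiling division: a trailing partial piece still counts.
    pieces = -(-total_bytes // piece_bytes)
    return pieces, pieces * SHA1_DIGEST_BYTES


if __name__ == "__main__":
    total = 427 * 10**9       # assumed reading of "427GB"
    piece = 64 * 2**20        # assumed reading of "64MB"
    print(piece_stats(total, piece))
```

Under these assumptions the torrent has a few thousand pieces, so the metadata stays small even for a data set of this size; smaller pieces would grow the hash table proportionally.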
Tradeoffs
There are of course some tradeoffs with torrents, as with many of the other EIP-4444 approaches:
- Torrents rely on a robust set of peers to share the data; however, there is no way to incentivise or ensure that peers actually serve it
- A torrent client would need to be included in client releases, and some client languages might not have a torrent library
- Torrents de facto expect nodes to also seed the content they leech, which would increase network requirements for nodes that choose to store history
- The JSON-RPC layer needs to account for the case where the user decides not to download pre-merge data, so the node may not have the data needed to return a response
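The last point can be sketched as a handler that returns an explicit JSON-RPC error, rather than an empty result, when a request targets a pre-merge block the node never downloaded. The handler name, the has_premerge_data flag, the lookup callback, the error code and the message are all illustrative assumptions, not any client's actual API.

```python
LAST_PREMERGE_BLOCK = 15537393  # end of the archived range described above

# Hypothetical error code. JSON-RPC 2.0 reserves -32000..-32099 for
# implementation-defined server errors.
HISTORY_UNAVAILABLE = -32000


def get_block_by_number(request_id, block_number, has_premerge_data, lookup):
    """Answer a block request, or report that the history was not downloaded."""
    if block_number <= LAST_PREMERGE_BLOCK and not has_premerge_data:
        return {
            "jsonrpc": "2.0",
            "id": request_id,
            "error": {
                "code": HISTORY_UNAVAILABLE,
                "message": "pre-merge history not available on this node",
            },
        }
    # Data is expected to be present locally, so defer to the normal lookup.
    return {"jsonrpc": "2.0", "id": request_id, "result": lookup(block_number)}
```

Returning a distinct error lets callers tell "this node chose not to store the block" apart from "this block does not exist".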
Conclusion
A client could potentially include this torrent in its releases and avoid syncing pre-merge data by default, which could then be fetched via torrent if a user requests it (perhaps with a flag similar to --preMergeData=True). The client could also hardcode the hash of the expected data, ensuring that the data retrieved matches what it expects.
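A minimal sketch of such a hash check, assuming the client ships a table of expected SHA-256 digests keyed by era file name. The table and the function names are hypothetical; a real client might instead pin the torrent infohash or the roots consumed by era verify.

```python
import hashlib
from pathlib import Path


def sha256_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large era files need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_mismatches(directory: Path, expected: dict[str, str]) -> list[str]:
    """Return names of files that are missing or whose digest differs."""
    bad = []
    for name, want in sorted(expected.items()):
        path = directory / name
        if not path.is_file() or sha256_file(path) != want:
            bad.append(name)
    return bad
```

An empty list from find_mismatches means every file named in the hardcoded table is present and matches; anything else should cause the client to reject the downloaded data.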
Instructions for re-creating torrent:
- Sync a geth node using the latest release
- Stop the geth node and run geth --datadir=/data export-history /data/erafiles 0 15537393 to export the data to a folder called /data/erafiles (Warning: this will use ~427GB of additional space)
- Use the mktorrent tool or the rutorrent GUI to create a torrent. Choose the /data/erafiles/ folder as the source for the data. Next, obtain the latest open trackers from this github repository. Choose a healthy mix of udp/http/https trackers and set the chunk size of the torrent to 64MB.
- The tool should output a .torrent file; the GUI will also allow you to copy a magnet link if that is required
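The torrent creation step could be scripted along these lines. The sketch only builds a mktorrent command line. It assumes that mktorrent's -l option takes the piece length as a power-of-two exponent, that -a adds an announce URL and that -o names the output file; check these against the help output of the installed version. The tracker URLs and the output file name are placeholders, not the trackers used for the published torrent.

```python
def mktorrent_args(source_dir, trackers, piece_bytes, output):
    """Build the argument list for a mktorrent invocation."""
    exponent = piece_bytes.bit_length() - 1
    if piece_bytes <= 0 or 1 << exponent != piece_bytes:
        raise ValueError("piece size must be a power of two")
    args = ["mktorrent", "-l", str(exponent), "-o", output]
    for url in trackers:
        args += ["-a", url]  # one -a per tracker URL
    return args + [source_dir]


if __name__ == "__main__":
    cmd = mktorrent_args(
        "/data/erafiles",
        [
            "udp://tracker.example:1337/announce",  # placeholder tracker
            "https://tracker.example/announce",     # placeholder tracker
        ],
        64 * 2**20,          # 64MB pieces, i.e. exponent 26
        "erafiles.torrent",  # hypothetical output name
    )
    print(" ".join(cmd))
```

The resulting list can be passed to subprocess.run once the trackers have been replaced with real ones.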
Instructions for download and verification of torrent data:
- Download the torrent data using this magnet link in a torrent client of your choice: link
- Clone the latest release of geth and install the dependencies
- Run make all in the geth repository to build the era binary
- Fetch the roots.txt file with the command: wget https://gist.githubusercontent.com/lightclient/528b95ffe434ac7dcbca57bff6dd5bd1/raw/fd660cfedb65cd8f133b510c442287dc8a71660f/roots.txt
- Run era verify roots.txt in the folder to verify the integrity of the data