Overlay method for hex -> bin tree conversion

holiman · March 24, 2020, 8:05am

I just want to add a little clarification.

Converting a hexary account-trie to snapshot-format takes around 9 hours. This is IO bound, since we need to iterate the entire trie, including all intermediate nodes.
Converting a snapshot into a hexary trie hash takes on the order of 10 minutes (estimate). I would assume a binary trie hash takes roughly the same time. This is so much faster because we only need to iterate the data in the ‘native’ key ordering of the flat db. It’s also read-only, and since the hasher receives keys in order, we can always collapse “behind us” and not have to build large memory structures.
Converting a snapshot into a full hexary/binary trie (whole tree) has not yet been tested. I would assume it’s on the order of 10 hours, mainly output-bound, since we’re writing huge amounts of data to a new database. However, we know in advance roughly how much data we’re about to write, and could thus probably optimize this step quite a lot (so the difference between a first naive implementation and an optimized one could be significant).
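
To illustrate the collapse “behind us” idea, here is a minimal sketch of hashing a path-compressed binary trie from keys that arrive in sorted order. It is not geth's hasher and not any proposed binary-trie format: the function names, the SHA-256 hash (standing in for Keccak-256), the one-byte domain prefixes and the node encoding are all assumptions made for the example. Keys are assumed to be distinct, fixed-length and sorted.

```python
import hashlib


def leaf_hash(key: bytes, value: bytes) -> bytes:
    # Assumed leaf encoding: 0x00 prefix, then key and value.
    return hashlib.sha256(b"\x00" + key + value).digest()


def node_hash(left: bytes, right: bytes) -> bytes:
    # Assumed internal-node encoding: 0x01 prefix, then both child hashes.
    return hashlib.sha256(b"\x01" + left + right).digest()


def common_prefix_bits(a: bytes, b: bytes) -> int:
    # Number of leading bits shared by two distinct, equal-length keys.
    for i, (x, y) in enumerate(zip(a, b)):
        diff = x ^ y
        if diff:
            return i * 8 + (8 - diff.bit_length())
    raise ValueError("keys must be distinct and of equal length")


def streaming_root(items) -> bytes:
    # Each stack entry is [depth, hash]: the hash of a pending subtree and
    # the bit depth at which it branches away from the entry below it.
    # Depths strictly increase towards the top, so the stack never holds
    # more than (key length in bits + 1) entries.
    stack = []
    prev_key = None
    for key, value in items:
        if prev_key is None:
            depth = -1  # sentinel for the first key
        else:
            if key <= prev_key:
                raise ValueError("keys must arrive in strictly increasing order")
            depth = common_prefix_bits(prev_key, key)
        # Subtrees that branch deeper than the new key's branch point can
        # never receive another key: fold them into their left neighbour.
        while stack and stack[-1][0] > depth:
            _, right = stack.pop()
            stack[-1][1] = node_hash(stack[-1][1], right)
        stack.append([depth, leaf_hash(key, value)])
        prev_key = key
    # End of input: fold whatever is still pending, deepest first.
    while len(stack) > 1:
        _, right = stack.pop()
        stack[-1][1] = node_hash(stack[-1][1], right)
    # The root of an empty trie is an arbitrary choice for this sketch.
    return stack[0][1] if stack else hashlib.sha256(b"").digest()
```

Because only the pending right-hand spine is kept, memory use depends on the key length and not on the number of accounts, which is why a read-only pass over the flat db in key order is enough to produce the root hash.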

At the time when we want to switch over, we should assume that geth-clients already have a snapshot database.

Update: generating the trie hash from the main account trie (no storage) took under 10 minutes (hat-tip to @gballet):

Generated trie hash from snapshot accounts=80438713 elapsed=9m14.613433638s
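
For a sense of scale, the log line above works out to roughly 145,000 accounts per second. A small sketch of that arithmetic follows; the helper `parse_go_duration` is a hypothetical name written for this example and only handles the `h`/`m`/`s` parts of a Go-style duration string.

```python
import re


def parse_go_duration(text: str) -> float:
    # Hypothetical helper: converts strings like "9m14.613433638s" to seconds.
    # Sub-second units such as "ms" or "µs" are deliberately not supported.
    match = re.fullmatch(r"(?:(\d+)h)?(?:(\d+)m)?(?:(\d+(?:\.\d+)?)s)?", text)
    if not text or match is None:
        raise ValueError(f"unsupported duration: {text!r}")
    hours, minutes, seconds = match.groups()
    return int(hours or 0) * 3600 + int(minutes or 0) * 60 + float(seconds or 0)


accounts = 80438713  # from the log line
elapsed = parse_go_duration("9m14.613433638s")
rate = accounts / elapsed
print(f"{elapsed:.1f} s elapsed, {rate:,.0f} accounts/s")
```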