One concept that I think will probably apply well across any number of different methods for enumerating the state is how to package up and transmit the data for that state.
This approach depends on the witness spec.
For a given key prefix, we need a standard way to package up the data under that prefix. We assume we know nothing about how much data is under the prefix, which means this mechanism needs to allow for chunking of the data. We also need to be sure that each chunk of the data is individually provable.
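As a concrete (and entirely illustrative) sketch of what "individually provable" could mean: if nodes are delivered parent-before-child, each chunk can be verified the moment it arrives, because every node in it must be referenced either by the state root or by a node from an earlier chunk. The `TrieNode` shape, `verify_chunk`, and the use of sha256 below are stand-ins for RLP-encoded nodes hashed with keccak-256, not anything from the witness spec:

```python
import hashlib
from dataclasses import dataclass
from typing import List, Set, Tuple

@dataclass(frozen=True)
class TrieNode:
    payload: bytes               # the node's own data
    children: Tuple[bytes, ...]  # hashes of the nodes it references

    def node_hash(self) -> bytes:
        # Stand-in for keccak-256 over the RLP-encoded node.
        return hashlib.sha256(self.payload + b"".join(self.children)).digest()

def verify_chunk(known_hashes: Set[bytes], chunk: List[TrieNode]) -> None:
    """Verify one chunk the moment it arrives.

    `known_hashes` starts as {state_root}; every node must be referenced
    by the root or by a node from an earlier chunk, so no chunk has to
    wait for the rest of the witness before it can be checked.
    """
    for node in chunk:
        if node.node_hash() not in known_hashes:
            raise ValueError("node is not anchored to the state root")
        known_hashes.update(node.children)
```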
My current thinking is that it would work roughly like this: for any tree of data that a client wishes to transmit to a peer, we define a rough maximum number of trie nodes that the witness can contain, similar to the `GetNodeData` cap of 384 trie nodes. This provides an out-of-protocol way to ensure that the packets don’t get too large to transmit. So for a given witness we would have the following values:
- `state_root`: the state root with which this data is associated
- `prefix`: the path into the tree that the witness is for
- `proof`: the actual witness
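As a rough illustration, the message carrying those values might be no more than a simple container; the field types here are my assumptions, not anything pinned down by the witness spec:

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Witness:
    state_root: bytes  # 32-byte root this data is anchored to
    prefix: bytes      # packed nibble path into the trie the witness covers
    proof: List[bytes] # the trie nodes that make up the witness
```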
I’m currently not sure that the `prefix`/`state_root` differentiation is necessary, but it doesn’t really matter that much at this point. We do, however, need to ensure we have a way to anchor the data being transmitted to a state root.
So a “seeder” who is transmitting data to a peer would build the witness and then chunk it into some set of pieces, which would then be transmitted to peers in order. It also seems that we’d want some way of linking all of the chunks for a witness into the same bucket, so that a peer knows how many total chunks there are, can gauge their progress, and can definitively know when they are done syncing a given section of the trie.
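A sketch of what those chunk packets might look like, with a hypothetical `witness_id` (e.g. a hash over `state_root` and `prefix`) to bucket the chunks together, and an explicit `total` so a peer can gauge progress. The names and the reuse of the 384-node cap are illustrative:

```python
import hashlib
from dataclasses import dataclass
from typing import List

MAX_NODES_PER_CHUNK = 384  # mirrors the GetNodeData cap mentioned above

@dataclass
class WitnessChunk:
    witness_id: bytes   # e.g. hash(state_root || prefix): buckets the chunks
    index: int          # this chunk's position within the witness
    total: int          # total chunk count, so a peer can gauge progress
    nodes: List[bytes]  # up to MAX_NODES_PER_CHUNK trie nodes

def make_chunks(state_root: bytes, prefix: bytes,
                nodes: List[bytes]) -> List[WitnessChunk]:
    # Greedily split the witness's node list into fixed-size pieces, in order.
    witness_id = hashlib.sha256(state_root + prefix).digest()
    pieces = [nodes[i:i + MAX_NODES_PER_CHUNK]
              for i in range(0, len(nodes), MAX_NODES_PER_CHUNK)]
    return [WitnessChunk(witness_id, i, len(pieces), piece)
            for i, piece in enumerate(pieces)]
```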
Another thing to think about under this model is whether we can define an algorithm for chunking such that two seeders transmitting data about the same `prefix` under the same `state_root` always produce the same chunk packets.
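This seems achievable whenever both the node ordering and the split are pure functions of the trie contents. For instance, assuming seeders walk the trie depth-first with children visited in nibble order and then apply the greedy fixed-size split from the sketch above, two seeders with the same `state_root`, `prefix`, and node cap would emit identical chunk packets. The `get_node` lookup here is hypothetical:

```python
def canonical_order(get_node, root_hash: bytes) -> list:
    """Deterministic depth-first, lowest-nibble-first node ordering.

    `get_node` maps a node hash to a TrieNode (as sketched earlier).
    Because the order depends only on the trie contents, chunking this
    list greedily yields the same chunk packets on every seeder.
    """
    order, stack = [], [root_hash]
    while stack:
        node = get_node(stack.pop())
        order.append(node)
        # Push children reversed so the lowest nibble is popped first.
        stack.extend(reversed(node.children))
    return order
```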