Optimizing Merkle tree multi-queries

vbuterin · February 14, 2019, 12:01pm

I think I accidentally re-invented this two days ago without seeing this thread:

ethereum/research/blob/7db6b87cf8642a8671dd9890909586912a0929c9/mimc_stark/merkle_tree.py#L37


    for p in proof[1:]:
        if index % 2:
            v = blake(p + v)
        else:
            v = blake(v + p)
        index //= 2
    assert v == root
    return int.from_bytes(proof[0], 'big') if output_as_int else proof[0]


# Make a compressed proof for multiple indices
def mk_multi_branch(tree, indices):
    # Branches we are outputting
    output = []
    # Elements in the tree we can get from the branches themselves
    calculable_indices = {}
    for i in indices:
        new_branch = mk_branch(tree, i)
        index = len(tree) // 2 + i
        calculable_indices[index] = True
        for j in range(1, len(new_branch)):
            calculable_indices[index ^ 1] = True

Your version does seem to have more compact code, though some of my version’s complexity comes from making the proof generation algorithm avoid O(N) time, which your version technically takes (the `for i in range(2**depth - 1, 0, -1):` loop and the space complexity of the `known` array).

This reduced the size of the MIMC STARK from ~210kb to ~170kb and removed the need for the ugly wrapper compression algorithm.
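To make the O(N) point concrete, here is a rough sketch (not the code from the linked file) of how the set of sibling nodes in a multi-branch proof can be found by walking up only from the queried leaves, so the work is proportional to the number of queries times the depth rather than to the tree size. The function name `multi_proof_positions` is hypothetical, and it assumes the usual layout where the root has tree index 1 and leaf `i` has tree index `2**depth + i`.

```python
def multi_proof_positions(depth, leaf_indices):
    """Return the tree indices of the sibling nodes a multi-branch proof needs.

    Assumed layout for this sketch: root at index 1, leaf i at 2**depth + i.
    Only nodes on the queried paths are touched, so no O(2**depth) loop
    or array is required.
    """
    # Nodes the verifier already knows (or can compute) at the current level
    level = {2**depth + i for i in leaf_indices}
    needed = []
    for _ in range(depth):
        # A sibling has to be supplied only if it is not itself calculable
        needed.extend(sorted(i ^ 1 for i in level if i ^ 1 not in level))
        # Move one level up: every known node yields its parent
        level = {i // 2 for i in level}
    return needed


# Four queries in a depth-3 tree: separate branches would need 4 * 3 siblings
positions = multi_proof_positions(3, [1, 4, 5, 7])
```

Because the function never looks at the tree itself, it also works for depths where a full pass over all nodes would be impractical.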

Also a possibly dumb question:

        # The merkle root has tree index 1
        if index == 1:
            return hash == root

Doesn’t this make the verification return true if the first branch is correct, regardless of whether or not subsequent branches are correct? Or is it guaranteed that the “merging” via `queue = queue[1:]` will bring the checking down to one node by the time it hits the root?
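For reference, here is a minimal sketch of how the second possibility could hold. It is not the code from this thread. It assumes the queue starts out sorted by descending tree index and that all queried leaves sit at the same depth; `blake`, `merkelize`, `mk_multi_proof` and `verify_multi_proof` are stand-in names for this sketch. Under those assumptions every node with index greater than 1 is consumed before index 1 can reach the front of the queue, so the root check only runs after all branches have been merged into one node.

```python
from hashlib import blake2s


def blake(x):
    # Stand-in hash function; an assumption for this sketch
    return blake2s(x).digest()


def merkelize(leaves):
    # tree[1] is the root, tree[n:] are the leaves (n must be a power of two)
    n = len(leaves)
    tree = [b""] * n + list(leaves)
    for i in range(n - 1, 0, -1):
        tree[i] = blake(tree[2 * i] + tree[2 * i + 1])
    return tree


def mk_multi_proof(tree, indices):
    n = len(tree) // 2
    # Queue of tree indices, highest (deepest, rightmost) first
    queue = sorted({n + i for i in indices}, reverse=True)
    proof = []
    while queue[0] != 1:
        index, queue = queue[0], queue[1:]
        if queue and queue[0] == index ^ 1:
            queue = queue[1:]              # sibling is also queried: merge
        else:
            proof.append(tree[index ^ 1])  # sibling must be supplied
        queue.append(index // 2)
    return proof


def verify_multi_proof(root, n, leaves, proof):
    # leaves: dict mapping leaf position -> leaf value
    queue = sorted(((n + i, v) for i, v in leaves.items()), reverse=True)
    proof = list(proof)
    while True:
        (index, h), queue = queue[0], queue[1:]
        if index == 1:
            # All other nodes had larger indices and were consumed earlier;
            # the extra emptiness checks are defensive
            return h == root and not queue and not proof
        if queue and queue[0][0] == index ^ 1:
            (_, sibling), queue = queue[0], queue[1:]
        elif proof:
            sibling, proof = proof[0], proof[1:]
        else:
            return False
        # Odd tree index means this node is the right child
        h = blake(sibling + h) if index % 2 else blake(h + sibling)
        queue.append((index // 2, h))


leaves = [bytes([i]) * 32 for i in range(8)]
tree = merkelize(leaves)
proof = mk_multi_proof(tree, [1, 4, 5, 7])
```

In this sketch, corrupting the leaf that is processed last (position 1 here) still makes verification fail, since its hash is folded into the single node that finally gets compared against the root.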