Open problem: ideal vector commitment

I do not understand how the pairing works. If both key and value are 28 bits, then key * p(value + 2^{28}) will be larger than 2^{32} so you can not apply the p-function on it again.

You’re right. I was sloppy with the bounds. That’s an error. You have to adjust it. Either you’re able to calculate p(k) for higher k or you have to reduce number of different values or keys. Nevertheless, the overall concept stays the same.

In the section “Distributed accumulator updates” the paper of Boneh et al. says about updating membership witnesses: “Using the RootFactor algorithm this time can be reduced to O(n log(n)) operations or amortized O(log(n)) operations per witness.”

Why are you using the p function twice? Wouldn’t pair=p(key+value*2^{28}) make more efficient use of the available primes?

How far can we stretch the p-function such that it still is practical? Because if we want to store hashes we need at least 120 bits.

Yes. Boneh et al. 's paper looks awesome.

I was also wondering why @vbuterin in Using polynomial commitments to replace state roots
picked polynomial commitments over vector commitment? Is there any trade-off?

BTW, I guess the “ideal” in the title means meeting the 5 properties you @vbuterin list, instead of using “ideal group/ring, etc.” (from Math perspectives)?

Yes, ideal is meant as an adjective :slight_smile:

Indeed, what we’re looking for is vector commitments – it just happens that polynomial commitments, at this stage, seem to make some of the best vector commitments we know of.

The polynomial commitment primitive does have the additional advantage that it would naturally expand to data availability roots.

We recently proposed Hyperproofs to address this open problem by Vitalik.

Hyperproofs can be regarded as an algebraic generalization of Merkle trees that use a hash function based on commitments to multivariate polynomials (see [PST13e]).


  1. Aggregating Hyperproofs is 10 to 100 times faster than SNARK-friendly Merkle trees via Poseidon or Pedersen hashes
  2. Individual Hyperproofs are as large as Merkle proofs
  3. Updating a tree of Hyperproofs is competitive with updating a Poseidon-hashed Merkle tree: 2.6 millisecs / update.
  4. Hyperproofs are homomorphic.

Room for improvement / future work:

  1. Aggregated Hyperproofs are a bit large: 52 KiB
  2. Verifying an aggregated proof is rather slow: 18 seconds
  3. Verifying an individual Hyperproof is slower than a Merkle proof: 11 millisecs
  4. Trusted setup with linear-sized public parameters

Key ideas:

  • Represent a vector \textbf{a} = [a_0,\dots, a_{n-1}], where n=2^\ell, as a multilinear-extension polynomial f(x_\ell, \dots, x_1) such that for all positions i with binary expansion i_\ell, i_{\ell-1},\dots, i_1 we have f(i_\ell, \dots, i_1) = a_i.
  • Commit to this polynomial using PST (multivariate) polynomial commitments: i.e., the generalization of KZG to multivariate polynomials
    • Specifically, the vector commitment is c=g^{f(s_\ell, \dots, s_1)}, where s_\ell, \dots, s_1 are unknown trapdoors. (See paper for details on full public parameters.)
  • The proof for a_i consists of \ell PST commitments to quotient polynomials q_\ell,\dots q_1 such that f = a_i + \sum_{k = 1,\dots,\ell} q_k \cdot (x_k - i_k)
    • q_\ell is computed by dividing f by x_\ell - i_\ell
    • q_{\ell-1} is computed by dividing r_\ell by x_{\ell-1} - i_{\ell-1}, where r_\ell is the remainder from the division above
    • …and so on
    • q_1 is computed by dividing r_2 by x_1-i_1, which yields remainder r_1 = f(i_\ell, \dots, i_1) = a_i.
  • Verifying a proof involves checking f = a_i + \sum_{k = 1,\dots,\ell} q_k \cdot (x_k - i_k) holds, but doing so against the PST commitment to f and to the quotients.
    • This is done via \ell+1 pairings / bilinear maps.
    • Let c be the vector commitment to \textbf{a}
    • Let \pi_i = (w_1,\dots,w_\ell) be the proof for a_i, where w_k's are commitments to the relevant quotient polynomials q_k.
    • Then, a proof is valid iff. the following pairing equation holds:
      • e(c/g^{a_i}, g) \stackrel{?}{=} \prod_{k=1,\dots,\ell} e(w_k, g^{s_k - i_k})
  • Unfortunately, computing all n PST proofs as above is too slow, taking O(n^2) time. Instead, we observe that, if we compute proofs in the canonical order above, then proofs often “intersect” and actually determine a tree, which we call a multilinear tree (MLT). Computing this tree only takes O(n\log{n}) time.
  • To aggregate many proofs \{\pi_i\}_{i\in I}, where \pi_i = (w_{i,\ell},\dots, w_{i,1}), we prove knowledge of w_{i,k}'s that satisfy the pairing equation for each i\in I. For this, we use the recent generalization of Bulletproofs to pairing equations by Bunz et al. [BMM+19]
  • Our MLT is homomorphic: two trees for two vectors \textbf{a} and \textbf{b} can be homomorphically combined into a tree for their sum \textbf{a}+\textbf{b}.
    • This leads to very straightforward proof updates in the MLT, after the vector changes
    • It also has other applications, such as providing unstealability of proofs, a property which can be used to incentive proof computation.