Accumulators, scalability of UTXO blockchains, and data availability

I didn’t get an email notification for your post. It probably wasn’t successfully posted (as opposed to getting deleted).

I’m a little bit confused by the trustless setup you are proposing. Would you mind editing your post to add details and cleanup the exposition? (I think there are several typos that makes things hard for me to follow.) Below are specific questions.

Is this actually a trustless setup, or are you proposing an N-party trusted setup?

What do you mean by “first primes”? Also, do you really mean M'=2^\frac{n}{2-1}=2^n?

Would you mind pointing to a specific construction/paper? I’m new to All-or-Nothing Disclosure of Secrets protocols.

Is this the same f as defined above, i.e. f(i_a, i_b) = (p_a + p_b) \times (q_a + q_b)?

What is \beta?

What is the B&F division test?

sorry I didn’t mean to accuse anyone most probably a mistake then;

no, it was a mistake:
let M be the smallest product of first primes greater than 2^n and as an efficiency iprovement we can assume M' \approx 2^{n/2-1}

In general:
A specific construction:

no this one is for the ANDOS selling protocol.

see paper above, it’s the “merchandise” you obtained without letting the other party know what you choose.

Above paper has the test in section 3.
Note: The protocol above involves a thirdparty overall, but not for the primary test.

For all these accumulators, every UTXO needs to update it’s witness or proof when a new block is added. This means that every coin holder needs to keep track of the blockchain always, which is not practical at all.

Is there any good solution to this issue? One way I see is that miners could provide witness service for normal users, but it’s not an ideal solution.

1 Like

Is there an implementation available for the RSA accumulators described here?

An interesting possibility for RSA accumulators, or even bilinear accumulators, is to make the proof for STARKs much smaller by reducing the size and complexity of the proof for each item from \theta(\log n) to \theta(1). It doesn’t affect the number of proofs required, only the size of the proofs.

In this case we don’t necessarily care about the trusted third party problem of if the accumulator is updated and referenced by multiple people, all that is necessary is to prove that specific members exist within a set in a way which satisfies the computationally binding difficulty of modifying the set to meet a malicious goal, forging one proof is as difficult as forging all proofs or finding a collision in the underlying hashing algorithm.

e.g. with a Merkle tree, the entropy that selects which items from the computation trace that need to be verified are taken from the tree root, if any one item is changed the set of randomly chosen items will change. The proof of each item must be consistent with every other item, that is - their paths share common hashes and the modification of any one item requires the re-computation of the whole tree, which then changes the items which are selected (this is like, intractable computational difficulty).

Trusted setup doesn’t matter one person is providing a set of their own data to be verified by another person, if there were a Merkle tree which required the creator to be trusted to setup the parameters it wouldn’t affect the computational binding property of pseudo-random item selection.

So, the accumulator and witnesses, for a set of items S, is:

N = P * Q
A = 1
W = []
for _ in S:
    W = [pow(w, _, N) for w in W] + [A]
    A = pow(A, _, N)

Note that for every subsequent item added all previous witnesses need updating so that for any given (Witness,Value) pair you can do assert pow(Witness, Value, N) == A. This makes complexity of creating the accumulator about n \log n, where only 2n items are stored, the computational overhead is greater than that of a Merkle tree, but the memory overhead is the same (or possibly greater, given the larger size of the witness compared to the size of each hash)?

Then to select the items:

X = [] # Deterministically chosen samples of computation trace polynomial
B = A
for _ in range(0, n_samples):
    B = HASH(B)
    i = B % N
    X.append( (S[i], W[i]) )

return (A, N, X)   # Accumulator, modulo and random samples

Is there a mechanism where, if you’re creating an accumulator from a fixed sized batch of items, to reduce the computational cost of updating all previous witnesses at each step, without increasing the memory required (e.g. you can use a cheaper multiply operation, then a single POW at the end, but the aggregate cost is about the same)

Ah… Hold on…

What this doesn’t do is guarantee that any item is the randomly selected item within the set, it has no ordering, without guaranteed ordering there is no security as you can just provide any items you want as proof, as long as the set contains them.

To add ordering you’d require another level of structure, and this… I’m not sure how to achieve.

For accumulators to be more usable they would need strong ordering, with verifiability of ordering using only the witnesses. This would give them similar qualities to a Merkle tree, but is that possible with less data than Merkle tree proofs e.g. (\log n) * m * 32 where m is the number of paths, versus (\log_2N) * m where N is the modulus, assuming only \log_2N bits are required for each witness.

1 Like

This post RSA Accumulators for Plasma Cash history reduction on RSA Accumulators describes that we can build both proofs of membership and proof of non-membership. But the stored elements have to be primes or powers of a prime to avoid interaction.
I think that with this we could store an arbitrary number of ordered values (also non-primes) by adding p_1^{x_1} , ... p_n^{x_n}, with p_1, … p_n being the first n primes and x_1, … x_n being the values to store.
By proving membership of p_i^{x_i} and non-membership of p_i^{x_i+1} we show that x_i is on the i-th place in the set.


I’m a bit late on finding this thread, but have you considered using the new RSA batching techniques to handle this problem? @gakonst has a good simplified writeup on it.

The idea is to replace the need for making batched update data available with a zero-knowledge proof that shows that a sender of a transaction knows the discrete log of their batched exponent, without needing to send the exponent of course.

@vbuterin suggested an n-party setup where each party provides a prime to create a modulus N = p_1 * p_2 * ... * p_n. I thought it would be better to have each party provide a large semiprime s_i = p_{i1} * p_{i2} as its own modulus, and use each modulus as its own accumulator, running in parallel. Here’s my reasoning:


  • No multiparty computation ceremony is needed to multiply the numbers without anyone learning them.

  • Proofs can be verified in parallel.

  • Furthermore, updating an accumulator/verifying a witness requires mod N exponentiation, which requires us to multiply N-bit numbers O(\log a) times (using the repeated squaring method). Using the Karatsuba algorithm, this takes O(\log a)(2048n)^{1.58} time for a product of n 2048 bit primes, but only O(\log a) \cdot n (4096)^{1.58} for n products of pairs of 2048-bit primes, so the latter method is faster by a factor of (n^{0.58})/2^{1.58}. The speedup is (n^{1.58})/2^{1.58} if the multiplications are parallel, and potentially more if the decrease in size of the multiplicands allows us to take advantage of hardware speedups for multiplication.

  • Since the semiprimes are just standard RSA-2048 keys, they don’t have to be created specifically for the purpose of this accumulator. They can come from a variety of sources. For example:

    • we can use famous semiprimes like those from the RSA factoring challenge.

    • We can use expired public keys from defunct organizations as suggested in Bunz’s talk

    • We can generate several large random “nothing-up-my-sleeve” numbers and hope they have large prime factors. As Theorem 1 of the paper on page 262 of this document linked by @JustinDrake proves, a 6144-bit (6KiB) random number has a 0.5 * (\ln^2(3/2) + O(1/(6144\ln2))) \approx 0.08 probability of having at least 2 >2048 bit prime factors (not sure, but it may be possible to improve this guarantee by squeezing). As Justin says, this paper sees the benefit of choosing multiple small numbers over one big number, but as far as I can tell, it misses the fact that it is not necessary to multiply these to get an accumulator. Indeed, efficiency was the main problem with this scheme, so getting the (n^{0.58})/2^{1.58} speedup may be important for this approach.

    • Trusted individuals in the cryptocurrency community can supply public keys they generate.

  • (Minor point) The single prime way is insecure against n-1 parties being dishonest, since n-1 parties can factor their primes out of the modulus to find the last prime, the semiprime setup is secure unless all n parties are dishonest.


  • Obviously, the size of the accumulator and witnesses is doubled.

Are there any problems with this scheme I’ve missed?