RSA Hash accumulator for arbitrary values


RSA accumulators can efficiently store primes, but are (as far as I know) not efficient with non-prime numbers. My goal is to store arbitrary values, just like you can do in a Merkle tree, but having a shorter proof size. This can have multiple applications, such as in zk-starks, Plasma, etc.

I believe it is possible that we can proof that multiple values are contained in the accumulator with a single 2048 bit witness.

Basic example for three values

Let a, b and c be the values we like to store.
Let \textit{h} be a secure hash function.
Let N be a 2048 bit RSA modulus with unknown factorization and g be the generator.
Let p, q and r be three distinct large primes.

The accumulator A = g^{qr \textit{h}(a)+pr \textit{h}(b)+pq \textit{h}(c)} \mod N
To proof that a is stored in the first spot we need a witness W = g^{r \textit{h}(b)+q \textit{h}(c)} \mod N
The verifier has to check that g^{qr \textit{h}(a)}W^{p} \equiv A\mod N

To proof a fake value a' is in the accumulator, the forger needs to calculate the p-root of g^{qr (\textit{h}(a)-\textit{h}(a'))+pr \textit{h}(b)+pq \textit{h}(c)}\mod N. I believe this problem to be computationally infeasible when the RSA trapdoor is unknown.

It is also possible to make a single proof that the accumulator contains multiple values. For example when we want to proof a and b are both stored in the accumulator, we need the witness W = g^{\textit{h}(c)} \mod N
The verifier has to check that g^{qr \textit{h}(a)+pr \textit{h}(b)}W^{pq} \equiv A\mod N

Hash accumulator with n primes
We can generalize this to an accumulator for n values using n primes.

Let x_{1},..,x_{n} be the n values we like to store.
Let p_{1},..,p_{n} be n distinct large primes.
We define P_S to be g to the power of the product of all the primes not contained in the set S modulo N
P_S = g^{\prod\limits_{\substack{k=1,k\not\in S}}^n p_{k}} \mod N

The accumulator A = \prod\limits_{k=1}^n P_{k}^{\textit{h}(x_k)} \mod N
To proof x_i is stored in spot i we need a witness W = \prod\limits_{k=1, k\neq i} ^n P_{i,k}^{\textit{h}(x_k)} \mod N
The verifier has to check that P_i^{\textit{h}(x_i)} W^{p_i} \equiv A \mod N

To proof that multiple values from set B are in the accumulator we need a single witness W = \prod\limits_{k=1, k\not\in B} ^n P_{B,k}^{\textit{h}(x_k)} \mod N
And the verifier has to check that \prod\limits_{k\in B} P_{k}^{\textit{h}(x_k)} W^{\prod\limits_{k\in B}p_k} \equiv A \mod N


Doesn’t this require quadratic overhead? If so, that’s a non-starter for most applications where prover time is already a bottleneck…


Since RSA accumulators can efficiently store primes, couldn’t you use this fact to efficiently store prime factorizations of non-prime numbers? Obviously, you would need a way to check the prime factorization and depending on the size of the number, this could have a lot of overhead.


why don’t just use a more advanced form of accumulator like the ajtai hash?

Randomly sample A from \mathbb{Z_p}^{n \times m} then do Ax\, mod\, p.
This is not only quasy-commutative but also quantumsafe, parallelizable and accumulations are trivially revocable by accumulating the inverse element.
Note that parameter selection for n and m are sensitive and limits your input size and security


You can build a witness in O(n) time.
Pseudo code:

function calculateWitness(pos,values) {
   var W=1;
   var P=g;
   for (var i=0;i<values.length;i++) {
      if (pos!=i) {
   return W;


Could you perhaps elaborate on how using the Ajtai hash would help in this case?


I have a similar question to @gakonst - the RSA accumulator scheme is defined using a function f : \mathbb{Z}_N \times \mathbb{N} \to \mathbb{Z}_N given by f(a, x) = a^x \pmod N; then the quasi-commutativity [1] property is f(f(a, x_1), x_2) = f(f(a, x_2), x_1).

From what I can find online [2] the Ajtai hash function is defined by g: \{0, 1\}^m \to \mathbb{Z}_q^n given by g(x) = A x \pmod p. What’s the quasi-commutativity property for g?




A burden for the verifier is that it needs to store all the primes and all P_k constants, to verify the proof.

For a light verifier, we can save on storage space, when the prover supplies both P_k and p_k. The verifier can check P_k^{p_k} \equiv P_{\emptyset} \mod N, where P_{\emptyset} = g^{\prod\limits_{k=0}^n p_{k}}. This way the light verifier only has to store P_{\emptyset}.

For a bulk proof the prover can provide P_B and all primes corresponding the the set B. The verifier can check the correctness of P_B, by checking P_B^{\prod\limits_{k\in B} p_{k}} \equiv P_{\emptyset} \mod N and calculate all needed P_k.

The disadvantage of this trick is that the verifier can only know that a value x is one of the n values in the accumulator, but not the specific spot.

For applications that do not need the specific spot, this light verifier trick can be used.

I was wondering whether zk-starks proofs (for which we can shrink the proof a lot by having a smaller merkle tree replacement) actually need the spot, or that putting the spot in the hashed data is sufficient. From what I understand of zk-stark proofs, it would not weaken the proof as long as the number of values in the accumulator is limited to n. A forger could try to put multiple values in the same spot to increase the chance to be able to give the correct answer for that spot. But now the forger increases the chance of not being able to answer when an empty spot is chosen.


Yes, I have thought of that, but it would require more primes and a larger proof.

To store a 256 bit hash value we could use 256 primes each representing one bit. Now we have to proof inclusion for on average 128 primes and exclusion for the other 128 primes. We can efficiently combine the inclusion proofs, but not the exclusion proofs, because we have a remainder the size of the product of 128 primes. For primes of 64 bit size, this results in a witness size of 2*2048 +128*64 =12288 bits.
A trick we could do is, instead of having one bit per prime, we could have 8 bits per prime, by including p^b where b is a byte. We have to prove p^b is included, but b^{n+1} is not.
Now we need only 32 primes for which we do 32 inclusion and 32 exclusion proofs. The witness size is now down to 2*2048 +32*64 =6144 bits for one value. For an extra value we need an extra 32*64=2048 bits, because we have a bigger remainder.
Unfortunately we cannot overstretch this trick, because calculating g^{p^{x}} \mod N is not trivial for large x.

My proposal is based on one inclusion proof, so is only 2048 bits for an unlimited number of values.


let X = {x_1, x_2, .... x_n} arbitrary values. then your accumulator is \Sigma_{i=1}^{n}Ax_i \,(mod \,p). Witness is trivial, same as with RSA and revocation is just acc + x_i^{-1} where the inverse is by mod \, p

Security of the scheme is proven to be equivalent to solving the Shortest Integer Solution problem on ideal lattices.



look up the properties of vector spaces :slight_smile: