The Fusaka fork will introduce 1D, column-based data availability sampling, where each sample is a column made up of cells and proofs from all blobs. After Fusaka, improving the propagation of columns is one of the main avenues to further increase blob throughput, and the tools proposed to do so have so far been a move from push-based gossip to an announce-and-pull/push hybrid, and the introduction of some form of coding, such as RS or RLNC. In this post, we explore various ways to do the latter, while also making individual cells the basic unit of propagation. Thanks to @asn, @dankrad, @potuz, @MarcoPolo, @vbuterin for discussion and feedback.
Goals
1. Efficient propagation. Apply some form of erasure coding to column propagation, to unlock a better curve in the latency-bandwidth tradeoff space.
2. Cell-centric gossip. Make cells the unit of propagation:
   - lets nodes use mempool blobs to contribute to propagation even when they only have a (non-contiguous) subset of them (e.g., even when pulling a single blob)
   - possibly lets us eventually move to 1 message = 1 network packet. For example, with 256 columns, 1 cell = 1 KB and fits into one packet even with its KZG proof.
Note that goal 2 is a fairly strong constraint. If we just wanted to improve the CL propagation path, without concerning ourselves with the interaction with the EL mempool and distributed blob publishing, this would simplify things significantly for some of the designs, because we could work with fewer, larger chunks. However, the following exploration takes this constraint as a given.
In the rest of the doc, we’ll by default assume 256 columns and 128 blobs total, unless otherwise specified.
Overview of Approaches
The designs we explore primarily differ along two axes:
- Encoding Scheme: How redundancy is introduced. We always take a cell to be the fundamental unit of the encoding scheme, and consider two possibilities, both of which consist of a linear combination of the cells in a column:
  - Reed-Solomon (RS) coding: applying standard erasure coding vertically to columns (similar to how it’s already used horizontally for rows in EIP-7594).
  - Random Linear Network Coding (RLNC): random linear combinations of the original cells in a column.
- Gossip Verification Mechanism: How nodes ensure the integrity and authenticity of the chunks they receive and forward during gossip, before reconstruction. For example, we could break a `DataColumnSidecar` up into chunks and encode it, but it would only be verifiable once the whole message is reconstructed, whereas we need each chunk to be independently verifiable. We ensure verifiability in one of the following ways:
  - KZG Proof Verification: Each cell, including the coded ones, carries its KZG proof, which is verified during gossip. KZG proofs for coded cells are also linear combinations of the original proofs, verified against linear combinations of the original commitments, taking advantage of the linearity of KZG.
  - Cell root verification: The block commits to all cell roots, for both the original and coded ones. Lists of cell roots for each column are distributed in the respective subnets, and verified against the commitment in the block. Gossip verification involves a simple check against the relevant list. This approach does not allow recoding during propagation, because we can only verify the cells whose roots were committed to and distributed.
  - Pedersen: The proposer uses Pedersen vector commitments to commit to the base cells, and linear combinations are verified homomorphically against these. It’s the approach from this post, adapted to column propagation.
RS
At a high level, we vertically extend the columns through RS encoding (the same used for the horizontal extension), and gossip individual cells. We consider two ways of doing so, which differ in the way they ensure integrity of the transmitted chunks:
- RS + KZG: cells come with KZG proofs, which are verified before forwarding
- RS + cell roots: the publisher commits to the cells in a simple way, e.g., sending lists of cell roots in the relevant subnets, and committing to these in the block. Forwarding just requires checking against these roots, not proof verification.
The second option has much lower computational cost for the publisher, which does not need to do the expensive computation of extension proofs.
RS + KZG
Each `DataColumnSidecar` subnet is replaced by a `CellSidecar` subnet, where all cells corresponding to the vertically extended column are gossiped. In addition, we have a single, global `HeaderSidecar` topic, to gossip the signed header with all KZG commitments (and an inclusion proof). The `HeaderSidecar` topic would have a high-degree mesh, e.g. 30, to ensure that it propagates as fast as possible. We can accept a high duplication factor here, because the message is only a few KBs (most of the size is due to the `kzg_commitments`, ~6 KBs for 128 blobs).
```python
class CellSidecar(Container):
    cell: Cell
    row_index: RowIndex
    column_index: ColumnIndex
    kzg_proof: KZGProof

class HeaderSidecar(Container):
    signed_block_header: SignedBeaconBlockHeader
    kzg_commitments: List[KZGCommitment, MAX_BLOB_COMMITMENTS_PER_BLOCK]
    kzg_commitments_inclusion_proof: Vector[Bytes32, KZG_COMMITMENTS_INCLUSION_PROOF_DEPTH]
```
CellSidecar topic (subnet)
`cell_sidecar_{subnet_id}`
- Reject if `subnet_id` does not match `compute_subnet_for_cell_sidecar(column_index)`
- `CellSidecar` verification is put on hold until a `BeaconBlock` or a `HeaderSidecar` has been received.
- Once a `kzg_commitment` for `row_index` has been obtained, reject if KZG proof verification fails
While waiting for a matching `BeaconBlock` or `HeaderSidecar`, neither message delivery nor forwarding can happen. Therefore, whenever a peer sends us a cell that we cannot yet verify, we request a `HeaderSidecar` from them (these are quite small, and we can cap the number of parallel requests we send out). We could also downscore a peer that doesn’t respond to this request after a timeout, to punish sending unverifiable cells.
Once a `HeaderSidecar` has been received, all KZG proof verifications can be scheduled flexibly, to maximize the benefits of batched verification.
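To illustrate, here is a minimal sketch of such deferred, batched verification, assuming a `verify_cell_kzg_proof_batch` helper along the lines of the batch verifier in the EIP-7594 KZG spec (names and signatures here are illustrative, not a definitive implementation):

```python
from collections import defaultdict

class PendingCellQueue:
    """Buffers cells that arrive before their HeaderSidecar, then batch-verifies."""

    def __init__(self):
        self.pending = defaultdict(list)  # block_root -> list of CellSidecar

    def on_cell_sidecar(self, block_root, sidecar):
        # cannot verify yet: no commitments known for this block
        self.pending[block_root].append(sidecar)

    def on_header_sidecar(self, block_root, kzg_commitments) -> bool:
        # verify all buffered cells in one batched pairing check
        sidecars = self.pending.pop(block_root, [])
        if not sidecars:
            return True
        return verify_cell_kzg_proof_batch(
            [kzg_commitments[s.row_index] for s in sidecars],  # one commitment per cell
            [s.column_index for s in sidecars],                # cell indices
            [s.cell for s in sidecars],
            [s.kzg_proof for s in sidecars],
        )
```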
HeaderSidecar topic (global)
`header_sidecar`
- Ignore if it is from a future slot, or if it is not from a slot greater than the latest finalized slot
- Reject if the inclusion proof is not valid
- Reject if the header’s signature is not valid
- Reject if the current finalized checkpoint is not an ancestor of `block_header.parent_root`
- Ignore if it is not the first sidecar for the tuple `(block_header.slot, block_header.proposer_index, sidecar.index)` with valid header signature and sidecar inclusion proof
- Reject if the proposer is not the expected one for `block_header.slot` in the context of the current shuffling, defined by `block_header.parent_root`.
KZG commitments for extension cells
The `HeaderSidecar` only contains the KZG commitments for the original cells. Verifying the extension cells requires us to interpolate the original commitments, i.e., compute C' = M_{RS} \cdot C, where C = (C_1, \dots, C_{128}) is the vector of original commitments and M_{RS} is the RS matrix. This requires 128 (one per extension commitment) 128-point MSMs in G1, around ~130ms multi-threaded (~1ms per 128-point MSM on zka.lc). Note that the computation only has to be done once per node, when first receiving the `HeaderSidecar`.
We could in principle avoid repeating the computation at every node: `kzg_commitments` in the `HeaderSidecar` could contain all commitments, including the extension ones. These would need to be verified for consistency with the original ones, but that can be done faster than recomputation, by choosing a random scalar vector r and checking r^T C' = (r^T M_{RS}) \cdot C. This only requires two 128-point MSMs in G1, ~2ms. The tradeoff is of course with bandwidth, as this would mean another ~6 KBs in the `HeaderSidecar`.
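A sketch of this check, assuming a hypothetical `g1_msm(points, scalars)` helper for G1 multi-scalar multiplication, `M_RS` given as a 128×128 matrix of field elements, and `modulus` the scalar field modulus (all names illustrative):

```python
import secrets

def verify_extension_commitments(C, C_ext, M_RS, modulus) -> bool:
    """Check C_ext == M_RS * C probabilistically, with two MSMs instead of 128."""
    n = len(C)  # 128 original commitments
    r = [secrets.randbelow(modulus) for _ in range(n)]
    # left side: r^T C', one n-point MSM over the claimed extension commitments
    lhs = g1_msm(C_ext, r)
    # r^T M_RS, computed with cheap scalar-field arithmetic only
    r_M = [sum(r[k] * M_RS[k][i] for k in range(n)) % modulus for i in range(n)]
    # right side: (r^T M_RS) * C, one n-point MSM over the original commitments
    rhs = g1_msm(C, r_M)
    return lhs == rhs
```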
Computational cost
The most expensive operation is the interpolation of the original proofs to compute the extension proofs. For each column, this is the same exact operation we discussed for the extension of KZG commitments, i.e., 128 (one per proof) 128-point MSMs in G1, to compute \pi_j' = M_{RS} \cdot \pi_j, where \pi_j = (\pi_{1j}, \dots, \pi_{128j}) is the vector of original proofs for column j. The proposer has to do this for every column, in the critical path; in other words, it has to do one 128-point MSM per proof. This seems prohibitively expensive, since it has to compute 128 \times 256 = 32768 proofs. Using the numbers from zka.lc, it would be ~34s already multi-threaded. A further ~3x speedup could come from replacing multiplication by M_{RS} with 2 FFTs, but this still leaves us far from an acceptable cost. Note moreover that, even if we abandoned the idea of fitting a `CellSidecar` in a single packet, we can’t go any lower than 128 columns, which would only halve the cost.
Removing such a high computational cost at the publishing step is precisely why proof computation for the original blobs has been moved out of the critical path by outsourcing it to tx senders. On the other hand, proof computation for the vertical extension can neither be outsourced nor precomputed as we get blobs from the mempool, because we need all blobs that will go in the block to do it.
RS + cell roots
Let’s consider a different approach, which avoids the high cost of computing the extension proofs by using a much simpler way to ensure the integrity of the cells. The erasure coding is limited to the cells, and accordingly gossip validation does not require KZG proof verification (not even for the original cells). Instead, it relies on checking against a list of `cell_roots` that is sent separately in the same topic, as part of a `ColumnSidecar`. In turn, this is verified against a list of column sidecar roots, included in the `HeaderSidecar` and committed to in the block.
```python
# Column topic

# Without a kzg proof
class CellSidecar(Container):
    cell: Cell
    row_index: RowIndex
    column_index: ColumnIndex

# New sidecar with commitments to all cells and carrying the proofs
class ColumnSidecar(Container):
    column_index: ColumnIndex
    cell_roots: List[Root, 2 * MAX_BLOB_COMMITMENTS_PER_BLOCK]
    kzg_proofs: List[KZGProof, MAX_BLOB_COMMITMENTS_PER_BLOCK]

# Global topics

# Adding a commitment to the column sidecars
class HeaderSidecar(Container):
    signed_block_header: SignedBeaconBlockHeader
    column_sidecar_roots: List[Root, NUMBER_OF_COLUMNS]
    column_sidecar_roots_inclusion_proof: Vector[Bytes32, COLUMN_SIDECAR_ROOTS_INCLUSION_PROOF_DEPTH]
    kzg_commitments: List[KZGCommitment, MAX_BLOB_COMMITMENTS_PER_BLOCK]
    kzg_commitments_inclusion_proof: Vector[Bytes32, KZG_COMMITMENTS_INCLUSION_PROOF_DEPTH]

class BeaconBlockBody(Container):
    ...
    # need a better name
    column_sidecar_roots_root: Root
```
As before, we propagate a `HeaderSidecar` in a global topic. In addition to its previous content, it now contains a list of roots, `column_sidecar_roots`, plus an inclusion proof `column_sidecar_roots_inclusion_proof` against `header.body_root`. The `BeaconBlockBody` itself only contains a commitment to the `column_sidecar_roots`, i.e., `column_sidecar_roots_root = hash_tree_root(column_sidecar_roots)`.
In the subnet for a given column, we propagate:
- `column_sidecar`, containing `cell_roots`, committing to all cells for the given column (including the extension cells), and moreover containing all `kzg_proofs`. For gossip validation, we check `hash_tree_root(column_sidecar) == header_sidecar.column_sidecar_roots[column_sidecar.column_index]`. Note that the `kzg_proofs` are not verified as part of gossip validation.
- `cell_sidecar`s corresponding to this column. For gossip validation, we check that `column_sidecar.cell_roots[cell_sidecar.row_index] == hash_tree_root(cell_sidecar.cell)`.
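Putting the two checks together, gossip validation in this scheme reduces to hashing. A minimal sketch, using the containers above and the SSZ `hash_tree_root` function:

```python
def validate_column_sidecar(column_sidecar, header_sidecar) -> bool:
    # the column sidecar must match the root committed to in the HeaderSidecar
    expected = header_sidecar.column_sidecar_roots[column_sidecar.column_index]
    return hash_tree_root(column_sidecar) == expected

def validate_cell_sidecar(cell_sidecar, column_sidecar) -> bool:
    # the cell (original or extension) must match its committed cell root;
    # no KZG proof verification happens on the forwarding path
    expected = column_sidecar.cell_roots[cell_sidecar.row_index]
    return hash_tree_root(cell_sidecar.cell) == expected
```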
Some notes:
- The publisher does not need to do any heavy computation, just simple erasure coding of the cells (not in the group) and hashing.
- Since we are verifying the cell sidecars against `cell_roots` (itself in turn verified against `column_sidecar_roots`), we don’t need to verify a `kzg_proof` as part of forwarding, so KZG verification does not slow down propagation.
- We do need to verify all KZG proofs for a column before considering it available (and attesting). The KZG proofs are contained in the `ColumnSidecar`, and can be verified as soon as the respective cells are also received. Since KZG proof verification is decoupled from forwarding, this can be done while maximally taking advantage of batching.
- Just like in the RS + KZG scheme, the interaction with `getBlobs` is ideal: when we retrieve a blob from the mempool, we get a full `CellSidecar` for each column, which we can immediately propagate.
- All benefits from the RS + KZG scheme apply here as well.
Proposer without the blob data
In the RS + KZG case, as well as in the Fusaka protocol, a proposer that somehow knows that the blobs associated with a list of blob txs are available, without having downloaded them, could just include those txs and the respective KZG commitments, and entirely rely on distributed blob publishing to propagate the actual blob data on the CL (whether cells or data column sidecars). This could for example be the case if a committee attests to the availability of mempool txs.
On the other hand, this protocol does not allow the proposer to do so, because having the cells (or, to be precise, the `cell_roots`) and the `kzg_proofs` is required in order to create the `column_sidecar`s, which are themselves needed in order for `cell_sidecar` propagation to happen. Note that the `column_sidecar`s also have to be committed to in the block, so the proposer really needs them at block creation time.
RLNC
Here we consider two ways to adapt this proposal to our goals for column propagation. In both cases, we use cell-level chunks, and the main difference is in how we verify chunks during propagation: one option is to reuse KZG proofs, the other is to stick with Pedersen commitments as in the post.
RLNC + KZG
We can leverage the linearity property of KZG commitments and proofs to enable RLNC for column propagation, while maintaining the verifiability of the recombined chunks tied back to the original KZG row commitments:
- Chunks correspond to cells, due to our initial goal of cell-centric gossip. They travel with and are verified via KZG proofs, so that doing a random linear combination of chunks necessitates also doing the same linear combination of the associated proofs, to produce the proof for the recombined chunk.
- Verifying a chunk necessitates computing the corresponding KZG commitment, which is the linear combination of the original commitments, with the given scalars.
Protocol
- Base Chunks: The fundamental unit (chunk) of RLNC is a cell, and the original cells are the base chunks that we want to transmit. Let v_{ij} denote the data payload (field elements) of the cell at row i and column j. Associated with v_{ij} is its KZG proof \pi_{ij} (proving the evaluation against the row commitment C_i).
- Coded Chunks & Representation: Any chunk v propagated via RLNC represents a linear combination of the base chunks for that column j, v = \sum_i s_i v_{ij}. Each chunk v is transmitted along with its corresponding aggregated KZG proof \pi = \sum_i s_i \pi_{ij} and the vector of scalars s = (s_i)_i defining the combination. For the base chunk v_{ij}, the scalar vector s is simply given by s_k = 1 if k=i and 0 otherwise.
- RLNC Encoding (Re-coding): A node possessing a set of input chunks (v_k, \pi_k, s_k)_k for column j (where each s_k = (s_{ki})_i so v_k = \sum_i s_{ki} v_{ij} and \pi_k = \sum_i s_{ki} \pi_{ij}) can create a new coded chunk. It selects a set of random scalars (r_k)_k and computes:
- Coded Data: v = \sum_k r_k v_k.
- New Scalars: The new vector of scalars s' = (s'_i)_i such that v = \sum_i s_i' v_{ij}. For this, s'_i = \sum_k r_k s_{ki}.
- Coded Proof: \pi = \sum_k r_k \pi_k. (Note that this is equivalent to \pi = \sum_i s'_i \pi_{ij}).
- RLNC Packet: The node gossips a packet containing (v, \pi, s'). To fit the existing sidecar structure, this can be represented as a `CellSidecar` where `cell` holds v, `kzg_proof` holds \pi, and the new `rlnc_scalars` field holds s'. The `column_index` is retained, but `row_index` is replaced by `rlnc_scalars`, since the packet represents a combination of rows.

```python
class CellSidecar(Container):
    cell: Cell
    column_index: ColumnIndex
    kzg_proof: KZGProof
    rlnc_scalars: List[Bytes32, MAX_BLOB_COMMITMENTS_PER_BLOCK]
```
- Verification: A node receives (v, \pi, s) for column j:
  - It has the original row commitments C_i from the `HeaderSidecar` (same as in the RS + KZG approach).
  - It computes the combined commitment: C = \sum_i s_i C_i.
  - It verifies the KZG proof \pi against the coded data v and the combined commitment C, using the evaluation points corresponding to j (`column_index`).
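To make re-coding and verification concrete, here is a minimal sketch. `g1_msm` (MSM in G1), `field_lincomb` (element-wise linear combination of cell data over the scalar field) and `verify_cell_kzg_proof` (a single-cell verifier taking an explicit commitment) are assumed helpers; all names and signatures are illustrative:

```python
import secrets

def recode(chunks, modulus):
    """Combine received chunks (each with .cell, .kzg_proof, .rlnc_scalars)."""
    r = [secrets.randbelow(256) for _ in chunks]       # fresh small (1-byte) scalars
    v = field_lincomb([c.cell for c in chunks], r)     # v = sum_k r_k v_k
    pi = g1_msm([c.kzg_proof for c in chunks], r)      # pi = sum_k r_k pi_k
    n = len(chunks[0].rlnc_scalars)
    s = [sum(r[k] * c.rlnc_scalars[i] for k, c in enumerate(chunks)) % modulus
         for i in range(n)]                            # s'_i = sum_k r_k s_ki
    return v, pi, s

def verify_chunk(v, pi, s, row_commitments, column_index) -> bool:
    """Verify a coded chunk against the original row commitments C_i."""
    C = g1_msm(row_commitments, s)                     # C = sum_i s_i C_i
    # standard KZG check on the evaluation points (coset) for column_index
    return verify_cell_kzg_proof(C, column_index, v, pi)
```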
Scalar Size and Overhead
A concern with RLNC is the overhead of transmitting the scalar vector s = (s_i)_i (`rlnc_scalars`). Let N be the number of base chunks (original rows/cells per column, e.g., 128).
- Full-Size Scalars: If s_i are full 32-byte scalars, the overhead is 32N bytes, i.e. 128 \times 32 = 4096 bytes, prohibitively high.
- Small Scalars: When recombining chunks, we could always choose new scalars r_k from a smaller range (e.g., 1 byte, 0-255). This still gives us a high probability of sending a useful chunk (if at all possible), \ge 1 - 1/256 \approx 99.6%. Initial overhead is low (e.g., N \times 1 = 128 bytes with 1-byte scalars).
- Scalar Blow-up: Re-coding computes s'_i = \sum_k r_k s_{ki}. If s_{ki} requires B_s bits and r_k requires B_r bits, s'_i requires roughly B_s + B_r + \log_2(m) bits, where m is the number of terms combined. Bit length therefore grows roughly linearly with the number of re-coding hops h. If starting with 1-byte scalars (B = 8) and assuming propagation needs h = 4 hops, scalars might need roughly (h+1)B \approx 5 bytes (40 bits) at the last hop. For N = 128, total overhead at the last hop would be 128 \times 5 = 640 bytes.
Fighting the scalar blow-up
We could also experiment with different strategies to reduce the blow-up, for example:
- Only the publisher recodes, in order to have initial redundancy but always keep the same 1-byte scalars. In other words, we would be using the random linear coding purely as a replacement for erasure coding, trading the FFT optimizations of RS encoding and its stronger guarantees (any 50% of the chunks is guaranteed to suffice for reconstruction) for smaller scalars. In particular, we get:
  - 32x lower computational load for the publisher. As we have seen for RS + KZG, and as we discuss in the next section, the load on the publisher is a huge problem for these schemes.
  - The small scalars used by the proposer can be agreed upon in advance. Again, we’re just replacing the RS code matrix with another matrix of 1-byte scalars. Then, the combined commitments can be computed just once, as soon as the `HeaderSidecar` with the original commitments is received. Moreover, since the code is known in advance, we don’t need to include the scalars in the messages, which avoids the 128 B of overhead per cell.
- Only recode once holding at least M chunks, for some M to figure out experimentally. The idea here would be to directly forward chunks in the earlier phases of propagation, when duplicates are less likely, and to only do recoding when it is most useful (somewhat related to push-pull phase transition). That way, the first few hops don’t blow up the scalars at all. Note also that part of the reason to do recoding is that it lets you avoid having to choose which chunks to prioritize when you’re bandwidth-constrained: if you are not bandwidth-constrained, you can always just send all the chunks you have to every peer, but if you are, it is best to send as many recoded chunks as you are able, to maximize the chances that the chunks you send are useful to the receiver. When you don’t yet have many chunks and you can forward all of them (and presumably the receiver downlink is similarly not saturated), recoding isn’t as useful.
Computational Cost
- Sender: To combine m received packets (v_k, \pi_k, s_k) using coefficients r_k, the bulk of the cost is computing \pi = \sum r_k \pi_k, which requires one m-term G1 MSM. At worst m = 128. As we’ve already seen in the RS + KZG case, this takes ~1ms on zka.lc, already multi-threaded. This is for full scalars. With 1-byte scalars, the cost becomes 32x smaller, and close to negligible.
- Receiver: To verify a received packet (v, \pi, s), where s has m non-zero entries, the bulk of the cost is:
  - computing C = \sum_i s_i C_i, which again requires one m-term G1 MSM (~1ms)
  - verifying the KZG proof for (v, \pi, C), requiring 2 pairings (~1.5ms)
Note that, even if we use 1-byte scalars, the s_i here are not freshly generated scalars, and are thus subject to blow-up, whereas the linear combination of proofs done by the sender directly uses the newly generated small scalars, and thus always maintains the same speedup. However even with some blow-up, the pairings would dominate the cost here.
- Proposer: To generate R coded chunks for a column, the proposer incurs the worst-case cost of the sender of a packet R times, i.e., R 128-term G1 MSMs. It needs to do this for each column, so 256 times. To tolerate a loss of ~50% of the chunks, we need R = 256, so the cost is 256 \times (128\text{-term MSM}) per column. Alternatively, the proposer could publish the original chunks (base cells v_{ij} with their proofs \pi_{ij}) as well, and only generate R = 128 coded chunks (or fewer, depending on the desired redundancy).
Using the latter strategy, the cost for the proposer is the same as in RS + KZG, which makes sense since we’re essentially also erasure coding the block with a 1/2 rate, but with a random code. Unlike RS + KZG, we cannot even benefit from FFTs since the code is not structured that way. However, there’s a big advantage if we use 1-byte scalars instead of full (32-byte) scalars, as this should result in a roughly ~32x speedup in the MSMs, bringing the cost down to ~1s.
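As a sanity check on these numbers, a back-of-the-envelope model using the ~1ms-per-128-point-MSM figure from zka.lc cited above:

```python
MSM_128_FULL_SCALARS = 1e-3    # ~1ms per 128-point G1 MSM, multi-threaded
COLUMNS = 256
CODED_CHUNKS_PER_COLUMN = 128  # originals published as-is, R = 128 coded chunks

full = COLUMNS * CODED_CHUNKS_PER_COLUMN * MSM_128_FULL_SCALARS
print(f"full 32-byte scalars: ~{full:.0f}s")       # ~33s, same ballpark as RS + KZG
print(f"1-byte scalars:       ~{full / 32:.1f}s")  # ~1s with the ~32x MSM speedup
```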
RLNC + Pedersen
Let’s now discuss the approach from the RLNC ethresearch post, which uses Pedersen commitments to guarantee integrity of the recoded chunks. A lot of what follows is just re-stating what’s in the post, for the specific case of propagating columns, and reasoning through the concrete overheads.
Protocol
- Base Chunks & Commitments: The base chunk is still a cell v_{ij}, seen as a vector of M = 32 field elements (v_{ijl})_l (for 256 columns). The proposer computes a Pedersen vector commitment for each base cell using M = 32 fixed generator points G_l: C_{ij} = \sum_{l=1}^{32} v_{ijl} G_l. These original commitments C_{0j}, \dots, C_{N-1,j} (where N = 128 is the number of original cells per column) can, much like the `cell_roots` of the RS + cell roots approach, be distributed in a `ColumnSidecar`, which is itself committed to in the block.

```python
class ColumnSidecar(Container):
    column_index: ColumnIndex
    pedersen_commitments: List[PedersenCommitment, MAX_BLOB_COMMITMENTS_PER_BLOCK]
    kzg_proofs: List[KZGProof, MAX_BLOB_COMMITMENTS_PER_BLOCK]
```
- Coded Chunks & Representation: Any chunk v propagated represents a linear combination of base chunks for column j, v = \sum_i s_i v_{ij}.
- RLNC Encoding (Re-coding): A node takes input chunks (v_k, s_k)_k (where s_k = (s_{ki})_i defines v_k = \sum_i s_{ki} v_{ij}). It selects random scalars (r_k)_k, and computes:
- Coded Data: v = \sum_k r_k v_k.
- New Scalars: s' = (s'_i)_i where s'_i = \sum_k r_k s_{ki}.
- RLNC Packet: The node gossips (v, s'), in this form:

```python
class CellSidecar(Container):
    cell: Cell
    column_index: ColumnIndex
    rlnc_scalars: List[Bytes32, MAX_BLOB_COMMITMENTS_PER_BLOCK]
```
- Verification: A node receives (v, s'):
  - It obtains the original base Pedersen commitments C_{ij} for column j from the `ColumnSidecar`.
  - It computes the expected combined commitment based on the scalars: C' = \sum_i s'_i C_{ij}.
  - It directly computes the commitment for v: C = \sum_{l=1}^{32} v_l G_l.
  - It verifies that C' = C.
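A sketch of this check, assuming a hypothetical `point_msm(points, scalars)` helper over the chosen group (e.g., curve25519, as in the summary table below); names are illustrative:

```python
def verify_pedersen_chunk(v, s, base_commitments, generators) -> bool:
    """Check a coded chunk v = sum_i s'_i v_ij against the base commitments C_ij.

    v: the cell data, a vector of M = 32 field elements
    s: the rlnc_scalars, one entry per original cell (N = 128)
    base_commitments: the C_ij for this column, from the ColumnSidecar
    generators: the 32 fixed generator points G_l
    """
    expected = point_msm(base_commitments, s)  # C' = sum_i s'_i C_ij, one 128-term MSM
    actual = point_msm(generators, v)          # C  = sum_l v_l G_l,   one 32-term MSM
    return expected == actual
```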
Computational Cost
- Proposer:
- Has to compute one Pedersen vector commitment C_{ij} = \sum_{l=1}^{32} v_{ijl} G_l for each original cell (i = 0..N-1). This requires one M-term (32-term) MSM in the group per original cell, and 128 such MSMs per column. For all 256 columns, that’s still 32768 MSMs, as in the RS + KZG and RLNC + KZG cases, but with each MSM having 32 terms instead of 128. Still prohibitively high.
- Sender (Intermediate Node re-coding): Only cheap data and scalar arithmetic (v = \sum r_k v_k, s' = \sum r_k s_k). No group arithmetic, negligible cost.
- Receiver (Verifying Packet): one 128-term MSM in the group (\sum_i s'_i C_{ij}), and one M-term (32-term) MSM in the group (\sum_{l=1}^{32} v_l G_l)
A couple of comments:
- Unlike the RLNC + KZG case, using 1-byte scalars does not reduce the computational overhead for the publisher, since this cost comes from computing the Pedersen commitments to the original cells, and is unrelated to the recombined chunks
- Just like we have outsourced kzg proof computation to the tx senders, we could in principle outsource the computation of Pedersen commitments, making the computational load on the proposer very low.
Benefits
Let’s briefly discuss the advantages of this class of designs, without focusing on the differences due to RLNC vs RS and so on.
Publishing
To publish blob data, a proposer/builder can do the following:
- First send out the `HeaderSidecar`.
- For each column, distribute all corresponding cell sidecars to peers in that subnet, by sending each cell to a single peer. E.g., if the block contains 60 blobs (so extended columns have 120 cells) and you have 20 peers in a given subnet, send 6 cells to each peer.
This only requires the proposer to send out 4x the original data (2x due to the horizontal extension for DAS, and then another 2x for redundancy in the propagation), rather than the current 16x (2x also due to the horizontal extension of blobs, then each column sent to 8 peers).
Note that the publishing step cannot be improved only by moving from push to announce and pull, since all messages from the publisher would be pulled. A more interesting comparison would be with the current system, sending full columns via push, augmented with batch publishing, i.e., prioritizing publishing a first copy of each column in each subnet. Even with that, having at least some redundancy for each column requires publishing at least 2 copies per column, and thus sending 4x of the data, just like with erasure coding. At the same bandwidth cost, erasure coding the columns is much more robust, and much less prone to be slowed down by a few failures.
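As a small sketch of the distribution step described above (a hypothetical helper; any assignment that sends each cell to exactly one peer works, round-robin shown here):

```python
def publish_column(cell_sidecars, subnet_peers):
    """Assign each cell of one vertically extended column to exactly one peer."""
    assignments = {peer: [] for peer in subnet_peers}
    for i, sidecar in enumerate(cell_sidecars):
        assignments[subnet_peers[i % len(subnet_peers)]].append(sidecar)
    return assignments  # e.g., 120 cells over 20 peers -> 6 cells per peer
```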
Propagation after publishing
Overall, erasure coding lets us have a high level of redundancy and a high branching factor (how many peers we forward something to, which determines how quickly we can tap into a larger portion of the bandwidth pool of the network), without paying a huge bandwidth cost for it. In contrast, the mesh size in GossipSub can be seen as trading off between redundancy and branching factor on the one end, and duplication factor on the other end.
Distributed blob publishing
With cell gossiping, anyone who has seen a blob can construct all associated cell sidecars, and can thus contribute to cell propagation as soon as it sees the relevant `BeaconBlock` or `HeaderSidecar`. It does not need to know all blobs, which would instead be required to construct a `DataColumnSidecar`.
Distributed reconstruction
Currently, reconstruction depends on nodes that subscribe to all subnets. If we added row subnets and allowed cells to be gossiped in them, instead of only full blobs/rows, reconstruction could happen row by row, letting any node contribute to it. Being able to gossip cells in the column subnets would then also be required in order for the reconstructed cells to move from the row subnets to the column subnets and heal the latter as well, again without the need of nodes that participate in all row subnets.
Summary table
| Approach | Publisher computational cost | Per-hop work | Extra bytes per cell |
| --- | --- | --- | --- |
| RS + KZG | Very high: ≈ 34s, RS encoding the proofs (32768 × 128-pt G1 MSMs, or 512 G1 FFTs) | Send: just forward. Receive: ≈ 1.5ms, verify KZG proof (2 pairings) | Low: +48 B (KZG proof), ~5% |
| Random linear code with 1-byte scalars + KZG | Moderate: ≈ 1s (32× faster MSMs) | Send: just forward. Receive: ≈ 1.5ms, verify KZG proof (2 pairings) | Low: +48 B (KZG proof), ~5% (fixed code, no scalars transmitted) |
| RS + cell roots | Low (RS encode the cells, no group MSMs) | Negligible | Low: 0 in the cell sidecar, ~11% in total with the column sidecar |
| RLNC + KZG | Very high: ≈ 68s (65536 × 128-pt G1 MSMs) | Send: 128-pt G1 MSM (≈ 1ms). Receive: 128-pt G1 MSM + 2 pairings (≈ 2.5ms) | Very high: +4.1 kB (scalar vector) +48 B (proof), ~4x overhead |
| RLNC + KZG (1-byte scalars) | Moderate: ≈ 1s (32× faster MSMs) | Send: low (MSM with 1-byte scalars). Receive: even with scalar blow-up, mostly dominated by 2 pairings (≈ 2.5ms) | High: +640 B at last hop, up to ~60% |
| RLNC + Pedersen (1-byte scalars) | Very high: ≈ 22s (32768 × 32-pt MSMs in curve25519) | Send: negligible. Receive: ≈ 3ms, one 128-pt + one 32-pt MSM in curve25519 | High: +640 B at last hop, up to ~60% |
| RLNC + Pedersen (1-byte scalars, txs include Pedersen commitments) | Low (no group MSMs) | Send: negligible. Receive: ≈ 3ms, one 128-pt + one 32-pt MSM in curve25519 | High: +640 B at last hop, up to ~60% |
Appendix
Linearity of KZG commitments and proofs
KZG commitments and proofs have this very useful linearity property, which we use both in RS + KZG and in RLNC + KZG, in order to generate the proof for the extension/recoded cells:
Consider evaluation points (x_j)_j and (values, commitment, proof) triples ((y_{ij})_j, C_i, \pi_i), proving that p_i(x_j) = y_{ij} for all j, for the polynomial p_i such that [p_i(s)] = C_i. Given scalars (r_i)_i, the triple ((y_j)_j, C, \pi) = ((\sum_i r_i y_{ij})_j, \sum_i r_i C_i, \sum_i r_i \pi_i) is such that C = [p(s)] is a commitment to the polynomial p(x) = \sum_i r_i p_i(x) and \pi is a valid proof that p(x_j) = y_j for all j.
For the interested readers, here’s a short explanation of this, mostly based on this great reference:
Let I_i(x) = \sum_j y_{ij} \prod_{k\neq j} \frac{x-x_k}{x_j - x_k} be the interpolation polynomial for (x_j, y_{ij})_j and Z(x) = \prod_j{(x - x_j)} be the vanishing polynomial over (x_j)_j. The proof for p_i(x) at (x_j)_j is \pi_i = [q_i(s)] with q_i(x) = \frac{p_i(x) - I_i(x)}{Z(x)}.
I(x) = \sum_i r_i I_i(x) is the interpolation polynomial for p(x) = \sum_i r_i p_i(x) over (x_j, y_j)_j, because I(x_j) = \sum_i r_i I_i(x_j) = \sum_i r_i p_i(x_j) = p(x_j).
Then, the quotient polynomial to produce the proof for p(x) at (x_j)_j is q(x) = \frac{p(x) - I(x)}{Z(x)} = \frac{\sum_i r_i p_i(x) - \sum_i r_i I_i(x)}{Z(x)} = \sum_i r_i\frac{p_i(x) - I_i(x)}{Z(x)} = \sum_i r_i q_i(x). The proof is [q(s)] = [\sum_i r_i q_i(s)] = \sum_i r_i [q_i(s)] = \sum_i r_i \pi_i.
Validation rationale
At first glance, cell sidecars seem dangerous because they can be completely arbitrary pieces of data, not tied to any block, slot, or proposer. However, note first of all that a malicious node does not gain the ability to propagate a bad cell (not tied to any block) beyond its mesh peers, because forwarding requires waiting for a `HeaderSidecar` (and possibly a `ColumnSidecar` as well, depending on the design) and performing verifications that prove that the cell is indeed tied to a block. All a malicious node can do is send a bad cell sidecar to a mesh peer.
It might appear as though even this weaker form of malicious behavior is not possible with a `DataColumnSidecar`, at least not without being downscored, because a column sidecar contains the equivalent of a `HeaderSidecar` and is thus independently verifiable. In fact, this is not quite the case: verification of a column sidecar hinges upon verification of the signed beacon block header it contains, and this has a dependency on its chain: if you don’t know the shuffling for the chain to which the header belongs, you cannot verify it.
Concretely, a peer can always send you a header with a valid signature but a made-up parent root, such that you can never retrieve its chain and figure out whether the proposer is the correct one. Therefore, you also can’t be sure that it is invalid, and any downscoring would at best have to depend on timing, e.g., you downscore if the message stays unverifiable for too long because you cannot retrieve its dependencies. This is exactly the same situation that a node would find itself in if it received a cell sidecar for which it can’t retrieve a corresponding `HeaderSidecar` (and possibly `ColumnSidecar`).