Adding some napkin math to clarify things further:
Semantic chunking lets validators start executing after the first chunk arrives instead of waiting for the full block. We assume the following parameters:
- 150M gas limit → chunk cap 2^{24} → 9 chunks
- 1 s execution per block → exec/chunk e = E/k \approx 0.111 s (111 ms)
- 50/25 Mbit/s down/up links (5.960 MiB/s, 2.980 MiB/s)
- 2 MiB payloads
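For reference, here's a minimal sketch (variable names mine) of how the derived numbers fall out of these parameters, assuming the chunk count is simply the gas limit divided by the chunk cap, rounded up:

```python
import math

GAS_LIMIT = 150_000_000      # 150M gas per block
CHUNK_CAP = 2**24            # gas cap per chunk
E = 1.0                      # s of execution per block
MiB = 1024**2

k = math.ceil(GAS_LIMIT / CHUNK_CAP)   # 150M / ~16.78M ≈ 8.94 -> 9 chunks
e = E / k                              # ≈ 0.111 s of execution per chunk
down_MiBs = (50_000_000 / 8) / MiB     # 50 Mbit/s ≈ 5.960 MiB/s
up_MiBs = (25_000_000 / 8) / MiB       # 25 Mbit/s ≈ 2.980 MiB/s

print(k, round(e, 3), round(down_MiBs, 3), round(up_MiBs, 3))
# -> 9 0.111 5.96 2.98
```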
Also check out this simulation tool, which lets you compare propagation-to-execution delays for one big monolithic block vs. multiple smaller chunks.
Quick Math
The following is a rough estimate based on a few strong assumptions. It gets the point across, but take the numbers with a grain of salt.
First, we still ignore networking hops and look only at the download:
- Full download (2 MiB): D = \frac{2}{5.960} \approx 0.3356 s
- Per-chunk size: s_c = \frac{2}{9} \approx 0.222 MiB
- Per-chunk download: d_0 = \frac{s_c}{5.960} \approx 0.0373 s
- Per-chunk upload: \frac{s_c}{2.980} \approx 0.0746 s
- Regime: compute-bound (111 ms exec ≫ 37 ms download)
Latencies
- Monolithic: T_{\text{mono}} = D + E \approx 0.3356 + 1.0000 = 1.3356 s
- Chunked (streaming): T_{\text{chunk}} = d_0 + E \approx 0.0373 + 1.0000 = 1.0373 s
- Gain: ~0.298 s per block
Why: Streaming hides all later downloads under execution/validation; only the first chunk’s transfer remains on the critical path.
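A few lines of Python reproduce these zero-hop numbers (my own quick check; it uses the exact MiB/s rates rather than the rounded 5.960/2.980, hence the last digit differs slightly):

```python
MiB = 1024**2
down_MiBs = (50_000_000 / 8) / MiB   # ≈ 5.960 MiB/s
E, k, S = 1.0, 9, 2.0                # exec time (s), chunk count, payload (MiB)

D   = S / down_MiBs                  # full download        ≈ 0.3355 s
s_c = S / k                          # chunk size           ≈ 0.222 MiB
d_0 = s_c / down_MiBs                # first-chunk download ≈ 0.0373 s

T_mono  = D + E                      # download everything, then execute
T_chunk = d_0 + E                    # first chunk's download, then execution hides the rest
print(round(T_mono, 4), round(T_chunk, 4), round(T_mono - T_chunk, 4))
# -> 1.3355 1.0373 0.2983
```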
Multi-Hop Store-and-Forward Simulation
Model: Each hop performs download then upload (store-and-forward).
- Monolithic must forward the entire 2 MiB every hop, then execute.
- Chunked forwards first chunk through the hops, starts execution, and overlaps the rest under execution (compute-bound).
Closed Forms
- Per-hop payload time (monolithic):
  \frac{2}{5.960} + \frac{2}{2.980} \approx 0.3356 + 0.6711 = 1.0066 s
  \Rightarrow T_{\text{mono}}(H) = 1 + 1.0066 \times H
- First-chunk per-hop time (chunked):
  \frac{2/9}{5.960} + \frac{2/9}{2.980} \approx 0.0373 + 0.0746 = 0.1118 s
  Inter-arrival at the destination = \frac{2/9}{2.980} \approx 0.0746 s; since e = 0.111 \ge 0.0746 ⇒ compute-bound
  \Rightarrow T_{\text{chunk}}(H) = 1 + 0.1118 \times H
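Written out generally (notation mine: payload S, chunk size s_c = S/k, down/up rates B_{\text{down}}, B_{\text{up}}), the two closed forms above are instances of:

T_{\text{mono}}(H) = E + H\left(\frac{S}{B_{\text{down}}} + \frac{S}{B_{\text{up}}}\right)

T_{\text{chunk}}(H) = H\left(\frac{s_c}{B_{\text{down}}} + \frac{s_c}{B_{\text{up}}}\right) + E \quad \text{if } e \ge \frac{s_c}{B_{\text{up}}} \text{ (compute-bound)}

T_{\text{chunk}}(H) = H\left(\frac{s_c}{B_{\text{down}}} + \frac{s_c}{B_{\text{up}}}\right) + (k-1)\frac{s_c}{B_{\text{up}}} + e \quad \text{otherwise (bandwidth-bound)}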
Results
| Hops | Monolithic (s) | Chunked (s) | Gain (s) |
|---|---|---|---|
| 1 | 2.0066 | 1.1118 | 0.8948 |
| 2 | 3.0133 | 1.2237 | 1.7896 |
| 4 | 5.0265 | 1.4474 | 3.5791 |
| 6 | 7.0398 | 1.6711 | 5.3687 |
| 8 | 9.0531 | 1.8948 | 7.1583 |
Why it scales with H: Monolithic pays full store-and-forward per hop; chunking puts only the first chunk’s store-and-forward on the critical path and overlaps everything else under execution.
Simulation Code
# Params
MiB = 1024**2
down_bps, up_bps = 50_000_000, 25_000_000  # 50 / 25 Mbit/s links, in bit/s
down_MiBs = (down_bps / 8) / MiB           # ≈ 5.960 MiB/s
up_MiBs = (up_bps / 8) / MiB               # ≈ 2.980 MiB/s
S_exec = 2.0                               # MiB of payload on the critical path
E, k = 1.0, 9                              # execution time per block (s), number of chunks
e = E / k                                  # ≈ 0.111 s execution per chunk
s_c = S_exec / k                           # ≈ 0.222 MiB per chunk

def mono(H):
    # Full payload is stored-and-forwarded across all H hops, then executed.
    return E + H * (S_exec / down_MiBs + S_exec / up_MiBs)

def chunked(H):
    # First chunk is stored-and-forwarded across all H hops; later chunks pipeline behind it.
    first = H * (s_c / down_MiBs + s_c / up_MiBs)
    inter = s_c / up_MiBs                  # chunk inter-arrival at the destination (upload is the bottleneck)
    if e >= inter:                         # compute-bound: execution hides all later transfers
        return first + E
    else:                                  # bandwidth-bound: wait for the last chunk, then execute it
        return first + (k - 1) * inter + e

for H in [1, 2, 4, 6, 8]:
    print(H, round(mono(H), 4), round(chunked(H), 4),
          round(mono(H) - chunked(H), 4))
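As a sanity check on the closed forms, here's a rough per-chunk pipeline simulation of the same store-and-forward model (structure and names are mine): every chunk must be fully downloaded at a hop before that hop uploads it, each link serves chunks in order, and the destination executes chunks in order at e seconds each.

```python
MiB = 1024**2
down_MiBs = (50_000_000 / 8) / MiB
up_MiBs = (25_000_000 / 8) / MiB
E, k = 1.0, 9
e, s_c = E / k, 2.0 / k

def chunked_pipeline(H):
    d = s_c / down_MiBs                 # per-hop download time of one chunk
    u = s_c / up_MiBs                   # per-hop upload time of one chunk
    busy_until = [0.0] * (2 * H)        # each hop = one download stage + one upload stage
    arrivals = []
    for _ in range(k):
        t = 0.0                         # all chunks available at the source at t = 0
        for s in range(2 * H):
            service = d if s % 2 == 0 else u
            t = max(t, busy_until[s]) + service   # wait for the stage to free up, then transfer
            busy_until[s] = t
        arrivals.append(t)              # chunk fully received at the destination
    finish = 0.0
    for t in arrivals:                  # destination executes chunks in order
        finish = max(finish, t) + e
    return finish

for H in [1, 2, 4, 6, 8]:
    print(H, round(chunked_pipeline(H), 4))
# With these parameters e ≥ u, so the pipeline stays compute-bound and
# reproduces the chunked(H) column: 1.1118, 1.2237, 1.4474, 1.6711, 1.8948
```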

