Interesting idea. It should be noted that to empower the protocol (through its validators) to enforce censorship resistance, something like FOCIL would still be needed, and the referenced IL-boost mechanism gives validators more direct influence. Still, this can be a straightforward way to promote light censorship resistance at the current stage, which is of course welcome. One contradiction to note is that censorship resistance would here be imposed by a single party. There may also be risks in the resulting reliance on the mempool specified by the relay. To increase openness, the mempool producing the IL would ideally be published at the same time as the IL itself. This does not prove that the relay operates honestly, but it makes dishonest operation much harder to conceal.
In my review, I will seek clarification on some technical details and outline potential improvements. I will also try to analyze the design options, both to check that I understand your intentions correctly and to explain the rationale behind alternative designs.
A benefit of the normalization step is that it produces an "unopinionated" balance between priority fees and delay. It also adjusts according to the state of the mempool. Two examples: if the delay to inclusion increases in the mempool, the importance of the delay is weighted down; if there is a spike in priority fees, the importance of the priority fee is reduced relative to the delay. Normalization can thus be favored if this type of balancing is desirable. However, it would then seem optimal to apply the normalization only to the subset of txs that have a sufficiently high `max_fee_per_gas`, for example ensuring `max_fee_per_gas * 2 > base_fee_per_gas`. You might even consider `max_fee_per_gas > base_fee_per_gas`. This option has a clear interpretation: if there is space, all txs with a sufficient `max_fee_per_gas` are included, and otherwise, the selection is still based only on the distribution of such txs.
The thresholding makes it more difficult for adversaries to alter the balance between delay and priority fee with spam txs. Without it, an adversary could, for example, make the mechanism prioritize priority fee over delay by filling up the mempool with txs that have a low `max_fee_per_gas` and a low `max_priority_fee_per_gas`, and letting them sit in the mempool accruing delay.
It can be noted that directly computing a score to rank txs by, and including the subset with the highest score, would already be sufficient. A normalization step is only really necessary if you wish to specifically balance the influence of priority fees and delay according to the present state of the mempool when the two normalized variables are summed. Put differently, normalization is not strictly required to handle, for example, a spike in priority fees, given that all txs are in any case compared with each other (a relative operation) when selecting the most relevant for inclusion.
It would therefore, as an alternative, be perfectly theoretically sound to forgo normalization and simply compute a score from, e.g.,

$$s(t) = a\,f(t) + b\,d(t),$$

or

$$s(t) = \bigl(f(t) + a\bigr)\bigl(d(t) + b\bigr),$$

where $f(t)$ denotes the priority fee per gas of tx $t$, $d(t)$ its accrued delay, and $a$ and $b$ are weights that can be used to specify a sought balance between delay and priority fee. Setting both $a$ and $b$ to 0 in the second equation then yields

$$s(t) = f(t)\,d(t).$$

A doubling of one variable then has exactly the same impact on ranking as a doubling of the other. If a tx provides a 0 priority fee, it cannot be included. This is perfectly reasonable and, actually, probably desirable. A tx also cannot be included the exact moment it arrives. This may be undesirable, and some weight $a$ can be added to enable direct inclusion, albeit a delay will of course also quickly accrue otherwise. Such an equation would look like this:

$$s(t) = f(t)\bigl(d(t) + a\bigr).$$
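A minimal sketch of ranking by such an unnormalized score, with the fee multiplied by the delay plus a direct-inclusion weight `a` (parameter names and the selection limit are assumptions for illustration):

```python
def score(priority_fee, delay, a=1.0):
    """Unnormalized score: fee times (delay + a). A zero-fee tx scores 0
    and is never included; a > 0 lets a fresh tx (delay 0) be included."""
    return priority_fee * (delay + a)

def select_for_il(txs, limit, a=1.0):
    """Rank txs by score, drop zero scores, and take the top `limit`."""
    ranked = sorted(txs, key=lambda t: score(t["fee"], t["delay"], a),
                    reverse=True)
    return [t for t in ranked if score(t["fee"], t["delay"], a) > 0][:limit]
```

Note that no statistics of the mempool enter the computation, which is exactly what makes the outcome harder for the relay to steer.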
Foregoing normalization has the benefit of making it more difficult for the relay to influence the outcome. This can actually be very important, given the inherently centralized nature of the design.
Here I would also like to understand your argument and the intention behind this step. What are "small" and "large" transactions? Those that consume a lot of gas, or those that have a large raw byte size? Or is the correlation between the two somehow part of the rationale? For example, it seems to me that ranking by the density score does not actually guarantee a higher marginal price per unit of blockspace consumption, given that blockspace is measured in gas, not byte size. To achieve that guarantee, the requirement would be to account for the gas $g(t)$ used by the tx, computing the density score with $g(t)$ in the denominator in place of the byte size.
It can be clarifying to outline various alternatives, focusing on how the priority fee relates, or does not relate, to byte size:
- Priority fee per gas – This is a neutral approach w.r.t. the space that the tx occupies in the block—where space is defined by the gas limit. It is simply a ranking by priority fee, since “per gas” is implied. This is used in for example FOCIL with ranked transactions (FOCILR), where a ranking by priority fee (potentially combined with other measures) serves to keep txs in or out.
- Priority fee per byte – This is a neutral approach w.r.t. the space that the tx occupies in the actual IL—where space is defined by the size limit of the IL. The total priority fee accrued from a tx is then divided by its byte size. This is, e.g., similar to the function that validators would likely prioritize txs by under FOCILR, given that they wish to squeeze out as high rewards as possible from their IL.
- Priority fee per gas/byte size – This is the approach in this post (accounting also for delay). It will favor txs with small byte sizes (as also indicated). It will also tend to favor txs with low gas consumption, given that gas consumption correlates with byte size.
- Priority fee per gas/gas used – This is an approach that guarantees a higher marginal price per unit of blockspace consumption. It will favor txs with low gas consumption. It will also tend to favor txs with low byte size, given that gas consumption correlates with byte size.
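For concreteness, the four rankings can be written out as functions of a tx's priority fee per gas $f$, its gas used $g$, and its byte size $s$ (a sketch with assumed symbols, not code from the post):

```python
def per_gas(f, g, s):           # (1) neutral w.r.t. block gas space
    return f

def per_byte(f, g, s):          # (2) total priority fee divided by byte size
    return f * g / s

def per_gas_per_byte(f, g, s):  # (3) the post's approach: favors small bytes
    return f / s

def per_gas_per_gas(f, g, s):   # (4) favors low gas consumption
    return f / g
```

(The post's actual density score also incorporates delay; that term is omitted here to isolate how each variant treats blockspace.)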
Is there some specific reason why transactions with small byte sizes are prioritized for censorship resistance? I see the argument of favoring as many originators as possible, but this would be accomplished in a more targeted way by (4), not (3). It can also be argued that the most neutral approach to censorship resistance is to not take an opinionated stance on which txs should be prioritized for inclusion. Is it perhaps the limit of 8 kB that ultimately motivates (3) over (1) or (4)? This limit is arguably self-imposed and not really a requirement for the relay, given that IL propagation is not design-critical, as opposed to in FOCIL.
How will the mechanism handle full blocks? Can the builder ignore the IL if the block is full? The text reads as if all highly ranked txs must be included in the block regardless of whether the block is full. That is to say, the design imposes the same stronger censorship resistance conditions as FOCILR. If you are pursuing these stronger censorship resistance properties, it might however be reasonable to apply a gas threshold to the IL (and by extension the block). This would encourage more usage of the relay, which could otherwise forgo too much value.
This makes it even more important to clarify and analyze how the mechanism intends to deal with full blocks, and might make a design favoring (3) above less compelling.
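The gas threshold suggested above could, for example, be enforced by making only a gas-bounded prefix of the ranked IL binding on the builder (the threshold value and field names are assumptions for illustration):

```python
IL_GAS_THRESHOLD = 2_000_000  # assumed cap; not specified in the post

def binding_il_txs(il_txs):
    """Return the prefix of the (ranked) IL the builder must include,
    cutting off once cumulative gas would pass the threshold."""
    binding, total = [], 0
    for t in il_txs:
        if total + t["gas"] > IL_GAS_THRESHOLD:
            break
        binding.append(t)
        total += t["gas"]
    return binding
```

With such a cap, a full block can always reserve bounded room for IL txs rather than facing an unbounded inclusion obligation.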
Relays integrating with builders will then seek to influence the aggregate list to maximize the value that their builder can extract; something to ponder. It is furthermore not perfectly clear to me that censorship resistance would improve, given the increased collusion concerns.