Exploring the proposer/collator split

Our expectation is that they will remove themselves, since collators will basically ignore them. If we really think the proposer–collator model is valuable, then it needs to be incentivised somehow (e.g. by punishing the inclusion of spam/invalid transactions, if we can even define what “invalid” means in the new regime).

Which is no worse than the situation we are in on Mainnet today.

Thanks for catching the typos. Will correct!

1 Like

I realize that proposers would degenerate anyway under your assumptions (which seem reasonable, although it’s hard to assess how they would hold in practice), and have edited my comment to note that. But having collator-proposer-executors in a sharding context is not the same as the current situation, since we are trying to remove the need for a node to process every transaction. What’s the point of sharding if we don’t remove that need?

Wouldn’t it be too risky for a self-selecting collator not to verify transactions prior to including them in a collation? It would be trivial for an attacker to flood this collator with junk transactions, reducing T_n to zero.

It would also be too expensive for a self-selecting collator to maintain state on all shards for the infrequent chance that it is selected as the collator for any given shard. The self-selecting collator would have 101x the c_n of a single proposer.

2 Likes

Yes, as per the “High-spam regime” section, this may be the best argument for having the proposer/collator split. Note, though, that the risk to the collator is only opportunity cost: it does not lose funds (under the current sharding proposals), and it still gets the collation reward. If the collator has either (a) relatively lightweight heuristics for identifying spam Txs without knowledge of state (think email spam filters), or (b) access to a stateless client, then it can likely still do better than relying on proposers’ bids.
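To make (a) concrete, here is a minimal sketch of the kind of state-free heuristics a collator might run. It is purely illustrative: the Blob fields, thresholds, and flooding check are my own assumptions, not part of any sharding spec.

```python
# Hypothetical sketch: cheap, state-free spam heuristics for a collator.
# Every name and threshold here is an illustrative assumption, not a spec.
from dataclasses import dataclass
from collections import Counter

MIN_FEE = 1            # assumed minimum acceptable fee
MAX_BLOB_SIZE = 2**16  # assumed per-blob size cap, in bytes

@dataclass
class Blob:
    sender: str
    fee: int
    data: bytes
    signature_ok: bool   # stand-in for a real signature check

def looks_like_spam(blob: Blob, recent_senders: Counter) -> bool:
    """Checks analogous to an email spam filter; none of them need shard state."""
    if blob.fee < MIN_FEE:                 # unpaid / underpaid blobs
        return True
    if len(blob.data) > MAX_BLOB_SIZE:     # oversized payloads
        return True
    if not blob.signature_ok:              # malformed / unsigned blobs
        return True
    if recent_senders[blob.sender] > 10:   # one sender flooding the pool
        return True
    return False

def select_blobs(pool, recent_senders, limit):
    """Fill a collation with the highest-fee blobs that pass the filter."""
    candidates = [b for b in pool if not looks_like_spam(b, recent_senders)]
    return sorted(candidates, key=lambda b: b.fee, reverse=True)[:limit]
```

Such checks are obviously crude; the point is only that they cost the collator almost nothing to run.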

I think what I said is that the proposers are likely to be executors.

Ah OK, sorry for misquoting you. That makes sense, since they would execute the transactions anyway to verify that they are valid.

Email spam filters rely on large datasets of labelled data. How would that be provided without knowing state data?
How would the collator have any sense of precision or accuracy in its spam filtering?

How so?

Hi Ben,

What do you think of Vitalik’s post? It seems one could keep the validator’s fee fixed, and get rid of the bidding process altogether by just rotating proposers. That seems to align incentives (as on average proposers will not fill the state with junk transactions), and obviates the need for a validator to keep any global state.
I wonder if you could even generate schemes that decommission proposers who turn up with low-value transactions compared to the average proposal value.
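To illustrate the rotation idea, here is a toy sketch. The fixed fee, the registry, and the way periods map to proposers are placeholders of my own; Vitalik’s post may well differ in the details.

```python
# Toy sketch of fixed-fee proposer rotation: no bidding, so the validator
# needs no global state, just the registered proposer list and the period.
# The fee value and registry below are illustrative assumptions.
PROPOSER_FEE = 10  # fixed fee paid to the proposer of each period (arbitrary units)

def proposer_for(period: int, proposers: list) -> str:
    """Deterministic round-robin rotation over the registered proposers."""
    return proposers[period % len(proposers)]

proposers = ["p0", "p1", "p2"]
for period in range(6):
    print(period, proposer_for(period, proposers), "earns", PROPOSER_FEE)
```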

I think a good long-term justification for the split between collators and proposers is that if you had really large, GB-sized blocks, just about the only thing a collator could do with such a huge block in such a short period of time is check for availability. Even access-list checking would take too long.

Yes, admittedly, this is hand-wavy. The main point is that this regime is currently planned to be temporary (where the proposer has access to full state, and the collator doesn’t, so they are on an unequal footing). When stateless clients appear, they move to a more equal footing, hence…

This is covered in the “High-spam regimes and stateless clients” section in the article.

The one in this thread? I’d like to see more detail, especially around how data availability is guaranteed if there is only one proposer per period. But sounds interesting.

I second this suggestion. There seems to be little reason to have a separate proposer role when all it is really doing is collecting blobs into collations. It makes sense to separate the tasks logically, but if collators don’t need to store the entire state of each shard then I don’t really see the need to encourage the tasks to be distributed physically. In my opinion a Proposer-Collator-Executor is not really so bad, given that the performance requirements of such nodes are generally accessible to most laptops (running Casper).

Is the concern that running a proposer and collator simultaneously would be too demanding to achieve a large number of participating nodes?

I would certainly think so, as a single self-electing collator will need to cover 101x a proposer’s performance requirement, right?

I would certainly think so, as a single self-electing collator will need to cover 101x a proposer’s performance requirement, right?

I don’t see why that would be the case. At any given period a collator is only concerned with one shard, and only has to deal with the proposers of that shard. It doesn’t make sense to collect proposals from shards other than the one you are assigned to. Since you have the lookahead period, you can get ready for that shard ahead of time. Yes, there are some storage requirements (such as the state of the assigned shard), but nothing that I think would drastically change the number of participating nodes.
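To illustrate the lookahead point: the collator learns its assignment in advance, so it only ever needs to prepare one shard’s state at a time. The assignment function below is just a deterministic stand-in for whatever sampling the SMC actually does; the lookahead and shard count are assumptions.

```python
# Sketch of the lookahead argument.  The assignment function is a placeholder
# for the SMC's sampling; LOOKAHEAD and SHARD_COUNT are assumed values.
import hashlib

LOOKAHEAD = 5      # assumed lookahead, in periods
SHARD_COUNT = 100

def assigned_shard(collator_id: int, period: int) -> int:
    """Deterministic stand-in for the real (random) collator sampling."""
    seed = f"{collator_id}:{period}".encode()
    return int.from_bytes(hashlib.sha256(seed).digest(), "big") % SHARD_COUNT

current_period = 1000
# The collator already knows its assignment LOOKAHEAD periods out, so it can
# spend that time syncing just the one shard it will actually serve.
future_shard = assigned_shard(collator_id=42, period=current_period + LOOKAHEAD)
print(f"sync shard {future_shard} before period {current_period + LOOKAHEAD}")
```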

Could you elaborate on that figure of 101x?

This assumes the proposer has to maintain the state for a given shard. A collator has no idea which shard it will be selected to commit a collation to, so it must maintain the state of all 100 shards. A collator must do this to ensure that the transactions would execute properly before including them in the collation body, or it risks including a junk transaction and forfeiting some of T_n. We also must assume that it would not be possible to download the entire state of a shard within a lookahead period (how long does it take to sync today? 8 hours?), so the collator must maintain the state of every shard at all times.
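Spelling out that arithmetic (a sketch only; reading the extra 1 in “101x” as the main chain is my assumption):

$$
c_{\text{self-proposing collator}} \approx \underbrace{100 \cdot c_n}_{\text{all shard states}} + \underbrace{c_n}_{\text{main chain}} = 101 \cdot c_n, \qquad c_{\text{single-shard proposer}} = c_n.
$$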

However, Ben claims that the proposer/collator can determine whether a given transaction will execute properly without maintaining state, by way of a stateless client. If this is a reliable method, then the self-proposing collator’s cost would be a low constant, provided the stateless client paradigm can be trusted to validate transactions.

So the self-proposing collator’s cost will be either 101 times that of a proposer, or a negligibly low fixed cost if a stateless client can be used.
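As a rough picture of the stateless-client case, the transaction would carry a Merkle witness for the account it spends from, and the collator would check it against the shard’s state root. Everything below (the hash layout, the Tx fields, the balance-only check) is an illustrative assumption, not how any actual stateless client is specified.

```python
# Sketch of stateless pre-validation: the collator keeps only the state root,
# and each transaction carries a Merkle branch proving its sender's balance.
# Hash layout and account encoding here are assumptions for illustration.
import hashlib
from dataclasses import dataclass
from typing import List

def h(*parts: bytes) -> bytes:
    return hashlib.sha256(b"".join(parts)).digest()

@dataclass
class Tx:
    sender: str
    fee: int
    balance: int          # balance claimed by the witness
    witness: List[bytes]  # Merkle branch from the account leaf to the root
    leaf_index: int

def verify_branch(leaf: bytes, branch: List[bytes], index: int, root: bytes) -> bool:
    """Standard Merkle branch check: fold the sibling hashes up to the root."""
    node = h(leaf)
    for sibling in branch:
        node = h(node, sibling) if index % 2 == 0 else h(sibling, node)
        index //= 2
    return node == root

def stateless_precheck(tx: Tx, state_root: bytes) -> bool:
    """Accept a transaction iff its witness proves, against the state root,
    that the sender can pay the fee -- no local state database needed."""
    leaf = tx.sender.encode() + tx.balance.to_bytes(32, "big")
    if not verify_branch(leaf, tx.witness, tx.leaf_index, state_root):
        return False                  # witness doesn't match the root
    return tx.balance >= tx.fee       # sender can actually pay
```

If the witness checks out against the collation’s pre-state root, the collator knows the fee is payable without holding any shard state, which is the “low fixed cost” case.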

Link to newer thoughts: A general framework of overhead and finality time in sharding, and a proposal

2 Likes

One potential issue is that, since ε_n may be negative (a proposer may subsidise a proposal), it is possible for a proposer to censor transactions on a shard for arbitrary lengths of time.

I am not convinced that this is an issue. Consider that in the current system, it is possible to simply send 8 million gas transactions with a high gasprice, which seems like it would have a similar effect.
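For comparison, a back-of-the-envelope cost for that main-chain attack (the gas price and block time below are assumed values; only the 8 million gas figure comes from the comment above):

```python
# Rough cost of censoring the main chain by filling blocks with high-gasprice
# transactions.  GAS_PRICE_GWEI and BLOCK_TIME_S are assumptions.
GAS_LIMIT = 8_000_000      # gas per block, per the comment above
GAS_PRICE_GWEI = 20        # assumed competitive gas price, in gwei
BLOCK_TIME_S = 15          # assumed average block time, in seconds

cost_per_block_eth = GAS_LIMIT * GAS_PRICE_GWEI * 1e-9   # gwei -> ETH
blocks_per_hour = 3600 / BLOCK_TIME_S
print(f"~{cost_per_block_eth:.2f} ETH per block, "
      f"~{cost_per_block_eth * blocks_per_hour:.0f} ETH per hour of censorship")
```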

The non-full shard

Agree that in these cases state is not required for self-proposals.

for example, using the order in which transactions appeared in the shard’s transaction pool.

This is actually an interesting hidden insight: you can use nodes in the network to filter out non-fee-paying transactions for you for free, and use this as a source of transaction data. Though this technique is likely to be quite imperfect.

In Phase 1 sharding, there is no concept of a “spam” or invalid transaction

To clarify, there is never a concept of an invalid transaction at the collation finalization layer.

This model degenerates either to there being only one super-efficient (or malicious) proposer per shard

Not necessarily. I would argue that if there is only one proposer, then that proposer gets the incentive to start rent-seeking (increasing ε), and that by itself creates the incentive for more proposers to undercut. It seems like the Nash equilibrium is that there is always some nonzero probability for the dominant proposer to lose any particular bidding round, which means multiple proposers. Also, there is the possibility of proposers that represent specific applications, as well as the possibility of proposers that acquire specialized domain knowledge about fee payment in specific applications (e.g. accepting fees in E-DOGE).
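A toy model of that undercutting dynamic (all numbers are arbitrary assumptions; it only illustrates the direction of the incentive):

```python
# Toy model of the undercutting argument: a dominant proposer that raises its
# margin (epsilon) invites entry, and entrants bid the margin back down toward
# the cost of running a proposer.  All values here are arbitrary assumptions.
ENTRY_COST = 1.0      # assumed per-period cost of running a proposer
UNDERCUT = 0.5        # assumed undercutting increment
dominant_eps = 5.0    # dominant proposer's initial rent-seeking margin
competitors = []      # margins bid by challengers

for r in range(20):
    best = min([dominant_eps] + competitors)
    if best - UNDERCUT > ENTRY_COST:       # entry is profitable: undercut
        competitors.append(best - UNDERCUT)
    winner = min([dominant_eps] + competitors)
    print(f"round {r:2d}: winning margin = {winner:.1f}, "
          f"proposers = {1 + len(competitors)}")
```

The margin settles near the entry cost and the proposer set stays larger than one, which is the equilibrium intuition above.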

It’s additionally worth pointing out that if the dominant proposer tries censoring, then that by itself confers an economic advantage to all of the other proposers.

I would be interested to hear what you think about the proposal/notarization separation model that I outline in the newer post I linked (A general framework of overhead and finality time in sharding, and a proposal).

1 Like

Couldn’t censoring by proposers be much more subtle and efficient than this rather blunt instrument? For example, a proposer could easily hold a business to ransom by selectively filtering transactions to its contracts. The proposer needn’t even lose out on Tx fees if there are alternative Txs it could include instead (to address the latter point). DoS attacks are a thing, and it would be a shame to make them easy. The company’s defence is to spin up its own subsidised proposer to rescue its Txs, or tell its users to use higher fees, and escalate from there… Anyway, it seems undesirable.

Which means that, in this stateless client regime, self-proposals become the dominant strategy. QED.

Yep, will be taking a look at it.

1 Like

As you say, this is flawed: it leads to a tragedy of the commons, since a resource is exploited at no internalized cost while the externalized cost is nevertheless pushed onto the commons. http://eips.ethereum.org/EIPS/eip-908

1 Like

Reading the Truebit white paper, I see parallels in this statement:

As an example of the second type, consider a situation where the Solver deposit is small (say 10 ETH) but the expected jackpot payout per task is high (say 1000 ETH). An individual playing both the role of the Solver and Verifier could offer a bogus solution and then challenge his own answer, hypothetically netting, on average, 1000 − 10 = 990 ETH without providing any useful service. Such an action would degrade other Verifiers’ incentive to participate.