RNG exploitability analysis assuming pure RANDAO-based main chain

I think there are ways that we could work on the RANDAO game in order to make it less long-run exploitable, by making it so that there is no net payoff to getting extra validation slots. Here is how this could be done.

Suppose that a validator on average gets selected once per T blocks, and gets a reward of R. A “naive RANDAO” gives them a reward of R for participation, and 0 for non-participation. A slightly less naive construction detects non-participation (ie. when you get selected and don’t make a block), and gives a reward of -R in that case.

We can move part or all of the reward into a passive reward that increments over time, and rely almost exclusively on penalizing non-participation. Suppose on average validators show up p of the time (eg. p = 0.95). A validator would get a reward of p * R every T blocks (this can be calculated via deposit scale factors like in Casper FFG), then if they get selected, they get a reward of (1-p) * R if they make a block, and are penalized (1 + p) * R if they no-show. This achieves the following properties:

  • Expected long-run reward structure same as before (before = giving +R for participation and -R for no-show every time a validator is selected)
  • Same incentive as before to make a block as opposed to no-showing, not taking into account RNG manipulation
  • For an average-performing validator, no incentive to manipulate randomness

Another way to think about this is, we remove incentives for RNG manipulation by making a system where instead of rewards coming from taking advantage of opportunities to make blocks, rewards mostly simply come from just having your deposit there, and penalties come from not taking advantage of opportunities to make blocks.