I will explain in more detail how we can do an attack on the original version.
I assume that it possible to cheat on a large majority of the constraints except that the values that go into the FRI are not on a low degree polynomial. So we have 16X values that are supposed to be on a X degree polynomial, but are actually on a 16X degree polynomial. If we build the FRI proof with these values we will get caught at the final step because the 256 final values are not on a 16 degree polynomial, but are on a 256 degree polynomial.
The FRI proof is organized in such a way that we can separate the 16X values into 256 groups. Each group of values are used to calculate one of the 256 final values. These groups are independent in the sense that we can change the values in one group, without causing checks in other groups to fail. Changing values in one group will off-course change which rows/columns are chosen, but all checks on all untouched groups will pass. If we only change the values of a group at one stage of the recursion then that will only cause the checks to fail if a row from that group is chosen at that specific stage of the recursion. They will all pass at every other stage.
To cheat on the FRY we will pick 16 groups with a total of X values and determine what the values in all other groups need to be if they were on the same polynomial as the chosen X. This is our target polynomial. Note that if we would use the target polynomial to start the FRI, we would pass the check on the final 256 values, but it is unlikely that we would pass the first checks, because we have changed 15X of the 16X values. So instead of changing 15X values we will only change a few groups at the first step of the recursion to the target polynomial. Since only a small percentage of the values have changed we have a pretty good chance of not to get caught.
If we change 24 groups and do 40 checks we have a ((256-24)/256)^{40} = 1.9\% chance to pass all checks. We do not have to try many variations of the tree to get away with it.
After we found a solution that passes all checks, we will go to the next level of the recursion and we will change another 24 groups. After 10 recursions all groups are on the same low degree polynomial and we will pass the check on the 256 values.
It is easy to see that our proposals are not vulnerable to this attack, because we will always check at every stage of the recursion and we will spot any changes made.
The attack on your proposal is different but is based on the same principle: Try many variations of the tree and use the best one to go to the next level of the recursion.
It depends on the implementation, but the 160 values from which 40 are chosen are probably ordered in a specific way. In a basic implementation every set of four values come from the same row and are ordered from low to high. Knowing this, we could try many variations of the tree and choose the tree that would pick as many low values a possible. Having done this at every level of the recursion, some values are far more likely to be checked than others. If we chose our X values to be the ones that are found via a path of mainly low values we are more likely to pass all checks.
Regarding optimization 3. The idea is to not publish the 256 values or any merkle branches.
The 40 final values are calculated based on 160 values of the level below and the chosen column.
These 40 final values are supposed to be on a 16 degree polynomial, so we only check if that is the case. This could lower the security of the proof a little because we will accept any 16 degree polynomial, but I don’t think it would help an attacker a lot to be able to use multiple polynomials in the final stage that are not to be mixed.
Regarding optimization 4. The idea is to use more values at the start of the FRY proof (like 32X instead of 16X) and less checks per level of recursion, but as I read above, you have already considered that.