EVM Idea: Add access to overflow, carry, sign and zero flags to reduce gas use

nootropicat · January 18, 2018, 6:12am

Motivation:

lack of overflow flag means checking for overflow wastes gas and storage on pointless comparisons
lack of carry flag makes >256 bit arithmetic slow because it has to be reimplemented via comparisons
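
To illustrate the second point, here is a minimal Python sketch of how a 512-bit addition has to recover the carry with a comparison when only wrapping 256-bit adds are available. The function name and the (hi, lo) word layout are assumptions made for illustration, not part of any EVM specification.

```python
MASK = (1 << 256) - 1  # a 256-bit EVM word

def add512_via_compare(a_hi, a_lo, b_hi, b_lo):
    """Add two 512-bit numbers given as (hi, lo) 256-bit words, with no carry flag."""
    lo = (a_lo + b_lo) & MASK        # wrapping ADD, as on the EVM
    # The carry must be recomputed with a comparison: the sum wrapped
    # exactly when it ended up smaller than one of the operands.
    carry = 1 if lo < a_lo else 0
    hi = (a_hi + b_hi + carry) & MASK
    return hi, lo
```

With a carry flag, the comparison on the `carry` line would be replaced by reading a bit that the first addition already produced.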

If these two flags were to be added, there’s no reason not to add two more flags that CPUs already have: sign (whether the result is negative) and zero. They are useful for optimizations when compiling high-level languages in some cases (especially checking whether the result of an arithmetic operation is negative or zero).

Performance impact: positive with a JIT, negligible with an interpreter. Every arithmetic operation on x86/amd64 already sets these flags; they only have to be made available. ARM has these flags too and can update them for bitwise/arithmetic instructions except div/mul. In a JIT, CPU flags can be accessed directly, so making them available in the EVM means fewer pointless comparisons and therefore fewer CPU instructions.

How:

make every arithmetic&bitwise evm op change the value of a byte register with four one-bit flags: carry, zero, sign, overflow.
add a PUSHF opcode that pushes the flags byte.

This is enough to make multiple-precision arithmetic and overflow checking faster & easier.
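
A rough Python model of what this could look like for ADD. The bit positions inside the flags byte and the function name are hypothetical, since the proposal does not fix a layout; the flag semantics follow the usual x86 definitions.

```python
MASK = (1 << 256) - 1
SIGN_BIT = 1 << 255

# Hypothetical bit layout of the flags byte (not specified in the proposal).
CARRY, ZERO, SIGN, OVERFLOW = 1, 2, 4, 8

def add_with_flags(a, b):
    """Return (result, flags) for a 256-bit wrapping ADD."""
    full = a + b
    result = full & MASK
    flags = 0
    if full > MASK:
        flags |= CARRY       # unsigned wrap-around
    if result == 0:
        flags |= ZERO
    if result & SIGN_BIT:
        flags |= SIGN        # negative when read as two's complement
    # Signed overflow: both operands share a sign that the result does not.
    if (~(a ^ b) & (a ^ result)) & SIGN_BIT:
        flags |= OVERFLOW
    return result, flags
```

In this model PUSHF would simply push `flags` onto the stack, so an overflow check becomes one flag read instead of a separate comparison.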

Small extension 1:
Add an ‘overflow_throw’ flag to the flags register, plus a POPF opcode to set it. When the overflow_throw flag is set, arithmetic opcodes revert (or throw, if some exception handling is added to the EVM in the future) on overflow. That would directly emulate the current assert-on-overflow behavior.
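
A small Python sketch of the trap behavior, under two assumptions that the proposal leaves open: "overflow" here means unsigned wrap-around of ADD, and POPF takes the trap flag from bit 0 of the popped value. The `Machine` and `Revert` names are illustrative only.

```python
MASK = (1 << 256) - 1

class Revert(Exception):
    """Stand-in for an EVM revert."""

class Machine:
    def __init__(self):
        self.overflow_throw = False  # proposed trap flag, off by default

    def popf(self, value):
        # Hypothetical POPF: bit 0 of the popped value sets the trap flag.
        self.overflow_throw = bool(value & 1)

    def add(self, a, b):
        full = a + b
        if full > MASK and self.overflow_throw:
            raise Revert("ADD overflowed")
        return full & MASK  # default behavior: silent wrap-around
```

With the flag clear, `add` wraps as the EVM does today; after `popf(1)`, the same call raises instead, which is what an explicit assert after every addition achieves now.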

Larger extension 2:
Add cmp + conditional jump instructions.
Why: without a flag-setting comparison and flag-conditional jumps, repeated comparisons are wasteful. Consider this example:

if(a == b) { s1; }
else if(a < b) { s2; }
else { s3; }

which currently translates to something like this:

mload b
mload a
dup2
dup2
eq
push s1
jumpi
lt
push s2
jumpi
;; s3 here

with cmp+jcc instructions, as on an x86/arm cpu:

mload b
mload a
cmp ;;compares and sets flags
jz s1
jb s2
;; s3 here
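
The same control flow modeled in Python, as a sketch: one comparison sets the flags, and both branches only read them. The flag bit values and function names are hypothetical; the meaning of jz/jb follows x86 (zero flag set, carry flag set after an unsigned compare).

```python
CARRY, ZERO = 1, 2  # hypothetical flag bits

def cmp_flags(a, b):
    """Flags an x86-style CMP leaves for the unsigned subtraction a - b."""
    flags = 0
    if a == b:
        flags |= ZERO
    if a < b:
        flags |= CARRY  # the subtraction needed a borrow
    return flags

def branch(a, b):
    flags = cmp_flags(a, b)  # cmp: the only comparison performed
    if flags & ZERO:         # jz s1
        return "s1"
    if flags & CARRY:        # jb s2
        return "s2"
    return "s3"
```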

yhirai · January 18, 2018, 9:56am

Do the CPUs have those flags for 256-bit arithmetic?

nootropicat · January 18, 2018, 11:45pm

They are equivalent to the flags from the last operation, i.e. the one on the most significant part. The exception is the zero flag, which would have to be an AND of the zero flags from all (64-bit?) subparts.
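
A Python sketch of that composition, assuming four 64-bit limbs (the limb size is a guess in the post itself). Carry and sign are taken from the final, most significant limb addition, while the zero flag is accumulated across all limbs. The overflow flag is left out for brevity.

```python
LIMB_BITS = 64
LIMB_MASK = (1 << LIMB_BITS) - 1

def add256_limbs(a, b):
    """Add two 256-bit values limb by limb; return (result, carry, zero, sign)."""
    result, carry, zero = 0, 0, True
    for i in range(4):
        shift = i * LIMB_BITS
        s = ((a >> shift) & LIMB_MASK) + ((b >> shift) & LIMB_MASK) + carry
        limb = s & LIMB_MASK
        carry = s >> LIMB_BITS            # carry out feeds the next limb
        zero = zero and limb == 0         # AND of the per-limb zero flags
        result |= limb << shift
    sign = bool(limb >> (LIMB_BITS - 1))  # top bit of the last (highest) limb
    return result, carry, zero, sign
```

For example, adding `1 << 64` and `0` leaves the top limb at zero while the whole result is nonzero, which is why the zero flag cannot be read from the last operation alone.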

vbuterin · January 20, 2018, 6:07pm

What does webassembly look like in this regard?

nootropicat · January 22, 2018, 2:19pm

No flags. Interestingly enough, someone made a nearly identical proposal to add them.

github.com/WebAssembly/design

Arithmetic with carry

opened 05:46AM - 28 Mar 17 UTC

ghost

(This is an expanded version of issue (https://github.com/WebAssembly/spec/issue…s/446))

Multi-precision arithmetic requires special handling. This is available in some form in all ISAs, but not currently available in webasm. There are basically three ways of doing this (it seems to me), none of which is especially palatable in the context of the current design:

Option 1. Add a special flags register, together with instructions that do arithmetic with that register. This is the traditional approach in mainstream hardware. You get instructions like addc which will add two numbers, and the value of the carry flag, and set the carry flag on the result. This is not pleasant because it adds a special register.
Option 2. Support a form of 60bit arithmetic with 4 bits for flags.
Option 3. Support multi-word arithmetic: addition takes 3 arguments and produces 2 outputs. This approach will do the least damage to the existing ISA and register architecture. However, this will be hard to map to existing hardware efficiently (not all hardware ISAs support loading the flags register directly).

However it is done, multi-precision arithmetic is quite important and very hard to emulate without some simple hardware support.

From Lars Hansen’s comment:

IMO this functionality is fairly important, and I think the overflow flag is as important as the carry flag though with different use cases (dynamic languages’ fixnum → bignum transition).

That’s a nice optimization idea. It would require the ability to set custom handlers though.