The language design, that on EVM v.s. that on Others

Synthetic Way and Analytic Way

The most of us, including other formal verification project researchers, are relying on the “analytic” way of program verification. That is, program first, verification later. However, using such way means we write a program in a certain language and after that we verify the program satisfying the specification we want; the programming language has no restriction about their programs that can occur with respect to the specification. On the other hand, in using “synthetic” languages such as Coq, we specify the specification precisely and first. Thanking that there is Generalization of Curry-Howard Correspondence, program can be generated if and only if we proved that the program specification is right.

Here we saw two opposite ways of program verification, analytic v.s. synthetic.
I would not mention which is better because the both ways have each weak points.
e.g. synthetic way always can have trivial program. (Even if we write specifications S_1, ... S_n , and if we forget to add a specification S_{n+1} which we need, we can always have the trivial synthetic program. And, the proving technique in synthetic language such as Coq is sometimes much harder than formal verification of program analysis using approximation. )

Guard Condition

Concerning Pen Compiler, currently we have defined Syntax Tree, its Program Graph, and the representation matrix, that is, we have been developing the analytic side.
Particularly, concerning Program Graph, we did not decide how to handle loops in the graph.
One of the analytic way of loop verification is “invariant analyisis” which is studied in the theory of Hoare Logic. However, invariant analysis needs that we have to specify what the invariant is and it might be difficult to find it in cases. Then how to handle loops? For the question, we would like to apply synthetic way to pen compiler. It’s called “guard condition”.
Guard condition asserts that the loop branch condition has inductive data structure and each time we reach the condition, we have to shrink the structure, and it must shrink to just a “point structure”, breaking the loop.
For example, Coq compiler uses the guard condition in order that it assures that programs in Coq stops, and it is statically analyzed.

Many compilers, such as solidity, use Integer, but we might not need it.
Rather, we have Natural Numbers( , positive numbers, and integers as in Coq style).
Negative numbers can be represented by natural numbers in double-entry bookkeeping.
We might not need integers in formal ways.

We sometimes say loop branch condition is “noetherian” if it satisfies the guard condition, since the condition sequence is descending and has the end, as descending sequences in algebraic geometry is noetherian if it has the end point named from a great mathematician “Emmy Noether”.

Formal Opcodes

In such assumptions, we cannot use EVM’s ADD Opcode directory in adding two numbers since ADD works exactly as defined in EVM.
We have inductive data structures and we have to embed them into EVM. Since embedded EVM code works and it is called in EVM, static checking does not mean even if we check the code carefully in compile time. So, in embedding data structure, we have to define Formal Addition FADD to be addition with boundary condition check, FMUL to be multiplication with boundary condition check, and so on. Of course, the gas prices of the Formal Opcodes should be cheap.

MSO Logic and Formal Verification

First Order Logic is a Logic with first order variables i.e. variables x that we assign values.
Monadic Second Order (MSO) Logic is a First-order theory with limited use of second order variable X .
MSO logics can represent assumptions which some sort of struectures such as words, or trees need satisfying. It’s a formal logic of structures.
For brevity, one can consider “formal lexer” and “formal parser” with them.

Automata Theories

Given a word, an automaton returns the word can be accepted by it or not.
This is well known automata, which is used in lexers.
There is some variants, weighted one, transducers, infinite ones, tree automata, and, so on.
Particularly, given a tree, a tree automaton returns the tree can be accepted by it or not.
In automata theories, there is one to one correspondence between (word) automata and regular langauges (regular expressions). Besides, there is another correspondence tree automata and regular tree languages.

MSO Logic

Speaking of MSO Theory, surprisingly, there is another one to one correspondence.
Given a MSO Logic formulae, one can construct an automaton which accepts only the words satisfying the formulae. Of course, there is a tree-MSO version.
In the blog, I explain the detail of the conversion from MSO Logic to Automata more formally. See https://ghassheee.github.io/ethereum/510-MSO-Logic.md/.

Applications

Like MSO generated automata can decide that a word can satisfy the MSO Logic formulae,
It is well known that Linear Temporal Logic (LTL) can generat automata, deciding whether a program can satisfy the LTL formulae.
Various researches over this MSO-automata correspondence are now published.
See, e.g. https://janahofmann.github.io/files/dissertation_jana_hofmann.pdf

The Solver Architecture

For the sake of developping solver architecture of logical problems whether a program satisfies an assumption or a proposition, now(In coming months), pen compiler has (will have) Automata.hs (AlternatingAutomata.hs or SomeAutomata.hs) and MSO.hs (LTL.hs or SomeLogic.hs ) which can decide whether a word (a program) satisfy a MSO logic ( logical assumptions that the program must satisfy ).


Updates: I added the blog link which explains MSO-Automata Conversion formally .

EVM-Dafny

I have just known this project which seems to have quite clear and nice architecture.
Evm-dafny enables us to be free from bugs by implementing EVM semantics via Dafny langauge.
Elaboration of verification part is thrown to the Dafny language and we can thank both synthetic verification and analytic verification when we write contract using EVM-Dafny. I would like to try it.
Has anyone already tried it ?

There are some posts around EVM-Dafny language.

1 Like