Latest Casper Basics. Tear it apart


#22

Amazing work! :grinning:

Here’s some suggestions IMHO:

  1. [1 Introduction, page 1, the 3rd paragraph]

    is based on a thirty year old body of research into BFT consensus algorithms such as PBFT

    → is based on a thirty-year-old body of research into BFT consensus algorithms such as PBFT

  2. [2. The Casper Protocol, page 4, Figure2]

    Violate either of these and you lose your deposit.

    → Validators’ deposit would be slashed if they violate either of these commandments.

  3. [3. Enabling Dynamic Validator Sets, page 6, the 6th paragraph]

    • The description of A checkpoint c is called finalized… is duplicated.
  4. [4.1. Long Range Revisions, page 7, the 3rd paragraph]

    Hence, as long as ω ≥ 4δ, it’s guaranteed that malfeasant validators will lose their deposits in all chains that any client accepts.

    • It would be more clear to define the notation ω first. Maybe it could be described in the above line: “so they reject blocks with timestamp ω > 4δ”? (or maybe I misunderstood…?)
    • Could you clarify the difference between ω ≥ 4δ and ω > 4δ?
  5. [4.2. Castastrophic Crashes, page 8, the 5th paragraph]

    malfesant behavior

    malfeasant behavior


#23

Initial comments on Section 2.1 (still working my way through the proofs) -

  • Add figure(s) illustrating examples of the two Casper Commandments, i.e. show examples of what violating each commandment would look like.

  • Clarify the “any previous events” statement by providing examples of possible previous events.

  • Draw figures(s) illustrating i-iv, showing examples of what the converse would look like and explain it, e.g. h(s1) < h(s2) < h(t2) < h(t1), because link h(s2) -> h(t2) would be “contained” within h(s1) -> h(t1).

A couple less important (to me) suggestions -

  • Use of “t” makes me keep having to remind myself “t” doesn’t refer to time.

  • The switch between using s/t and a/b makes things a bit unclear, although I understand the purpose for the switch.


#24

Accepted all changes.

Changed everything to \omega > 4\delta. There’s no substantial difference between \omega \geq 4\delta and \omega > 4\delta.

Also added definition of \omega,

Hence, as long as \omega > 4\delta, it’s guaranteed that malfeasant validators will lose their deposits in all chains that any client accepts, where \omega is the ``withdraw delay’’ (\figref{fig:longrange}), the delay between the end-epoch and when validators actually receive their deposits back.


#25

Here are the changes thus far.

https://www.diffchecker.com/BycSfhsM


#26

Post a PDF of the latest version in the header before the original version, so that people will comment on that and avoid doubling up on the same feedback.

Sorry I can’t make any more comments or tag any more people.

Grammar and readability edits
Please have the font in Arial or some other Sans Serif font.

Use CTRL+F to find the text that I’m quoting.

"when we say “2/3 of validators”, we are referring to the deposit-weighted fraction; that is, a set of validators whose sum deposit size equals to 2/3 of the total deposit size of the entire set of validators.” Maybe to avoid ambiguity have an acronym, e.g. validators with 2/3 of the total deposit size (VaTTTDS, pronounced vat dees).

The most notable property of Casper is that it is impossible for two conflicting checkpoints to be finalized without >1/3 of the validators violating one of the two Casper Commandments/slashing conditions."

…it is impossible for both of two conflicting checkpoints to be finalized unless >1/3 of the validators (by weight) violate one of…

think “four months’ worth of blocks”

should be four months worth of blocks. I don’t think you need the quotation marks. It doesn’t need “think”. Be decisive or say probably four months…

@chris-remus, “recursively justifying” should be defined as it is in the last bullet point of p. 2. However if this document uses usepackage{hyperref} then you can use \hypertarget{recursivelyjustify}{} in a line before the last bullet point of p. 2 (e.g. as below), then use \hyperlink{recursivelyjustify}{recursively justifying}.

188.     \hypertarget{recursivelyjustify}{}
189.	\item A checkpoint $c$ is called \emph{justified} if (1) it is the root, or (2) there exists a supermajority link $c^\prime \to c$ where $c^\prime$ is justified.  \figref{fig:2c} shows a chain of four justified blocks.

314.	Before, a checkpoint $c$ is called \emph{finalized} if it is justified and there is a supermajority link from $c$ to any of its direct children in the checkpoint tree.  Finalization now has one additional condition---$c$ is finalized only if the votes for the supermajority link $c \rightarrow c^\prime$, as well as all of the supermajority links \hyperlink{recursivelyjustify}{recursively justifying} $c$, are included in the block chain before the child of $c^\prime$, i.e., before block number $\h(c^\prime) * 100$.		

Yes I agree to have a different notation to \upnu for a validator set, a capital italic letter as used here.

Note this means that the forward validator set of dynasty d is the rear validator set of dynasty d + 1.

It’s probably better to spell this out for readability, i.e.:

\begin{align*}
   \mathcal{V}_{\operatorname{r}}(d+1) &\equiv \left\{ \upnu : \DS(\upnu) < d + 1 \leq \DE(\upnu)  \right\}  \\	
   \equiv \left\{ \upnu : \DS(\upnu) \leq d < \DE(\upnu)  \right\}
   \equiv \mathcal{V}_{\operatorname{f}}(d)
\end{align*}

There is repetition of two sentences:

We add the condition that c is finalized only if the votes for the supermajority link c -> c′, as well as the supermajority link justifying c, are included in c′’s block chain and before the child of c′—i.e., before block number h(c′) * 100. Isn’t the last phrase “and before the child of c′—i.e., before block number h(c′) * 100” repeating the previous condition, since including these votes in c′ implies that the votes will be included before the child of c′.

We assume all clients have local clocks are perfectly synchronized (any discrepancy can be treated asbeing part of the communication delay).

We assume that all clients have local clocks that are perfectly synchronized (any discrepancy can be treated as being part of the communication delay \delta).

@hwwang

Could you clarify the difference between ω ≥ 4δ and ω > 4δ?

Yes I also had to stop and reflect about that, so a clarification would be good. For the case where ω ≥ 4δ, my understanding is that the validator can withdraw the deposit at that point in time, but they will have no time to launch a successful attack after that as it will be ignored by clients, since ω > 4δ.

If anyone knows how to use strikethrough in these comments, please let me know how.

If a validator sees a slashing violation at time t (that’s the time they hear the later of the two votes),

Which two votes? In a validator’s vote, there is a vote for the source and target checkpoints as their hash and height. So I’m assuming this means that the time of the target checkpoint. Perhaps this should be clarified, e.g.:

If a validator sees a slashing violation at time t (that’s the time they hear the later of the two votes, i.e. the timestamp of the target checkpoint)

Suppose that a large set of slashing violations results in results in two conflicting finalized checkpoints

Suppose that a large set of slashing violations results in two conflicting finalized checkpoints

whose chains that have not yet included the evidence transaction

whose chains have not yet included the evidence transaction

slashing evidence was submitted into a given chain “on time” or submittedand another chain as having accepted it too late

submitted and

I’m not sure what “another chain as having accepted it too late” is meant to mean, however maybe it could be rewritten as “submitted into a chain and then having accepted it too late at the time of another chain”.

In figure 5:

After withdrawal, attack fabricates a conflicting chain

After withdrawal, the attacker(s) (the supermajority validators) fabricates a conflicting chain

because the communityvalidators

because the community validators

thus stopping the attack and removingslashing the attacker.

thus stopping the attack and removing/slashing the attacker supermajority.

PoS is because if the validators are split into two subsets

PoS is valid because…

(however, every validator will lose a large portion of their deposit on one of the two chains due to leaks. If this situation happens, then each validator should simply favor whatever finalized checkpoint it saw first.

(however, every validator will lose a large portion of their deposit on one of the two chains due to leaks. If this situation happens, then each validator should simply favor whatever finalized checkpoint it saw first.)

can detect obviously malfesant behavior

malfeasant

Casper is an PoS-based

Casper is a PoS-based

[6] Bitcoin: A peer-to-peer electronic cash systems

system

Change/add to the following bib refs like so (I can only post at most two links):


As for security and performance, no feedback has come to mind yet, other than:

I look forward to future developments and shall be happy to actively work on such developments.

I guess that further research for other possible attacks and seeking out a security consultant would be good.


#29

Minor formatting typo in introduction:

“work (PoW) ``mining’ and enable” needs the double quote closing ‘mining’.

Should be “work (PoW) ``mining’’ and enable”


#30

Initial Section 3 and General Figure Comments -

Initial Section 3 Comments -

  • Develop figure illustrating the relationship among blocks, dynasties, validators and validator sets.

  • Mention that each dynasty has a validator set V(d) early in the section, maybe by changing “…we define the dynasty of a block.” to “…we define the dynasty of a block and two validator sets of a dynasty.”

  • Change “…will join the validator set…” to “…will join the validator set V(d)…”

  • I start to get lost in the “finalization” definition restriction section.

    -As a first step, I feel like defining “recursively justifying” would help.
    -The h(c’) * 100 kind of hits me out of nowhere. Why 100, because 100 blocks are required to define a checkpoint?
    -Define what “included in the block chain” means.

General Figure Comment -

For all but Figure 2, I’d suggest inverting the figure orientation. Start with the root at the top, then show links propagating beneath the root.


#31

From the proof of theorem 1:

The statement

Let j be the lowest integer such that h(b_j) > h(a_{m+1}); then h(b_{j-1}) < h(a_m)

could use some more justification. From the statement itself I see h(b_{j-1}) < h(a_{m+1}) because h(b_{j-1}) != h(a_m) by rule I and h(b_{j-1}) !> h(a_m) by the choice of j. I see nothing that would prevent interlacing tho. Suppose

h(a_m) = 3
h(b_{j-1}) = 4
h(a_{m+1}) = 5
h(b_j) = 6

Figure 3 makes it clear the heights h(1,2,3,…) do not have to be sequential integers or even all 0 mod the same integer.


#32

I think he means jumping checkpoints. The link definition doesn’t appear to prevent links that skip other checkpoints, not blocks. So in Figure three, imagine a pink line from directly from b_2 to b_4, skipping b_3.


#33

From the definition of finality, it’s known that h(a_{m+1}) = h(a_m) + 1. This is the crucial part that makes interlacing impossible.


#34

That’s what I was missing. Thank you. It might help to put that statement in the proof.


#35

Latest paper.

https://arxiv.org/abs/1710.09437


#36

This paper would be a good candidate for http://ledgerjournal.org, wouldn’t it?


#37

This is not a sentence:

If the validators are split into two subsets, with subset V_A voting on chain A and subset V_B voting on chain B.

Other suggestions that I made before haven’t been changed, e.g.:

Casper is an PoS-based

Use “a” if the noun starts with a consonant sound. Use “an” if the noun starts with a vowel sound.


#38

My error. Thank you! Caught them.


#39

“Inactivity leak” seems risky. It creates a possibility to purposefully DoS good validators so that only bad validators remain. It is not clear to me how mathematically sound is this.

Could you point to any paper in the area of BFT protocols that has a notion of these “inactivity leaks”? I am not aware of any protocol that removes nodes from the vote just because they have not been seen for a long time.


#40

Same feedback as I have had with all of the papers out of Ethereum research, but with more strength this time because this paper is meant to be “lighter” than others.

Use words instead of single roman/greek letters!

For an introductory paper like this, readers should be able to parse the paper in a single pass and in a single sitting without having to have a significant amount of external context. We can safely assume that the reader speaks English (or whatever language it is translated into) which means the reader already has an internal mapping of English words to concepts in their long-term memory. What the reader does not have is an internal mapping of k to checkpoint_height or an internal mapping for s to checkpoint_hash. When you use a bunch of single letter variable names it requires the reader to load the mapping of those variables to something in long term memory (e.g., a word/phrase) into working memory which is incredible small for humans (studies suggest ~7 registers).

When a user is trying to read a paper with many such variables they have two options, one is to try to retain all of the new variables in working memory while they read the paper, which doesn’t leave much space available for actual processing of the text or they can drop the variables from working memory and perform a visual lookup every time they encounter the variable. In both of these scenarios the reader is moving through the paper much more slowly and/or getting less out of the paper than if the author would just have used pre-existing mappings (English words/phrases).

As an author, it is easy to fall into the trap of believing that, “its easy to grok” because you are not trying to read the paper from start to finish in a single sitting for the first time with no context. After spending time on a paper like this, with sleep in-between sessions, your brain will move the single-letter variables mapped to concepts into long-term memory where the mapping is just as fast to lookup as an English word/phrase. This is fine if the purpose of your paper is to provide a central document for a very small audience to rally around and the readers will not be single-pass readers.

Since this is meant to be the “Casper for Dummies” paper I strongly believe that it should be targeted at a broader audience with an intent that it is consumable in a single sitting and in a single pass by readers, which means leveraging existing mappings in the readers brain rather than trying to get them to create new ones just to finish the paper.

Side note: If you insist on single letter variable names, don’t use Greek letters because unless the reader can read Greek, the letters are just obscure un-named glyphs. This is orders of magnitude harder for the average human to keep in working memory because they cannot assign a word to it, and they have to perform image recall and matching which is significantly harder for humans to do than spoken word matching.

As an exercise for the author(s), do a find/replace in the paper of all variables with Korean/Chinese/Japanese letters (pick a language you do not speak) and then try to read the paper again without taking time to memorize the mappings ahead of time. This is what reading the paper for the first time is like for everyone else, except for everyone else! (it is actually much worse for the rest of us because we don’t know in advance what we are reading, which means we are trying to grok at the same time)


#41

Just below Figure 1 there are a series of bullet points and they reference c'. However, from hat I can tell c' has not been defined yet. This is particularly confusing because the last two bullets reference c' -> c and c -> c' yet up to this point in the paper my understanding was that checkpoint links were unidirectional, not bidirectional. Also, why the change from a and b to c? It feels like the variables in this section are not particularly consistent, as in other places they are b with subscript numbers, (rather than c/c' or a/b).

I recommend picking one nomenclature for referring to checkpoints and sticking with it. a-z works, or b with subscript works. c' is a bit meh because you end up having to count the ticks if you have more than 2.

It looks like s and t are also used in the following sections, further adding to the confusion.


#42

Will you be publishing updated versions to incorporate additional feedback? Or should we consider the published version final and closed for comments?


#43

Still accepting edits. But at DEVCON so slow reaction time.