Hashgraph Consensus Timing Vulnerability

kladkogex · June 2, 2018, 9:00am

There is now a lot of interest in “Hashgraph-the-Blockchain-Killer” in the Silicon Valley investment community, so I have been asked recently to do a review of the Hashgraph whitepaper. I am publishing it here because it may be interesting for some people

My humble opinion after reading Hashgraph whitepaper is that every page potentially contains vulnerabilities, since they have proofs that are not really proofs, theorems that are not theorems and lemmas which are not lemmas…Every object is vaguely defined, so it becomes hard to analyze.

Hashgraph is a directed acyclic graph (similar conceptually to IOTA I guess). Then they need to somehow transform a DAG into a linearly ordered chain, at this moment they are using a terribly flawed time-stamp-based “consensus” as explained, below.

If you go to page 11 of the Hashgraph whitepaper, the paragraph that starts with

Then, the received time is calculated. … The received time for x is the median of all
such timestamps …

you see that the way they go from a graph to a chain is by taking the median received time T_{median} for each transaction and then ordering transactions according to T_{median}.

T_{median} is a median of the reported received time for each transaction. Since the network is asynchronous and transactions are gossiped different nodes end up receiving transactions at different times.

This is imho the weakest point of their whitepaper for the following reason: if I am a bad guy, I can withhold reporting for a while, wait until the chain is settled and smart contract is executed, and a smartcontract and then report a transaction screwing the entire system.

Lets consider the following example where there are two transactions A and B, there are four good nodes and one malicious node.

The way an attack goes is as follows:

The good guys report received time 1.3, 1.4, 1.5, 1.6 for the transaction A and 1.3, 1.42, 1.42, 1.7 for transaction B. The bad guy waits.
T_{median} for A is 1.45 for A and 1.42 for B. So B gets included in the chain before A.
A smart contract is executed on the chain.
A bad guy comes a minute later, and reports 1.2 for A and 1.7 for B. Now
the median time for A is 1.4 and median time for B is 1.42. So now A goes before B.
Since a smart contract has already been executed under assumption that B goes before A , there is a contradiction. This logical contradiction causes an security harm to the chain beyond repair.

The example above is a simple scenario, one can cook up lots of more complex scenarios - basically using time stamps opens up a Pandora box. The only condition under which Hashgraph may be secure is if 100% of the nodes are honest.

Ironically, in 80s and 90s before the seminal paper from MIT (Barbara Liskov) came out, people tried all kinds of variations for time-stamp-based consensus systems and all of them failed.

naterush · June 2, 2018, 11:02am

Calculating the received time for some transaction as “the median of all such timestamps” is not just as simple as having each node shoot out a number and then taking the median of these results. It requires nodes to gossip about gossip (at least in 2 rounds), which then allows nodes to detect if the ordering of certain events (and these events timestamps) to be finalized.

That is, if we can conclude that B before A is safe, then it is impossible for any set of non-byzantine nodes to change this (at least - if you trust the Hashgraph safety proof - which I think I do ).

This video has a pretty good overview (an explanation for how timestamps are calculated, if you’re interested!

kladkogex · June 2, 2018, 11:33am

Gossip-about-gossip will not help. Whatever the mechanism to select the guys that do timestamps is, one of the guys can turn out to be bad.

It is as simple as that. If you let some guys decide the ordering, whatever the filtering mechanism is, once the filtering is done, one of these guys can turn out to be bad.

This is exactly how the FLP theorem is proven the that a deterministic consensus is impossible in finite number of steps. Whatever the finite algorithm is, it will have the last step. And the node doing the last step may turn out to be bad. Moreover, filtering out this guy is mathematically impossible simply because all his actions except the last one can be good.

MaxC · June 2, 2018, 12:14pm

I think a problem with hashgraph is that the 2/3 super-majority thing requires you to know who the nodes are, who is online etc. Not sure if that is currently possible in an open setting, like bitcoin but probably in permissioned block-chain’s it’s alright.

More research needs to be done.

naterush · June 4, 2018, 4:40am

That being said, even though the final validator can affect the outcome, we can still give a bound on the possible values we will settle on for the timestamp before they commit to some timestamp.

For example, imagine we’ve seen the timestamps [1, 2, 3, 4, 5], and we are waiting for a final validator to add one more timestamp. Because we use the median to calculate the timestamp, we know the final timestamp will either be a 2, 4, or something in between these two.

But I’m not actually sure that’s what they do, so probably best to look into their specific algorithm for calculating these timestamps ASAP

craigdrabiktxmq · June 13, 2018, 5:05pm

The scenario described isn’t really a valid scenario given the mechanics of how a Hashgraph network operates. It’s important to note that the ability to determine consensus order for a round doesn’t require all nodes’ gossip - it requires that a supermajority of events decide a famous witness. Once a famous witness has been determined, then the events in that round can be ordered. In other words, a bad actor holding back gossip can’t block consensus indefinitely. It may affect the ordering of a transaction, however in order to “sneak in” an illicit transaction, it would have to convince enough nodes that the illicit transaction was received across the network before the real transaction. That would require collusion beyond the threshold of a byzantine fault tolerant system.

Another issue with the sequence of steps above is that the operation on the smart contract - I’ll use the term “business logic” since private hashgraph networks don’t have smart contracts per se - is executed at two different points in time. That won’t happen. Transaction processing happens at two points: First, when a node receives a transaction (e.g. pre-consensus) and again when a consensus order has been reached. It’s left up to the application to decide whether or not it needs to or wants to process pre-consensus transactions. Regardless, the results of those transactions are “for information only”, and modifications to the application state by pre-consensus transactions are thrown away. Hashgraph organizes consensus into rounds, in which a set of transactions have their order decided by the network. All nodes process the transactions in the set, in the consensus order. Either the business logic runs, or it doesn’t. Either the transaction has had a consensus order and timestamp assigned, or it hasn’t. If it hasn’t, nobody will process the transaction. If it has, then everybody will process the transaction. The subsequent gossip of the same transaction by the bad actor would be invalidated.

When nodes gossip transactions, receipts for that gossip are captured and disseminated through gossip - that’s where the term “gossip about gossip” comes from. So let’s say that Alice, Bob, Carol, and Dave are participating in a network, and Bob is a bad actor. A transaction comes in through Dave. Dave tells Alice and Bob. Alice tells Bob and Carol, and so on. When Dave gossips to Alice and Bob, he generates receipts that log that he told Alice and Bob about a transaction at a given time. Ditto for subsequent gossips. This is the information that actually makes up the Hashgraph. When Bob finally gossips, we’d in effect be saying “no, I didn’t hear about the transaction from Dave at timestamp X, I heard about it at timestamp Y”. This is detectable. All the nodes have a history of the gossip and it’s easy to analyze the history of the gossip for the transaction and identify that something funny happened with Bob’s gossip.

Hope this clarifies a bit. For sure it’s useful to have folks poking at and asking questions about these protocols. Certainly there are areas where every algorithm can be improved.

kladkogex · June 13, 2018, 5:55pm

This clarifies nothing ) Random collection of irrelevant thoughts is not a substitute for mathematics.

Craig - let me make things very very simple for you

Let us have a bet for $10,000 - I will pay you $10,000 if you get three tenured professors from Computer Science /Electrical Engineering departments of the following universities to say in writing that “the vulnerability described in this thread is not a vulnerability and
the Hashgraph whitepaper is composed of sound and comprehensive mathematical definitions and proofs”

Here is the list of universities:

Stanford, MIT, Carnegie Mellon, UC Berkeley, Harvard, Princeton, Yale

Now if you dont get writings from these three profs by July 1, 2018, then you will have to pay me 1(one) US Dollar.

Do we have a deal Craig?

craigdrabiktxmq · June 13, 2018, 6:07pm

I’m sure I can handle CMU, since you didn’t exclude Dr. Baird from the conditions of the bet

In all seriousness, I’m a “practical computer scientist” and not a mathematician. I can only say that I know of nobody who has mathematically disproven the proofs offered in the white paper. Leaving the proofs aside, the mechanics described in the scenario don’t match the mechanics of the algorithm. It’s a scenario that can’t happen.

I’d point you at the in depth example in https://www.swirlds.com/downloads/SWIRLDS-TR-2016-02.pdf. A delayed gossip or a forged timestamp doesn’t prevent consensus and can’t unduly affect the ordering without the collusion of more than one node.

If you don’t accept the theoretical assertions in the paper and example, then that’s an area that I don’t have the sophistication to help with.

kladkogex · June 13, 2018, 6:13pm

Craig - frankly this is not about mathematics, this is about moral things.

Imho Hashgraph crosses the moral boundary in their interactions with the public and with people that invested in Hashgraph. The mathematical proofs as well as independent expert opinions should be on their website having the amount of the money they raised. This all seems to be very very troubling.

craigdrabiktxmq · June 13, 2018, 6:22pm

I’m confused. There is no ICO or coin raise for Hashgraph. Companies have certainly been funded both traditionally and through ICO on much, much less.

The algorithm has been debated extensively in their Telegram and Discord channels. The proofs are public, yet I know of no credible anit-proof of proofs that the claims made are demonstrably false. I’d imagine most people would (rightly) view “independent” verification commissioned by Swirlds as suspect. There are plans to release the source for Hedera at the appropriate time for peer review.

I’ll let my previous comments stand on their own, and I’m happy to respond to questions or specific concerns but a philosophical debate is beyond what I’d consider productive discussion.

Signing off…

ahsannabi · December 5, 2018, 3:31pm

Hi, I am doing a PhD on vulnerabilities of blockchains from University of Derby and I can see how Kladko’s arguments are valid.

Median timestamps do not make hashgraph asynchronous, I wonder why the founder claimed that! Then there is evidence in FLP for the timing vulnerability. That’s why PBFT was introduced by Barbara Liskov, which makes sense.

Now for the median time example discussion, any information withheld by famous witnesses can later become an embarassment. The counterexample Craig gave was off the point because what if the transaction comes in through Bob, the bad actor? If he withholds the transaction for some time, we have a failed client!

Therefore, thanks for the point.

AdamGagol · December 6, 2018, 1:20pm

Ok, so I’m not a tenured professor from from any of the universities listed by Kladkogex, but I’m working on consensuses in one of the other DAG projects (alephzero.org), and I think that the argument entirely misses the point (and I can prove it). Why the argument is wrong:

The idea of using median timestamp as an ‘unbiased’ timestamp for an event is based on the fact that at the time when the median of the set of timestamps is calculated, there is no possibility of adding an additional element to this set by a dishonest actor. Hence, the attack proposed by Kladko doesn’t make sense. As we read in the Hashgraph whitepaper in the paragraph above the one quoted by Kladko:

First, the received round is calculated. Event x has a received round of r if that is the first round in which all the unique famous witnesses were descendants of it, and the fame of every witness is decided for rounds less than or equal to r. (emphasis added)

Then we read the whole paragraph partially quoted by Kladko:

Then, the received time is calculated. Suppose event x has a received round of r, and Alice created a unique famous witness y in round r. The algorithm finds z, the earliest self-ancestors of y that had learned of x. Let t be the timestamp that Alice put inside z when she created z. Then t can be considered the time at which Alice claims to have first learned of x. The received time for x is the median of all such timestamps, for all the creators of the unique famous witnesses in round r.

Summing up:

computation of a ‘time received’ for an event happens only when fame of all witnesses of its ‘round received’ is decided.
the ‘time received’ is a median of units which are uniquely defined by the set of the aforementioned famous witnesses. Hence, it is not possible that an additional timestamp withhold by a malicious actor will be included in the computation of this median at any time in the future.

Hashgraph’s white paper might not be the most well-written mathematics, but the arguments in the paper are correct.

Levalicious · December 6, 2018, 2:15pm

Blockquote at the time when the median of the set of timestamps is calculated

Doesn’t that mean the network must be synchronous then in order to be safe as such? I think that’s what Kladko is getting at- it can’t be asynchronous and safe the way it’s constructed.

kladkogex · December 6, 2018, 4:19pm

Well, now we have Ahsan, a PhD candidate.
And Hashgraph still did not present the three professors :-))
So looks like the game is being won by Hashgraph skeptics.

Adam - are you willing to take the bet ? I am willing to accept it 10:1.

If you do not present the proof in terms of the professors, (see the description above) , you pay me $1000. If not, I will pay you $10,000))

maiavictor · December 6, 2018, 4:38pm

As a side question, do we even have a “mathematical proof” that PoW solves the Byzantine General’s Problem probabilistically?

AdamGagol · December 6, 2018, 7:06pm

Levalicious - no, it doesn’t mean that the network must be synchronous. The median is calculated as soon as enough famous witnesses are revealed, no matter how long it takes, so it doesn’t require any assumptions regarding network delays. Hence - it is asynchronous.

kladkogex -I think that math is rather about exchanging arguments and providing proofs than flashing scientific degrees. And regarding arguments - I think that the ‘skeptics’ are loosing since no one is even trying to rebuke my argument. Having said so, I would really like to see more papers from the crypto to be peer-reviewed. However, I’m not willing to take the bet. As I said before - hashgraph is not a well written mathematics. Even though I believe that the proofs are correct, I don’t think that it could pass peer-review in this form.

maiavictor - it is hard to answer this question, since classically BFT is defined for a fixed set of actors, and PoW blockchain is nonpermissioned at its core. Here: https://eprint.iacr.org/2016/454.pdf the authors are showing that the Nakamoto consensus is safe in a way which you could call “probabilistically BFT” provided that the hardness of the puzzle is a function of the network delay (which can be seen as a soft form of synchronicity assumption).

kladkogex · December 10, 2018, 3:46pm

IMHO people learn a lot during graduate study, including moral standards. Some of them decide to break the standards later.

Some people do decide to take an alternative path and never go for graduate study.
IMHO it is totally possible to be as good a researcher, if one does not have a graduate degree.

The problem is when people break academic standards. If you have a proof that is not really a proof you do not call it a proof. If Hashgraph would want to develop a great algorithm and they would have a vision but no proofs they should have stated this clearly in their whitepaper.

The whitepaper in its current form is imho clearly misleading. Thats why imho it should be reviewed by three reasonably sane and independent CS faculty members :)) It is exactly the job of academia to resolve issues like that. In biotech it is normal to require expert opinions of academic scientists. The same should be true for blockchain IMHO.

miles2045 · December 11, 2018, 12:58am

Hi kladkogex,

You might find this helpful:

Cheers.

ahsannabi · December 15, 2018, 11:13pm

Thats too soon a media hype than a cited verifying paper of any mathematical proof. Rather, here is an IEEE paper citing the hashgraph vulnerability:
“The validity is dependent on the (directly or indirectly) outgoing edges of the transaction, which represents the nodes that have validated it. A scale-out throughput can be achieved if the acquirement of the complete graph is not obligated for all nodes. Then, the validity of the transactions is compromised due to its dependency on the validators”.
A Scale-out Blockchain for Value Transfer with Spontaneous Sharding, Zhijie Ren and four more authors. Link here: https://arxiv.org/pdf/1801.02531.pdf
Another way of simply putting it is what Kladko and I have just already pointed out. Bob withholds the transaction. What next? Becomes famous witness, turns nasty. Doesnt become famous witness, the client still cries out where is his product! Because Bob is not just an event! See the double naming of variables? I mean its hideous!
And also still left is my question of why timed medians are asynchronous!
I know every new tech product in the market engages tenure-track professors for verifying their claims, and this initially does create confirmation bias. But for the company it is just shooting themselves in the foot.
AdamGagol, not being hard on you but thats why I entered the academia from development. You never know the why of development because you are looking at it just too closely
Thanks Kladko for making my thesis sparkle!
Craig is on vacations
Sincerely,
Ahsan Nabi Khan

kladkogex · December 16, 2018, 8:06pm

Vow ) Yet another Alice in Wonderland news from Hashgraph ) Coq has nothing to do with proving that they can securely map what they got into a linear chain and run smart contracts.

Hashgraph needs to take its original whitepaper which has an incredible amount of unproven and shaky things, turn it into a mathematical paper and have it peer reviewed. Coq is not going to help with that imho.