Hashgraph Consensus Timing Vulnerability


#1

There is now a lot of interest in “Hashgraph-the-Blockchain-Killer” in the Silicon Valley investment community, so I have been asked recently to do a review of the Hashgraph whitepaper. I am publishing it here because it may be interesting for some people

My humble opinion after reading Hashgraph whitepaper is that every page potentially contains vulnerabilities, since they have proofs that are not really proofs, theorems that are not theorems and lemmas which are not lemmas…Every object is vaguely defined, so it becomes hard to analyze.

Hashgraph is a directed acyclic graph (similar conceptually to IOTA I guess). Then they need to somehow transform a DAG into a linearly ordered chain, at this moment they are using a terribly flawed time-stamp-based “consensus” as explained, below.

If you go to page 11 of the Hashgraph whitepaper, the paragraph that starts with

Then, the received time is calculated. … The received time for x is the median of all
such timestamps …

you see that the way they go from a graph to a chain is by taking the median received time T_{median} for each transaction and then ordering transactions according to T_{median}.

T_{median} is a median of the reported received time for each transaction. Since the network is asynchronous and transactions are gossiped different nodes end up receiving transactions at different times.

This is imho the weakest point of their whitepaper for the following reason: if I am a bad guy, I can withhold reporting for a while, wait until the chain is settled and smart contract is executed, and a smartcontract and then report a transaction screwing the entire system.

Lets consider the following example where there are two transactions A and B, there are four good nodes and one malicious node.

The way an attack goes is as follows:

  1. The good guys report received time 1.3, 1.4, 1.5, 1.6 for the transaction A and 1.3, 1.42, 1.42, 1.7 for transaction B. The bad guy waits.

  2. T_{median} for A is 1.45 for A and 1.42 for B. So B gets included in the chain before A.

  3. A smart contract is executed on the chain.

  4. A bad guy comes a minute later, and reports 1.2 for A and 1.7 for B. Now
    the median time for A is 1.4 and median time for B is 1.42. So now A goes before B.

  5. Since a smart contract has already been executed under assumption that B goes before A , there is a contradiction. This logical contradiction causes an security harm to the chain beyond repair.

The example above is a simple scenario, one can cook up lots of more complex scenarios - basically using time stamps opens up a Pandora box. The only condition under which Hashgraph may be secure is if 100% of the nodes are honest.

Ironically, in 80s and 90s before the seminal paper from MIT (Barbara Liskov) came out, people tried all kinds of variations for time-stamp-based consensus systems and all of them failed.


#2

Calculating the received time for some transaction as “the median of all such timestamps” is not just as simple as having each node shoot out a number and then taking the median of these results. It requires nodes to gossip about gossip (at least in 2 rounds), which then allows nodes to detect if the ordering of certain events (and these events timestamps) to be finalized.

That is, if we can conclude that B before A is safe, then it is impossible for any set of non-byzantine nodes to change this (at least - if you trust the Hashgraph safety proof - which I think I do :stuck_out_tongue: ).

This video has a pretty good overview (an explanation for how timestamps are calculated, if you’re interested!


#3

Gossip-about-gossip will not help. Whatever the mechanism to select the guys that do timestamps is, one of the guys can turn out to be bad.

It is as simple as that. If you let some guys decide the ordering, whatever the filtering mechanism is, once the filtering is done, one of these guys can turn out to be bad.

This is exactly how the FLP theorem is proven the that a deterministic consensus is impossible in finite number of steps. Whatever the finite algorithm is, it will have the last step. And the node doing the last step may turn out to be bad. Moreover, filtering out this guy is mathematically impossible simply because all his actions except the last one can be good.


#4

I think a problem with hashgraph is that the 2/3 super-majority thing requires you to know who the nodes are, who is online etc. Not sure if that is currently possible in an open setting, like bitcoin but probably in permissioned block-chain’s it’s alright.

More research needs to be done.


#5

That being said, even though the final validator can affect the outcome, we can still give a bound on the possible values we will settle on for the timestamp before they commit to some timestamp.

For example, imagine we’ve seen the timestamps [1, 2, 3, 4, 5], and we are waiting for a final validator to add one more timestamp. Because we use the median to calculate the timestamp, we know the final timestamp will either be a 2, 4, or something in between these two.

But I’m not actually sure that’s what they do, so probably best to look into their specific algorithm for calculating these timestamps ASAP :stuck_out_tongue:


#6

The scenario described isn’t really a valid scenario given the mechanics of how a Hashgraph network operates. It’s important to note that the ability to determine consensus order for a round doesn’t require all nodes’ gossip - it requires that a supermajority of events decide a famous witness. Once a famous witness has been determined, then the events in that round can be ordered. In other words, a bad actor holding back gossip can’t block consensus indefinitely. It may affect the ordering of a transaction, however in order to “sneak in” an illicit transaction, it would have to convince enough nodes that the illicit transaction was received across the network before the real transaction. That would require collusion beyond the threshold of a byzantine fault tolerant system.

Another issue with the sequence of steps above is that the operation on the smart contract - I’ll use the term “business logic” since private hashgraph networks don’t have smart contracts per se - is executed at two different points in time. That won’t happen. Transaction processing happens at two points: First, when a node receives a transaction (e.g. pre-consensus) and again when a consensus order has been reached. It’s left up to the application to decide whether or not it needs to or wants to process pre-consensus transactions. Regardless, the results of those transactions are “for information only”, and modifications to the application state by pre-consensus transactions are thrown away. Hashgraph organizes consensus into rounds, in which a set of transactions have their order decided by the network. All nodes process the transactions in the set, in the consensus order. Either the business logic runs, or it doesn’t. Either the transaction has had a consensus order and timestamp assigned, or it hasn’t. If it hasn’t, nobody will process the transaction. If it has, then everybody will process the transaction. The subsequent gossip of the same transaction by the bad actor would be invalidated.

When nodes gossip transactions, receipts for that gossip are captured and disseminated through gossip - that’s where the term “gossip about gossip” comes from. So let’s say that Alice, Bob, Carol, and Dave are participating in a network, and Bob is a bad actor. A transaction comes in through Dave. Dave tells Alice and Bob. Alice tells Bob and Carol, and so on. When Dave gossips to Alice and Bob, he generates receipts that log that he told Alice and Bob about a transaction at a given time. Ditto for subsequent gossips. This is the information that actually makes up the Hashgraph. When Bob finally gossips, we’d in effect be saying “no, I didn’t hear about the transaction from Dave at timestamp X, I heard about it at timestamp Y”. This is detectable. All the nodes have a history of the gossip and it’s easy to analyze the history of the gossip for the transaction and identify that something funny happened with Bob’s gossip.

Hope this clarifies a bit. For sure it’s useful to have folks poking at and asking questions about these protocols. Certainly there are areas where every algorithm can be improved.


#7

This clarifies nothing ) Random collection of irrelevant thoughts is not a substitute for mathematics.

Craig - let me make things very very simple for you :slight_smile:

Let us have a bet for $10,000 - I will pay you $10,000 if you get three tenured professors from Computer Science /Electrical Engineering departments of the following universities to say in writing that “the vulnerability described in this thread is not a vulnerability and
the Hashgraph whitepaper is composed of sound and comprehensive mathematical definitions and proofs”

Here is the list of universities:

Stanford, MIT, Carnegie Mellon, UC Berkeley, Harvard, Princeton, Yale

Now if you dont get writings from these three profs by July 1, 2018, then you will have to pay me 1(one) US Dollar.

Do we have a deal Craig? :joy::joy::joy:


#8

I’m sure I can handle CMU, since you didn’t exclude Dr. Baird from the conditions of the bet :stuck_out_tongue:

In all seriousness, I’m a “practical computer scientist” and not a mathematician. I can only say that I know of nobody who has mathematically disproven the proofs offered in the white paper. Leaving the proofs aside, the mechanics described in the scenario don’t match the mechanics of the algorithm. It’s a scenario that can’t happen.

I’d point you at the in depth example in https://www.swirlds.com/downloads/SWIRLDS-TR-2016-02.pdf. A delayed gossip or a forged timestamp doesn’t prevent consensus and can’t unduly affect the ordering without the collusion of more than one node.

If you don’t accept the theoretical assertions in the paper and example, then that’s an area that I don’t have the sophistication to help with.


#9

Craig - frankly this is not about mathematics, this is about moral things.

Imho Hashgraph crosses the moral boundary in their interactions with the public and with people that invested in Hashgraph. The mathematical proofs as well as independent expert opinions should be on their website having the amount of the money they raised. This all seems to be very very troubling.


#10

I’m confused. There is no ICO or coin raise for Hashgraph. Companies have certainly been funded both traditionally and through ICO on much, much less.

The algorithm has been debated extensively in their Telegram and Discord channels. The proofs are public, yet I know of no credible anit-proof of proofs that the claims made are demonstrably false. I’d imagine most people would (rightly) view “independent” verification commissioned by Swirlds as suspect. There are plans to release the source for Hedera at the appropriate time for peer review.

I’ll let my previous comments stand on their own, and I’m happy to respond to questions or specific concerns but a philosophical debate is beyond what I’d consider productive discussion.

Signing off…