2FA zk-rollups using SGX

TLDR: We suggest using SGX as a pragmatic hedge against zk-rollup SNARK vulnerabilities.

Thank you for the feedback, some of it anonymous, on an early draft. Special thanks to the Flashbots and Puffer teams for their insights.

Construction

Require two state transition proofs to advance the on-chain zk-rollup state root:

  1. cryptographic proof: a SNARK
  2. 2FA: an additional SGX proof

SGX proofs are generated in two steps:

  1. one-time registration: An SGX program first generates an Ethereum (pubkey, privkey) pair. This keypair is tied to a specific SGX enclave, and only signs valid state transitions (pre_state_root, post_state_root, block_root). The pubkey is registered on-chain by verifying an SGX remote attestation which attests to the pubkey being correctly generated.
  2. proving: After registration the SGX program runs the rollup state transition function to sign messages of the form (pre_state_root, post_state_root, block_root). These SGX proofs are verified on-chain by checking signatures against the registered pubkey.
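The two steps above, together with the two-proof requirement, can be sketched as a small model of the on-chain logic. This is only an illustration under stated assumptions: the class name `TwoFactorRollup` and the three injected verifier callables are hypothetical, and the actual cryptographic checks (remote attestation verification, SNARK verification, ECDSA signature recovery) are deliberately left as stubs supplied by the caller.

```python
from dataclasses import dataclass, field
from typing import Callable, Set, Tuple

# (pre_state_root, post_state_root, block_root), as in the post
Transition = Tuple[bytes, bytes, bytes]


@dataclass
class TwoFactorRollup:
    """Hypothetical model of the rollup contract; all verifiers are injected stubs."""
    state_root: bytes
    verify_attestation: Callable[[bytes, bytes], bool]            # (pubkey, attestation)
    verify_snark: Callable[[Transition, bytes], bool]             # (transition, proof)
    verify_signature: Callable[[bytes, Transition, bytes], bool]  # (pubkey, transition, sig)
    registered: Set[bytes] = field(default_factory=set)

    def register(self, pubkey: bytes, attestation: bytes) -> None:
        # One-time registration: accept the pubkey only with a valid attestation.
        if not self.verify_attestation(pubkey, attestation):
            raise ValueError("invalid remote attestation")
        self.registered.add(pubkey)

    def advance(self, transition: Transition, snark_proof: bytes,
                pubkey: bytes, signature: bytes) -> None:
        pre, post, _block_root = transition
        if pre != self.state_root:
            raise ValueError("pre_state_root does not match current state root")
        # Factor 1: the cryptographic proof.
        if not self.verify_snark(transition, snark_proof):
            raise ValueError("invalid SNARK")
        # Factor 2: a signature from a registered SGX key.
        if pubkey not in self.registered:
            raise ValueError("unregistered SGX pubkey")
        if not self.verify_signature(pubkey, transition, signature):
            raise ValueError("invalid SGX signature")
        self.state_root = post
```

The state root only advances when both checks pass, so in this model a forged SNARK alone or a leaked SGX key alone is not enough.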

Context

Early zk-rollups are prone to SNARK soundness vulnerabilities from circuit or proof system bugs. This is concerning because:

  1. complexity: The engineering of zk-rollups is particularly complex. Even bridges, an order of magnitude less complex than rollups, are routinely hacked.
  2. value secured: The value secured by leading zk-rollups is expected to become significantly higher than that of today’s bridges. These large bounties may be a systemic risk for Ethereum.
  3. competition: The zk-rollup landscape is competitive, with first-mover advantages. This encourages zk-rollups to launch early, e.g. without multi-proofs. (See Vitalik’s presentation and slides on multi-proofs.)

Discussion

SGX 2FA is particularly attractive:

  • safety: There is no loss of safety to the zk-rollup—the additional requirement for SGX proofs is a strict safety improvement. Notice that the SGX enclaves do not handle application secrets (unlike, say, the Secret Network).
  • liveness: There is almost no loss of liveness. The registration step does require Intel to sign an SGX remote attestation but:
    • a) The specific SGX application Intel is providing a remote attestation for can be hidden from Intel. Intel would have to stop providing remote attestations for multiple customers to deny remote attestations for a targeted zk-rollup.
    • b) Hundreds of SGX enclaves can register a pubkey prior to the rollup launch. The currently registered pubkeys can generate SGX proofs even if Intel completely stops producing remote attestations for new registrations.
    • c) If required, rollup governance can remove the SGX 2FA.
  • gas efficiency: The gas overhead of verifying an SGX proof is minimal since only an Ethereum ECDSA signature is being verified on-chain (other than the one-time cost of verifying remote attestations).
  • latency: There is no additional proof latency—producing SGX proofs is faster than producing SNARKs. Notice that SGX 2FA provides little value to optimistic rollups which have multi-day settlement and can use governance to fix fraud proof vulnerabilities.
  • throughput: There is no loss of throughput. The Flashbots team has shown Geth can run at over 100M gas per second on a single SGX enclave. If necessary multiple SGX enclaves can work in parallel, with their proofs aggregated.
  • computational resources: The SGX computational resources can be minimal. When the state transition is run statelessly (e.g. see minigeth) there is no need for an encrypted disk and minimal encrypted RAM is required.
  • simplicity: The engineering of SGX is easy relative to SNARK engineering. Geth can be compiled for Gramine with an 11-line diff. The Puffer team is working on a Solidity verifier for SGX remote attestations, supported by an Ethereum Foundation grant.
  • auditability: Auditing the 2FA should be relatively straightforward. The SGX proof verification logic is contained and the incremental smart contract risk from introducing SGX 2FA is minimal.
  • flexibility: Enclaves from non-Intel vendors (e.g. AMD SEV) can replace or be used in parallel to SGX enclaves.
  • bootstrapping: 2FA can be used alone—without SNARK verification—to bootstrap an incremental rollup deployment. (This would be similar to Optimism launching without fraud proofs.)
  • upgradability: The SGX proof verification logic is upgraded similarly to the SNARK verification logic. Previously registered pubkeys are invalidated and the definition of what constitutes a valid pubkey is changed by upgrading the remote attestation verification logic.
  • deactivation: SGX 2FA is removable even without governance. For example, the 2FA could automatically deactivate after 1559 days.
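As a rough illustration of the deactivation and governance-removal points, the check below decides whether an SGX proof is still required at a given time. It is a minimal sketch: the function name, the use of Unix timestamps, and the `governance_removed` flag are assumptions made for the example, and 1559 days is just the example figure from the list above.

```python
SECONDS_PER_DAY = 86_400
DEACTIVATION_DAYS = 1559  # example figure from the post, not a recommendation


def sgx_required(now: int, launch_time: int, governance_removed: bool = False) -> bool:
    """Return True while the SGX second factor must still be checked.

    `now` and `launch_time` are Unix timestamps in seconds (an assumption
    of this sketch). The 2FA turns off either when governance removes it
    or automatically once the deactivation window has elapsed.
    """
    if governance_removed:
        return False
    return now < launch_time + DEACTIVATION_DAYS * SECONDS_PER_DAY
```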

There are also downsides to SGX 2FA:

  • memetics: SGX has a bad reputation, especially within the blockchain space. Association with the technology may be memetically suboptimal.
  • false sense of security: Easily-broken SGX 2FA (e.g. if the privkey is easily extractable) may provide a false sense of security.
  • novelty: No Ethereum application that verifies SGX remote attestations on-chain could be found.
13 Likes

Avoid dependency on SGX at all costs. It has been breached before, and it will be breached in the future. As you mentioned, it has a very bad reputation in the space, for obvious reasons.

5 Likes

The following assumes that attacks against SGX can steal bits of the private key, but only at a rate of a few bits per privkey access, and that the privkey is accessed only to produce signatures.

Consider rotating the keypair with every attestation. Add the pubkey of the new keypair to the tuple signed on every attestation: (pre_state_root, post_state_root, block_root, new_pubkey). The smart contract will then update the 2FA key to new_pubkey.

This increases the gas cost: one more SSTORE per state transition.
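A minimal sketch of the rotation idea, assuming a single active key for simplicity (the original post allows many registered pubkeys). The class name `RotatingKeyVerifier` and the injected `verify_signature` callable are hypothetical, and the signature check itself is a stub.

```python
from dataclasses import dataclass
from typing import Callable, Tuple

# (pre_state_root, post_state_root, block_root, new_pubkey)
Message = Tuple[bytes, bytes, bytes, bytes]


@dataclass
class RotatingKeyVerifier:
    """Hypothetical contract model: each signed message names the next key."""
    state_root: bytes
    current_pubkey: bytes
    verify_signature: Callable[[bytes, Message, bytes], bool]  # injected stub

    def advance(self, message: Message, signature: bytes) -> None:
        pre, post, _block_root, new_pubkey = message
        if pre != self.state_root:
            raise ValueError("pre_state_root does not match current state root")
        # The signature must come from the key that is active right now.
        if not self.verify_signature(self.current_pubkey, message, signature):
            raise ValueError("invalid signature for current 2FA key")
        self.state_root = post
        # The extra storage write: the old key is retired after one use.
        self.current_pubkey = new_pubkey
```

Because new_pubkey is part of the signed tuple, each private key signs exactly one message before being replaced, which is what limits the number of accesses an attacker can observe per key.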

1 Like

If there was a critical failure of SGX (worst case scenario you can imagine, including Intel being actively malicious), what bad things would happen? Assuming the ZK stuff was not broken then would nothing bad happen at all? Is the bad scenario only when both SGX and ZK stuff are broken?

One disadvantage of this is that you don’t get the benefit of an escalating “bug bounty” over time as the system attracts capital. With SGX in front, the set of attackers who can exploit a bug in the ZK stuff is limited to those who can also break SGX: state actors, Intel, and maybe some hardware manufacturers involved in the chip production process. This means everything seems to be fine right up until it catastrophically fails, and that could be a long way off. If the ZK is exposed directly, it is more likely to be attacked early, before too much capital moves in.

3 Likes

Oh, great suggestion! :)

There is no dependency on SGX—that’s the point. When SGX is breached safety falls back to a “vanilla” zk-rollup. SNARKs plus SGX is a strict safety improvement over just SNARKs.

Yes to both questions.

In addition to the “organic” bug bounty for simultaneously breaking both the SNARK and SGX, one could design a “synthetic” escalating bug bounty for breaking either the SNARK or SGX. The synthetic bounty could escalate by, say, 10 ETH per day.
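For concreteness, a linearly escalating synthetic bounty could be computed as below. This is only a sketch: the function name and the `initial_eth` parameter are assumptions, and the 10 ETH per day rate is the example figure mentioned above.

```python
def synthetic_bounty_eth(days_since_launch: int,
                         rate_eth_per_day: int = 10,
                         initial_eth: int = 0) -> int:
    """Bounty (in ETH) for breaking either factor, growing linearly per day."""
    if days_since_launch < 0:
        raise ValueError("days_since_launch must be non-negative")
    return initial_eth + rate_eth_per_day * days_since_launch
```

With the default rate, the bounty reaches 3650 ETH after one year (365 days).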

The endgame for zk-rollups is that SNARKs are sufficient for security thanks to multi-proofs (see links in the post) and formal verification. You can think of SGX security-in-depth as a way to buy time to achieve this endgame and reduce the risk of ever failing catastrophically.

3 Likes

ORs often get criticized for this ability to use governance to fix fraud proof vulnerabilities, because arguably it means that users have to trust the governance mechanism first and foremost, and the fraud proofs are not much more than decoration. Wouldn’t SGX 2FA be a great tool for ORs to minimize the power of governance? Emergency updates would only be allowed if there’s disagreement between the two factors. Other updates would require a notice period of about one month, giving everyone ample time to exit if they disagree.
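The upgrade rule proposed here can be written down as a small policy check. This is a hedged sketch, not a specification: the function names are hypothetical, "disagreement" is modeled as the two factors reporting different post-state roots for the same transition, and the 30-day constant stands in for "about one month".

```python
NOTICE_PERIOD_DAYS = 30  # stands in for "about one month" (assumption)


def factors_disagree(fraud_proof_root: bytes, sgx_root: bytes) -> bool:
    """The two factors disagree if they settle on different post-state roots."""
    return fraud_proof_root != sgx_root


def upgrade_allowed(disagreement: bool, days_since_announcement: int) -> bool:
    """Emergency upgrades need a disagreement; all others need the full notice period."""
    if disagreement:
        return True
    return days_since_announcement >= NOTICE_PERIOD_DAYS
```

Under this rule, governance cannot push an immediate upgrade while both factors agree, which is the sense in which its power is minimized.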

2 Likes

Set up a 2FA network/layer: this would verify the 2FA proofs required by the protocol. The nodes could be run on low-cost hardware (e.g. Raspberry Pis) and could be distributed geographically to ensure redundancy and resilience.

Aggregated proof: the protocol could also use proof aggregation techniques to minimize the overhead of generating and verifying 2FA proofs. This involves grouping multiple proofs together and creating a single, aggregated proof that can be verified more efficiently. Of course, this would need to be designed to be secure and to resist attacks.

1 Like