Enforceable Human-Readable Transactions: how to solve Bybit-like hacks

2-way communication channel with at least one device

The reason we are showing the QR code is that we are nervous the original device is compromised. I’m not sure it makes sense to do our verification with the same device.

any online device that the offline hardware wallet is communicating with may be compromised

Agreed.

We should be targeting regular everyday users

Agreed, but we have a mismatch of expectations. I do not think showing a description that a user has to sign will fix the problem for them.

I do not think adding human-readable text to a transaction will actually solve much.
Decoded calldata is basically 90% of the way there (maybe even more since it’s so specific). The user just needs to understand what each parameter means, and it’s much less effort for the smart contract developers and the entire ecosystem. I think when we start getting into specific examples, we see where this will start to break down.

Example

This is a decoded supply transaction on the Aave contract.

supply(address asset, uint256 amount, address onBehalfOf, uint16 referralCode):
- asset: 0xA0b86991c6218b36c1d19D4a2e9Eb0cE3606eB48
- amount: 1000000000000000000 [1e18]
- onBehalfOf: 0xF8Cade19b26a2B970F2dEF5eA9ECcF1bda3d1186
- referralCode: 0

And here is what it “probably” would be with the text description idea:

“You are depositing 1000000000000000000 [1e18] of asset 0xA0b86991c6218b36c1d19D4a2e9Eb0cE3606eB48 onBehalfOf 0xF8Cade19b26a2B970F2dEF5eA9ECcF1bda3d1186. This will return you the aToken associated with 0xA0b86991c6218b36c1d19D4a2e9Eb0cE3606eB48, which will gain yield in your wallet.”

But the user also has to sign this, even though the decoded calldata already tells them exactly the same thing? I don’t see how this is advantageous. The only nice addition is the “This will return you the aToken associated with...” part. However, a regular user should probably still refer to the documentation to understand everything that’s going on here. For example, a normal user will go:

  • “1000000000000000000 tokens! That’s too many!”

So they need to do some research to understand decimals in ERC20s.
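To make the decimals point concrete, here is a rough sketch of the scaling a wallet or a user would have to do. The helper name format_units is my own, and the decimals value is assumed to come from the token contract or a trusted token list; nothing here looks it up on-chain.

```python
def format_units(raw_amount: int, decimals: int) -> str:
    """Scale a raw, non-negative ERC20 integer amount by the token's decimals."""
    whole, frac = divmod(raw_amount, 10 ** decimals)
    if decimals == 0:
        return str(whole)
    # Left-pad the fractional part to `decimals` digits, then trim trailing zeros.
    frac_str = str(frac).rjust(decimals, "0").rstrip("0")
    return f"{whole}.{frac_str}" if frac_str else str(whole)


raw = 1000000000000000000  # the raw amount from the decoded calldata above

as_18_decimals = format_units(raw, 18)  # what it means for an 18-decimal token
as_6_decimals = format_units(raw, 6)    # what it means for a 6-decimal token
```

The same raw number is 1 token with 18 decimals and 1000000000000 tokens with 6 decimals, so neither the decoded calldata nor a text description is safe to read without knowing which case applies.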

  • 0xF8Cade19b26a2B970F2dEF5eA9ECcF1bda3d1186: yeah that looks like my address…

Now we are back to the user potentially falling victim to an address poisoning attack!
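A small sketch of why eyeballing the first and last few characters is not enough. The look-alike address below is made up for illustration, and the helper names are my own.

```python
import string


def normalize(addr: str) -> str:
    """Strip an optional 0x prefix, validate 40 hex chars, and lowercase."""
    body = addr[2:] if addr[:2].lower() == "0x" else addr
    if len(body) != 40 or any(c not in string.hexdigits for c in body):
        raise ValueError(f"not a 20-byte hex address: {addr!r}")
    return body.lower()


def same_address(a: str, b: str) -> bool:
    """Full comparison of all 40 hex characters, ignoring checksum casing."""
    return normalize(a) == normalize(b)


def looks_alike(a: str, b: str, edge: int = 4) -> bool:
    """Different addresses that share their first and last `edge` hex characters."""
    x, y = normalize(a), normalize(b)
    return x != y and x[:edge] == y[:edge] and x[-edge:] == y[-edge:]


mine = "0xF8Cade19b26a2B970F2dEF5eA9ECcF1bda3d1186"
# Hypothetical poisoned address: same first and last 4 characters, different middle.
poisoned = "0xF8Ca" + "0" * 32 + "1186"
```

Here looks_alike(mine, poisoned) is True while same_address(mine, poisoned) is False, which is exactly the check a tired human skips.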

The users (IMO) will have the same questions with the “signed description” as the decoded calldata. A hardware wallet could very easily have a list of known ABIs (many already do) to decode the calldata for the user.
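As a rough illustration of the "list of known ABIs" idea, here is a minimal decoder that only handles static types (address and unsigned integers). The two selectors are the ones visible in the calldata in this post; the table layout and function names are my own assumptions, not how any particular wallet does it.

```python
# Selector -> (function name, [(parameter name, type), ...]).
KNOWN_ABIS = {
    "617ba037": ("supply", [("asset", "address"), ("amount", "uint256"),
                            ("onBehalfOf", "address"), ("referralCode", "uint16")]),
    "095ea7b3": ("approve", [("spender", "address"), ("amount", "uint256")]),
}


def decode_call(calldata_hex: str):
    """Decode calldata whose parameters are all static 32-byte words."""
    raw = bytes.fromhex(calldata_hex.removeprefix("0x"))
    selector, args = raw[:4].hex(), raw[4:]
    if selector not in KNOWN_ABIS:
        raise KeyError(f"unknown selector 0x{selector}")
    name, params = KNOWN_ABIS[selector]
    if len(args) != 32 * len(params):
        raise ValueError("argument length does not match the known ABI")
    decoded = {}
    for i, (pname, ptype) in enumerate(params):
        word = args[32 * i: 32 * (i + 1)]
        if ptype == "address":
            decoded[pname] = "0x" + word[12:].hex()  # last 20 bytes of the word
        else:
            decoded[pname] = int.from_bytes(word, "big")
    return name, decoded


def word(value: int) -> str:
    """Encode an integer as one 32-byte ABI word (hex, no prefix)."""
    return value.to_bytes(32, "big").hex()


# Rebuild the supply call from the example above and decode it again.
example_calldata = ("0x617ba037"
                    + word(0xA0b86991c6218b36c1d19D4a2e9Eb0cE3606eB48)
                    + word(10 ** 18)
                    + word(0xF8Cade19b26a2B970F2dEF5eA9ECcF1bda3d1186)
                    + word(0))
example_name, example_args = decode_call(example_calldata)
```

An unknown selector raises instead of guessing, which is roughly what I would want from a device: refuse to pretend it understands a call it has no ABI for.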

Bigger Example

Now imagine a user wants to do the same thing (supply a token to Aave) but through a Safe wallet, and batch the approval first.

Full (example) calldata:

0x6a761202000000000000000000000000f220d3b4dfb23c4ade8c88e526c1353abacbc38f00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000140000000000000000000000000000000000000000000000000000000000000000100000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000034000000000000000000000000000000000000000000000000000000000000001c48d80ff0a00000000000000000000000000000000000000000000000000000000000000200000000000000000000000000000000000000000000000000000000000000172005a7d6b2f92c77fad6ccabd7ee0624e64907eaf3e00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000044095ea7b300000000000000000000000078e30497a3c7527d953c6b1e3541b021a98ac43c000000000000000000000000000000000000000000000002b5e3af16b18800000078e30497a3c7527d953c6b1e3541b021a98ac43c00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000084617ba0370000000000000000000000005a7d6b2f92c77fad6ccabd7ee0624e64907eaf3e000000000000000000000000000000000000000000000002b5e3af16b18800000000000000000000000000009467919138e36f0252886519f34a0f8016ddb3a300000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000041000000000000000000000000f8cade19b26a2b970f2def5ea9eccf1bda3d118600000000000000000000000000000000000000000000000000000000000000000100000000000000000000000000000000000000000000000000000000000000

Here is the (nested) decoded calldata. So the multiSend contract accepts “packed” bytes to save gas, and I have to decode the bytes sent to that contract differently than a normal decoder.

function execTransaction(
        address to,
        uint256 value,
        bytes data,
        Enum.Operation operation,
        uint256 safeTxGas,
        uint256 baseGas,
        uint256 gasPrice,
        address gasToken,
        address payable refundReceiver,
        bytes signatures
    )
- to: 0xf220D3b4DFb23C4ade8C88E526C1353AbAcbC38F
- value: 0
- data: 
---- (decoded A): "multiSend(bytes)":
-------- (decoded B [TX 1]):
------------ operation: 0x00 (call)
------------ to: 0x5a7d6b2f92c77fad6ccabd7ee0624e64907eaf3e
------------ value: 0
------------ dataLength: 68
------------ (decoded C 1): "approve(address,uint256)":
---------------- spender: 0x78e30497a3c7527d953c6B1E3541b021A98Ac43c
---------------- amount: 50000000000000000000 [5e19]
-------- (decoded B [TX 2]):
------------ operation: 0x00 (call)
------------ to: 0x78e30497a3c7527d953c6b1e3541b021a98ac43c
------------ value: 0
------------ dataLength: 132
------------ (decoded C 2): "supply(address,uint256,address,uint16)"
---------------- asset: 0x5A7d6b2F92C77FAD6CCaBd7EE0624E64907Eaf3E
---------------- amount: 50000000000000000000 [5e19]
---------------- onBehalfOf: 0x9467919138E36f0252886519f34a0f8016dDb3a3
---------------- referralCode: 0
- operation: 1
- safeTxGas: 0
- baseGas: 0
- gasPrice: 0
- gasToken: 0x0000000000000000000000000000000000000000
- refundReceiver: 0x0000000000000000000000000000000000000000
- signatures: 0x000000000000000000000000f8cade19b26a2b970f2def5ea9eccf1bda3d1186000000000000000000000000000000000000000000000000000000000000000001
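For anyone wondering what decoding the packed bytes involves, here is a sketch of the unpacking step. My understanding of the multiSend layout is that each entry is 1 byte of operation (0 = call, 1 = delegatecall), 20 bytes of target address, a 32-byte value, a 32-byte data length, then the data itself, with no padding between entries. The helper names are mine, and the payload is rebuilt from the decoded values above rather than sliced out of the raw hex.

```python
from typing import List, NamedTuple


class PackedTx(NamedTuple):
    operation: int
    to: str
    value: int
    data: bytes


HEADER = 1 + 20 + 32 + 32  # operation + to + value + data length


def pack_tx(operation: int, to: str, value: int, data: bytes) -> bytes:
    """Pack one inner transaction the way multiSend expects (no padding)."""
    return (bytes([operation]) + bytes.fromhex(to.removeprefix("0x"))
            + value.to_bytes(32, "big") + len(data).to_bytes(32, "big") + data)


def unpack_multisend(packed: bytes) -> List[PackedTx]:
    """Walk the packed bytes and split them into inner transactions."""
    txs, pos = [], 0
    while pos < len(packed):
        header = packed[pos: pos + HEADER]
        if len(header) < HEADER:
            raise ValueError("truncated entry header")
        length = int.from_bytes(header[53:85], "big")
        data = packed[pos + HEADER: pos + HEADER + length]
        if len(data) != length:
            raise ValueError("truncated entry data")
        txs.append(PackedTx(header[0], "0x" + header[1:21].hex(),
                            int.from_bytes(header[21:53], "big"), data))
        pos += HEADER + length
    return txs


def addr_word(addr: str) -> bytes:
    """Left-pad a 20-byte address to a 32-byte ABI word."""
    return bytes.fromhex(addr.removeprefix("0x")).rjust(32, b"\x00")


TOKEN = "0x5a7d6b2f92c77fad6ccabd7ee0624e64907eaf3e"
POOL = "0x78e30497a3c7527d953c6b1e3541b021a98ac43c"
ON_BEHALF_OF = "0x9467919138e36f0252886519f34a0f8016ddb3a3"
AMOUNT = (50 * 10 ** 18).to_bytes(32, "big")

approve_data = bytes.fromhex("095ea7b3") + addr_word(POOL) + AMOUNT
supply_data = (bytes.fromhex("617ba037") + addr_word(TOKEN) + AMOUNT
               + addr_word(ON_BEHALF_OF) + (0).to_bytes(32, "big"))

packed = pack_tx(0, TOKEN, 0, approve_data) + pack_tx(0, POOL, 0, supply_data)
inner_txs = unpack_multisend(packed)
```

The rebuilt payload is 370 bytes (0x172), the same length as in the raw calldata, and the inner data lengths come out as 68 and 132. Each inner data blob still needs its own ABI decoding, which is where the nesting comes from.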

How do you possibly turn this into English while making it easier for someone to understand? And since there are nested calls between contracts, does the final description have to make a lot of view calls between contracts?

Here is what AI (Claude Sonnet 4) told me:

"""
This is a DeFi lending transaction that performs two operations in sequence:

  1. Token Approval: The wallet first approves a lending protocol (contract 0x78e30497...) to spend 50 tokens (50,000,000,000,000,000,000 wei) of a specific ERC-20 token (contract 0x5a7d6b2f...).
  2. Supply/Deposit: The wallet then calls the lending protocol’s supply function to deposit those same 50 tokens into the protocol on behalf of address 0x9467919138....

In summary: This transaction is depositing 50 tokens into a DeFi lending protocol (likely Aave or a similar platform) to earn yield. The tokens will be supplied to the lending pool on behalf of the specified beneficiary address.

The transaction uses Safe’s multiSend functionality to batch these two operations together atomically - meaning both the approval and the supply happen in a single transaction, or neither happens at all.

The signature indicates this transaction was authorized by the Safe wallet owner at address 0xf8cade19....
"""

IMO this is the easiest possible explanation that could be given (because an AI is understanding it on a holistic level), and it is still confusing. As transactions get bigger and bigger, these walls of text will get worse and worse. AND there are still a lot of “gotchas” here.

  1. The user isn’t informed of the refund receiver for the gas payment (refundReceiver). Should we explain that?
  2. The user isn’t informed of the gas token. Do we add that to the explainer?
  3. call vs. delegatecall isn’t explained, even though operation: 1 in the decoded output means the outer call is a delegatecall.
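These gotchas are at least mechanically checkable from the decoded fields. A sketch of what such a check could look like, assuming the Safe convention that operation 0 is a call and 1 is a delegatecall; the function name and the warning wording are my own:

```python
from typing import List

ZERO_ADDRESS = "0x" + "00" * 20


def review_safe_tx(operation: int, gas_price: int,
                   gas_token: str, refund_receiver: str) -> List[str]:
    """Return warnings for decoded execTransaction fields a summary tends to skip."""
    warnings = []
    if operation == 1:
        warnings.append("delegatecall: the target's code runs in the context of the Safe itself")
    elif operation != 0:
        warnings.append(f"unknown operation value {operation}")
    if gas_price != 0:
        warnings.append("gasPrice is non-zero: a gas payment will be made from the Safe")
    if gas_token.lower() != ZERO_ADDRESS:
        warnings.append(f"gasToken is set: {gas_token}")
    if refund_receiver.lower() != ZERO_ADDRESS:
        warnings.append(f"refundReceiver is set: {refund_receiver}")
    return warnings


# Fields taken from the decoded example above.
example_warnings = review_safe_tx(operation=1, gas_price=0,
                                  gas_token=ZERO_ADDRESS,
                                  refund_receiver=ZERO_ADDRESS)
```

For the example this yields a single delegatecall warning. A delegatecall into the multiSend contract is expected here, but the user still has to know that and confirm the target really is the multiSend contract, which a text description does not verify for them.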

Each contract would have its own different explainer, making the “final” transaction very confusing. I don’t see a world where this helps. Personally, I think transactions as of today are technical in nature, and security people at least should be able to understand them before we even consider the masses.

If all hardware wallets have LLMs in them in the future, then great, but we should still have a way for security-conscious people to get higher assurance about the transactions they are sending.

Summary

Arguments against English transactions enshrined in the smart contracts

Argument 1: Human eyes

Users don’t have a good way to extract data from their wallets, so their human eyes are highly likely to miss something (an address, a 0 in a number, etc.) even if we have an English description of the transaction.

Argument 2: Gas

Signing multiple transactions is a waste of gas, not to mention it puts a lot of extra effort on the smart contract developers.

Solutions

Big one

No matter how you look at it, as of today, we need hardware wallets to have a better way to extract the data from them.

Let’s walk before we run. Right now, security researchers have a hard time verifying information on hardware wallets. I know because most people are not getting perfect scores on my wise-signer test.

Solutions

Based on argument 1, no matter how you slice it, we need a way to extract data from a hardware device, or make verification easier. Also, users should probably check the documentation of the tools either way, because descriptions are often confusing. The two proposed solutions:

  1. A QR code with all the data (IMO this is the best option, then a user could easily do #2 here)
  2. A digest where all the data is hashed and a user can compare it to an expected hash.

In both solutions, wallets could decode them, or we could encourage users to go to the documentation.
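A minimal sketch of option 2, the digest comparison. SHA-256 is used only because it ships with Python; real Ethereum tooling would more likely use keccak-256 or an EIP-712 hash, so treat the hash choice and the function names as assumptions.

```python
import hashlib


def payload_digest(calldata_hex: str) -> str:
    """Hash the exact bytes the device is about to sign (SHA-256 as a stand-in)."""
    raw = bytes.fromhex(calldata_hex.removeprefix("0x"))
    return hashlib.sha256(raw).hexdigest()


def grouped(digest: str, size: int = 4) -> str:
    """Split a hex digest into short groups so it is easier to compare by eye."""
    return " ".join(digest[i: i + size] for i in range(0, len(digest), size))


def digests_match(shown_on_device: str, expected: str) -> bool:
    """Case-insensitive comparison of the full digest, not just its edges."""
    return shown_on_device.lower() == expected.lower()


# Two payloads that differ only in the very last byte.
original = "0x617ba037" + "00" * 32
tampered = "0x617ba037" + "00" * 31 + "01"

expected_digest = payload_digest(original)
device_digest = payload_digest(tampered)
display = grouped(expected_digest)
```

digests_match(device_digest, expected_digest) is False even though the two payloads differ by one byte. The comparison itself is better done by a second tool than by eye, which is why I prefer the QR code: it gets the data off the device so a machine can do the check.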