Sharding phase 1 spec (RETIRED)

jannikluhn · March 16, 2018, 10:27am

Failed to tear it apart, it’s too solid! Just some nitpicks:

validator -> collator

Should the shard ID contain (part of) the SMC address? It’s maybe a bit far-fetched, but one could imagine multiple SMCs on the same chain, and in this case the current ID wouldn’t be unique. Mainly worried about some kind of collation replay/slashing, although I don’t think that’s possible with the current spec.

I’m not a native English speaker, but I’m wondering if the plural here is correct (same for “transactions” and “receipts root” in current Ethereum, by the way). After all, it’s “apple tree” and not “apples tree”. Some alternatives that come to mind are “body root/tree” and/or “chunk tree” (“chunk root” doesn’t seem to work either).

jamesray1 · March 16, 2018, 11:30am

Yes, I noticed this too, I think so.

jamesray1 · March 16, 2018, 11:37am

Yes, it probably makes the most sense to have chunk tree, chunk root, receipt root, etc. The alternative would be that the ownership apostrophe is omitted, i.e. chunks’ tree, chunks’ root, receipts’ root, etc. But it is counter-intuitive for the constituent parts to own the whole tree or a root, so chunk tree, chunk root, receipt root, etc. are preferred.

JustinDrake · March 16, 2018, 4:49pm

Collators are not full nodes. They are somewhat like miners in Ethereum 1.0. Full nodes remain unrewarded with sharding.

The entry ticket for a proposer on a single shard is PROPOSER_DEPOSIT + MIN_PROPOSER_BALANCE which is currently 1.1 ETH.
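As a worked example of that sum, here is a minimal sketch in Rust. The 1 ETH / 0.1 ETH split between the two constants is an assumption (this post only gives the 1.1 ETH total), and `entry_ticket_wei` is a hypothetical helper name.

```rust
/// 1 ETH expressed in wei.
const WEI_PER_ETH: u128 = 1_000_000_000_000_000_000;

// Assumed split of the 1.1 ETH total; only the sum is stated in the thread.
const PROPOSER_DEPOSIT: u128 = WEI_PER_ETH; // 1 ETH (assumption)
const MIN_PROPOSER_BALANCE: u128 = WEI_PER_ETH / 10; // 0.1 ETH (assumption)

/// Minimum amount a proposer must lock to become active on one shard.
fn entry_ticket_wei() -> u128 {
    PROPOSER_DEPOSIT + MIN_PROPOSER_BALANCE
}
```

Under these assumed values, `entry_ticket_wei()` comes to 1.1 ETH expressed in wei.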

Too many collators would not reduce throughput (it would allow collators to secure more shards, i.e. more throughput). What do you mean by increased latency? The main reason for having a high deposit is the overhead on the main chain from the SMC.

We will use blockhashes in phase 1 as described in get_eligible_collator. This is imperfect (because of grinding opportunities by miners) but good enough for now. You can find some code for the old spec here.

It’s just a simple keccak hash on the concatenation of the inputs. Both get_eligible_collator and compute_header_hash are helper functions added for clarity of exposition. They can be made private.
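To make the sampling idea concrete, here is a rough sketch in Rust of picking a collator from a pool using a block hash. It is not the spec's code: the standard library's `DefaultHasher` stands in for keccak (which would need an external crate), and the `Address` alias, the `mix` helper, and the exact inputs being hashed are all assumptions for illustration.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Assumed address type: 20 raw bytes.
type Address = [u8; 20];

/// Stand-in for a keccak hash over the concatenated inputs.
/// DefaultHasher is NOT cryptographic; it only keeps the sketch dependency-free.
fn mix(blockhash: &[u8; 32], shard_id: u64) -> u64 {
    let mut hasher = DefaultHasher::new();
    blockhash.hash(&mut hasher);
    shard_id.hash(&mut hasher);
    hasher.finish()
}

/// Pick one collator from the pool, pseudo-randomly but deterministically,
/// from a main-chain block hash and the shard ID.
fn get_eligible_collator(
    pool: &[Address],
    blockhash: &[u8; 32],
    shard_id: u64,
) -> Option<Address> {
    if pool.is_empty() {
        return None; // nobody to sample
    }
    let index = (mix(blockhash, shard_id) % pool.len() as u64) as usize;
    Some(pool[index])
}
```

Because the index depends only on the block hash and the shard ID, whoever can influence the block hash can influence the sample, which is the grinding opportunity mentioned above.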

Well spotted, I fixed it to proposer_registry[proposer_address].balances[shard_id].

Well spotted, fixed

If we need a new SMC then the NETWORK_ID (part of the shard ID) will be updated. I have added SMC_ADDRESS to the list of parameters.

I’m not a native English speaker either but I think you may be right. I’ve followed your suggestion.

jamesray1 · March 16, 2018, 9:49pm

OK thanks for clarifying. I have read the old spec but thought there may be differences in the latest spec.

For latency, I meant the time to gossip a collation to the rest of the network. But more collators do not necessarily mean increased latency.

I edited my initial comment to add:

jamesray1 · March 17, 2018, 6:05am

What differences do you think the phase 2 EVM will have from Py-EVM and Serenity? I am reading through History, state, and asynchronous accumulators in the stateless model and am getting at least part of an answer.

mhchia · March 18, 2018, 5:23am

Just want to confirm, is the collator selected from the collator registry or collator pool?

JustinDrake · March 18, 2018, 10:49am

There will be a bunch of miscellaneous significant differences (some of which I listed as candidates in the phase 2 roadmap, including asynchronous cross-contract calls only, account abstraction, eWASM, archive accumulators, storage rent). I expect these differences to add up to a totally different EVM implementation, although a lot of the differences may be abstracted away in the higher-level programming languages.

That should be collator pool, thanks! Fixed

cspannos · March 18, 2018, 2:45pm

This may be silly, but, should/could this be the signature of a proposer at a given address in a proposal? Or is that overkill? I wonder because proposer address is spelled out in more detail and it made me think that this term could be more complete. But this may be unnecessary.

mhchia · March 19, 2018, 3:27am

Is the last proposal_commitment_slashing duplicate?

kladkogex · March 19, 2018, 10:24am

I think it is a very good start! Now the devil will be in the details

Cryptoeconomic proofs will be a nontrivial part, since the current version of the Truebit protocol is only weakly secure!

Also imho protecting against frontrunning attacks will be a big issue across all components of the system.

skilesare · March 20, 2018, 7:41pm

Please help with what I’m missing here, but I don’t understand Phase 1. What good are collations without state transitions? What are you collating? Why would anyone submit a Blob if that blob isn’t going to actually do anything?

JustinDrake · March 20, 2018, 8:56pm

I’m not sure I understand. Do you mean that the proposer address could be part of the proposal body as opposed to the proposal header?

Fixed

Phase 1 has no TrueBit protocol. Do you mean the phase 3 execution game with executors? If so, why is it weakly secure?

We’re reducing consensus to its core: data availability. Execution (transactions, state, state transitions, validity, state roots, …) is a simpler deterministic execution game (not a consensus game). The default execution engine (the sharded EVM) will be added in phase 2, and will provide “meaning” to blobs that pay gas fees to use the EVM.

Alternative execution engines are possible, and this is facilitated by the natural evolution of consensus abstraction. Taking a historical perspective:

Bitcoin (think “ASIC”)
- Enshrined-in-consensus dApp
- Account abstraction
Ethereum 1.0 (think “CPU”)
- Enshrined-in-consensus dApp engine
- dApp abstraction
Ethereum 2.0 (think “FPGA”)
- Enshrined-in-consensus data availability
- dApp engine abstraction

kladkogex · March 22, 2018, 9:16am

I think I worded that wrong … It may be super strongly secure. What I meant is that we need a document that analyzes the possibilities of front-running attacks throughout the system.

If some agent is paid for some work, there is always a possibility for someone else to steal the submission and get paid. Since there is lots of money potentially involved, in my opinion every single potential vulnerability needs to be analyzed.

IMHO Ethereum needs to stand out and be different by tightly addressing security. There are many projects on the market that imho will crash and fail by not addressing security seriously enough.

For the execution game with executors

“Here is one simple proposal. Allow anyone with ETH in any shard to deposit their ETH (with a 4 month lockup period), and at certain points (eg. once every Casper epoch) give depositors the ability to make claims about the state at some given height. These claims can be published into the blockchain. The claims would be of the form [height, shard, state_root, signature]. From the point of view of a node executing the state, a correct claim is given some reward proportional to the deposit (eg. corresponding to an interest rate of 5%), and a false claim means the claimer is penalized.”

Is this vulnerable to front running? Judging from the description it could be, since I could do no work, and simply intercept and resubmit someone else’s claim …
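For concreteness, the claim format and reward rule in the quote can be sketched in Rust roughly as follows. The field types, the `Settlement` enum, and the `penalty_bps` parameter are assumptions (the quote names the four fields but does not size the penalty), the rate is applied per claim for simplicity, and signature verification is left out.

```rust
/// A claim about the state of one shard at a given height.
/// Field types are assumptions; the quote only names the four fields.
#[derive(Debug, Clone, PartialEq)]
struct Claim {
    height: u64,
    shard: u64,
    state_root: [u8; 32],
    signature: Vec<u8>, // not verified in this sketch
}

/// Outcome for a claimer, in wei.
#[derive(Debug, PartialEq)]
enum Settlement {
    Reward(u128),
    Penalty(u128),
}

/// Settle a claim against the state root this node computed itself.
/// `rate_bps` is the reward rate in basis points (500 = 5%).
/// `penalty_bps` is a hypothetical knob for the size of the penalty.
fn settle(
    claim: &Claim,
    local_root: &[u8; 32],
    deposit: u128,
    rate_bps: u128,
    penalty_bps: u128,
) -> Settlement {
    if &claim.state_root == local_root {
        Settlement::Reward(deposit * rate_bps / 10_000)
    } else {
        Settlement::Penalty(deposit * penalty_bps / 10_000)
    }
}
```

Nothing in this check distinguishes a claimer who executed the state from one who copied the state root out of someone else's claim and signed it, which is the front-running concern.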

This is a particular example, though. A more general question is whether we should have a separate document listing all potential vulnerabilities/weaknesses of the protocols and their impact, from low to high.

vbuterin · March 22, 2018, 2:51pm

Yes, but if you front-run by copying another executor then you’re exposing yourself to a griefing attack from that executor voluntarily burning some portion of their deposit to burn yours. Though I do agree that these kinds of issues absolutely need to be fully considered.

jamesray1 · March 23, 2018, 4:40am

I need to figure out how to implement this in Rust, where a struct accepts an address as an argument. Cross-posting this at https://gitter.im/Drops-of-Diamond/Lobby?at=5ab485e45f188ccc15e8c909; Rust-specific discussion can be had there.

jamesray1 · March 23, 2018, 8:29am

AIUI this is saying that address is an int128 type, but in current implementations it is binary data of length 160 bits.

jamesray1 · March 23, 2018, 8:31am

I’ll just define this like so for now:

	// Assumed address type: 20 bytes (160 bits), as in current implementations
	type Address = [u8; 20];
	struct CollatorPool {
		// size of the collator pool
		collator_pool_len: i128,
		// active collator addresses (a Vec, since Rust array lengths must be compile-time constants)
		collator_pool: Vec<Address>,
		// depth of the empty-slots stack
		empty_slots_stack_depth: i128,
		// stack of empty collator slot indices
		empty_slots_stack: Vec<i128>,
		// top index of the stack
		empty_slots_stack_top: i128,
	}

What is the depth of empty_slots_stack? Is it 1024 words = 1024 * 32 bytes, like the current stack depth?

Or should empty_slots_stack: int128[int128] actually be written in Rust like empty_slots_stack: HashMap<H128, H128>?
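One possible reading of the empty-slots stack, sketched in Rust: vacated pool slots are remembered on a stack and reused before the pool grows. This is an illustration, not the spec's definition; the `register`/`deregister` names, the `Address` alias, and the use of growable `Vec`s (which have no fixed depth) are assumptions.

```rust
/// Assumed address type: 20 raw bytes.
type Address = [u8; 20];

/// Collator pool that recycles vacated slots through a stack of free indices.
#[derive(Default)]
struct CollatorPool {
    /// Slot `i` holds the collator at index `i`, or `None` if vacated.
    slots: Vec<Option<Address>>,
    /// Indices of vacated slots, most recently freed on top.
    empty_slots: Vec<usize>,
}

impl CollatorPool {
    /// Add a collator, reusing a vacated slot when one exists.
    /// Returns the index the collator now occupies.
    fn register(&mut self, addr: Address) -> usize {
        match self.empty_slots.pop() {
            Some(index) => {
                self.slots[index] = Some(addr);
                index
            }
            None => {
                self.slots.push(Some(addr));
                self.slots.len() - 1
            }
        }
    }

    /// Vacate a slot and remember its index for reuse.
    /// Returns the address that was there, if any.
    fn deregister(&mut self, index: usize) -> Option<Address> {
        let removed = self.slots.get_mut(index)?.take();
        if removed.is_some() {
            self.empty_slots.push(index);
        }
        removed
    }
}
```

In this reading the stack stores slot indices rather than addresses, so the pool only grows when no vacated slot is available.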

jamesray1 · March 23, 2018, 9:06am

~~Why make this an arbitrary byte array? Is that for abstraction reasons for custom signature schemes like Lamport signatures, ECDSA, etc.?~~ This is actually meant to be a fixed byte array but the syntax is wrong, see my comment below.

efynn · March 23, 2018, 9:37am

The int128 is the collator’s id in the array, not the address itself.