Exploring Sharding for the Tezos Blockchain

Tutors: Mathias Bourgoin, Julien Tesson

Tezos is a blockchain developed with a focus on governance and security. To this end, Nomadic Labs contributes to an implementation of the Tezos blockchain written in OCaml, a strongly typed functional language. In addition, Nomadic Labs applies formal verification and testing.

Blockchains are decentralized ledgers managed using a distributed peer-to-peer network. Different entities can interact with/within this network: nodes that share a replicated state of the ledger, validators that handle the validity of operations and aggregate them into blocks, etc.

In classic blockchains, each node has to validate every newly received block: it checks each operation applied to the chain's ledger state and, if necessary, runs each smart contract call the block contains.

Depending on the popularity of the blockchain, this can mean long and costly computations, which may lower operation throughput and raise fees.

A popular solution to this performance issue is to distribute computations between peers. This is commonly known as sharding in the blockchain ecosystem. Multiple kinds of sharding exist [1]. They depend on what is distributed (e.g. storage vs. computation) and the way of implementing it. Sharding is still a cutting-edge exploratory field mixing several aspects of computer science including distributed systems and consensus algorithms in adversarial systems.

Goals

The goal of this internship is to experiment with sharding on Tezos. In sharding, the blockchain is split into shards. Here, we will consider each shard as a subnetwork of the whole Tezos network.

A first step will be to distribute validation while keeping the replicated context shared between shards. In this implementation, block producers (called bakers in Tezos) will create blocks divided into domains, each domain being associated with a chunk of the operations in the block. In this step, shard nodes will only validate the operations associated with their respective domains, ignoring/trusting the other parts of the block.
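As an illustration of this first step, here is a minimal OCaml sketch of a block split into per-domain chunks, where a shard node checks only its own chunk. All type and function names (domain_id, block, valid_operation, validate_for_domain) are hypothetical and do not come from the Tezos code base, and the validity check is a stand-in for the real application of an operation to the context.

```ocaml
(* Hypothetical types for illustration only; not the Tezos data model. *)
type domain_id = int

type operation = { source : string; destination : string; amount : int }

(* A block groups its operations into chunks, one chunk per domain. *)
type block = { level : int; chunks : (domain_id * operation list) list }

(* Placeholder check (assumption): a real node would apply the operation
   to the context instead of inspecting its fields. *)
let valid_operation (op : operation) : bool =
  op.amount > 0 && op.source <> op.destination

(* A shard node validates only the chunk of its own domain and trusts
   every other chunk of the block. *)
let validate_for_domain (my_domain : domain_id) (b : block) : bool =
  match List.assoc_opt my_domain b.chunks with
  | None -> true (* the block carries no operation for this domain *)
  | Some ops -> List.for_all valid_operation ops

let example_block =
  { level = 1;
    chunks =
      [ (0, [ { source = "alice"; destination = "bob"; amount = 5 } ]);
        (1, [ { source = "carol"; destination = "carol"; amount = 3 } ]) ] }
```

With this example block, a node of domain 0 accepts the block while a node of domain 1 rejects it. Neither inspects the other's chunk, which is exactly the trust assumption that the later steps would have to relax.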

A second step will focus on distributing the context (the shared state of the ledger) across domains. This will lead to synchronization issues between shards whenever an operation spans several domains.
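To make the synchronization issue concrete, the following OCaml sketch classifies a transfer as local to one domain or spanning two. It assumes, purely for illustration, that each account lives in exactly one domain; the hash-based placement rule and all names (transfer, scope, domain_of_account, classify) are hypothetical.

```ocaml
(* Hypothetical types for illustration only; not the Tezos data model. *)
type domain_id = int

type transfer = { source : string; destination : string; amount : int }

type scope =
  | Intra_domain of domain_id              (* touches a single shard *)
  | Inter_domain of domain_id * domain_id  (* needs cross-shard coordination *)

(* One possible placement rule (assumption): hash the account name.
   Hashtbl.hash returns a non-negative integer. *)
let domain_of_account ~(domains : int) (account : string) : domain_id =
  Hashtbl.hash account mod domains

(* The placement rule is a parameter so that other rules can be tried. *)
let classify ~(domain_of : string -> domain_id) (t : transfer) : scope =
  let src = domain_of t.source and dst = domain_of t.destination in
  if src = dst then Intra_domain src else Inter_domain (src, dst)
```

For instance, classify ~domain_of:(domain_of_account ~domains:4) t indicates whether t can be applied by a single shard. Only the Inter_domain case requires the two shards to agree on an update to both parts of the context.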

A final step might explore how to solve the consensus issues that arise in realistic systems, where a shard cannot blindly trust other shards.

Requirements

You should already be familiar with OCaml. Knowledge of distributed systems would be a plus.

Internship Context

You will work at Nomadic Labs' offices in Paris or Grenoble.

Participating in a large-scale open-source project, you will have to rapidly learn to use collaborative tools (Git, merge requests, issues, GitLab, continuous integration, documentation) and to communicate about your work. The final results might be presented at an international conference or workshop.

You will have a designated advisor at Nomadic Labs. You will be expected to work independently and to propose well-considered solutions to the problems you encounter. You will be encouraged to seek advice from members of the team.

Intellectual Property

All material produced (essays, documentation, code, etc.) will be released under an open-source license (e.g. MIT or CC).