Sharding was first used in databases, where it is a method for distributing data across multiple machines. This scaling technique can be used in blockchains to partition state and transaction processing, so that each node processes only a fraction of all transactions, in parallel with other nodes. As long as enough nodes verify each transaction for the system to maintain high reliability and security, splitting a blockchain into shards allows it to process many transactions in parallel, greatly improving transaction throughput and efficiency. Sharding promises to increase throughput as the validator network expands, a property referred to as horizontal scaling.
We emphasize the three main types of sharding: network sharding, transaction sharding and state sharding. Network sharding handles the way the nodes are grouped into shards and can be used to optimize communication, as message propagation inside a shard can be done much faster than propagation to the entire network. This is the first challenge in every sharding approach, and the mechanism that maps nodes to shards has to take into account possible attacks by an adversary that gains control over a specific shard.
Transaction sharding handles the way the transactions are mapped to the shards where they will be processed. In an account-based system, the transactions could be assigned to shards based on the sender's address.
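As a minimal sketch of such a mapping, the following Python fragment assigns each transaction to a shard from its sender's address. The function names, the dictionary layout of a transaction, and the use of SHA-256 followed by a modulo are illustrative assumptions, not a description of any particular protocol.

```python
import hashlib


def shard_for_sender(sender_address: str, num_shards: int) -> int:
    """Map a sender address to a shard index in [0, num_shards).

    Hypothetical helper: the address is hashed first so that the result
    does not depend on patterns in how addresses are formatted.
    """
    if num_shards <= 0:
        raise ValueError("num_shards must be positive")
    digest = hashlib.sha256(sender_address.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % num_shards


def group_by_shard(transactions, num_shards):
    """Group transactions (dicts with a 'sender' key) by their sender's shard."""
    shards = {i: [] for i in range(num_shards)}
    for tx in transactions:
        shards[shard_for_sender(tx["sender"], num_shards)].append(tx)
    return shards
```

Because the mapping depends only on the sender's address and the number of shards, every node computes the same assignment without exchanging messages. A plain modulo, however, reassigns most addresses whenever the number of shards changes.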
State sharding is the most challenging approach. In contrast to the previously described sharding mechanisms, where all nodes store the entire state, in state-sharded blockchains each shard maintains only a portion of the state. Every transaction that involves accounts in different shards requires exchanging messages and updating state in each of those shards. To increase resilience to malicious attacks, the nodes in the shards have to be reshuffled from time to time. However, moving nodes between shards introduces synchronization overhead, that is, the time taken for the newly added nodes to download the latest state. Thus, it is imperative that only a subset of all nodes be redistributed during each epoch, to prevent downtime during the synchronization process.
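The partial redistribution described above can be sketched as follows. This is an illustrative Python fragment under stated assumptions: the function name, the `fraction` parameter, and the integer `seed` are hypothetical, and a real protocol would derive its randomness from a source that participants cannot bias.

```python
import random


def reshuffle(shards, fraction, seed):
    """Move roughly `fraction` of each shard's nodes to a different shard.

    shards:   dict mapping shard id -> list of node ids
    fraction: share of each shard's nodes to move, between 0 and 1
    seed:     integer seed (assumption: stands in for protocol randomness)
    Returns a new dict; the input is not modified.
    """
    if not 0 <= fraction <= 1:
        raise ValueError("fraction must be between 0 and 1")
    rng = random.Random(seed)
    shard_ids = sorted(shards)
    result = {}
    moving = []  # (old_shard, node) pairs selected for redistribution
    for shard_id in shard_ids:
        nodes = list(shards[shard_id])
        rng.shuffle(nodes)
        n_move = int(len(nodes) * fraction)
        moving.extend((shard_id, node) for node in nodes[:n_move])
        result[shard_id] = nodes[n_move:]  # these nodes keep their state
    for old_shard, node in moving:
        # Send each selected node to a shard other than the one it left.
        candidates = [s for s in shard_ids if s != old_shard] or [old_shard]
        result[rng.choice(candidates)].append(node)
    return result
```

Only the moved nodes have to download a new shard's state, so the remaining nodes in every shard can keep processing transactions while the newcomers synchronize.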
Some sharding proposals attempt to shard only transactions or only state, which increases transaction throughput at the cost of forcing every node either to store large amounts of state data or to be a supercomputer. More recently, at least one claim has been made about successfully performing both transaction and state sharding without compromising on storage or processing power.
However, sharding introduces some new challenges: single-shard takeover attacks, cross-shard communication, data availability, and the need for an abstraction layer that hides the shards. Provided that these problems are addressed correctly, state sharding brings considerable overall improvements: transaction throughput increases significantly due to parallel transaction processing, and transaction fees are considerably reduced. These two criteria, widely considered obstacles to mainstream adoption of blockchain technology, thereby turn into advantages and incentives for it.
While dealing with the complexity of combining network, transaction and state sharding, Elrond's approach was designed with the following goals in mind:
Scalability without affecting availability: Increasing or decreasing the number of shards should affect only a negligibly small vicinity of nodes and should cause no downtime, or at most minimal downtime, while states are updated;
Dispatching and instant traceability: Finding out the destination shard of a transaction should be deterministic and trivial to calculate, eliminating the need for communication rounds;
Efficiency and adaptability: The shards should be as balanced as possible at any given time.
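To make the first two goals concrete, the following Python sketch identifies each shard by a suffix of address bits, so that splitting one shard leaves every other shard untouched and dispatch remains a local bit-mask check. The representation of a shard as a `(bits, suffix)` pair and both function names are assumptions made for illustration; the sketch is not claimed to be the exact scheme used by Elrond.

```python
from typing import List, Tuple

Shard = Tuple[int, int]  # (number of low-order bits, value of those bits)


def shard_of(address: int, shards: List[Shard]) -> Shard:
    """Return the shard whose bit suffix matches the address."""
    for bits, suffix in shards:
        if address & ((1 << bits) - 1) == suffix:
            return (bits, suffix)
    raise ValueError("no shard covers this address")


def split(shards: List[Shard], target: Shard) -> List[Shard]:
    """Split one shard in two by examining one more address bit."""
    if target not in shards:
        raise ValueError("unknown shard")
    bits, suffix = target
    remaining = [s for s in shards if s != target]
    return remaining + [(bits + 1, suffix), (bits + 1, suffix | (1 << bits))]


# Two shards: addresses ending in bit 0 and addresses ending in bit 1.
before = [(1, 0), (1, 1)]
# Split only the second shard; the first one is unaffected.
after = split(before, (1, 1))

assert shard_of(6, before) == shard_of(6, after) == (1, 0)
assert shard_of(5, before) == (1, 1) and shard_of(5, after) == (2, 1)
assert shard_of(7, before) == (1, 1) and shard_of(7, after) == (2, 3)
```

In this sketch only the accounts of the split shard are redistributed, and any node can compute a transaction's destination from the address alone. Balance between shards additionally assumes that addresses are spread roughly uniformly over the low-order bits.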
An illustration of Elrond’s Adaptive State Sharding approach is depicted in the figure below: