Consensus in distributed systems
Consensus is a problem that arises in distributed systems that replicate a common state, such as the data in a database. It is the task of getting all processes in a group to agree on one specific value, based on the values that the individual processes propose. A consensus algorithm cannot simply invent a value: all processes must agree on the same value, and it must be a value that at least one of the processes submitted.
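The two requirements above (every process decides the same value, and that value was proposed by some process) can be written down as simple checks. The following is a minimal sketch, assuming a toy setting with no failures and no message passing; the names `decide` and `check_consensus` and the sample proposals are hypothetical and only serve as illustration.

```python
from collections import Counter


def decide(proposals):
    """Toy decision rule: pick the most frequently proposed value.

    Ties are broken by taking the smallest value, so that every process
    applying this rule to the same proposals reaches the same result.
    """
    counts = Counter(proposals.values())
    best = max(counts.values())
    return min(value for value, count in counts.items() if count == best)


def check_consensus(proposals, decisions):
    """Return True if the decisions satisfy agreement and validity."""
    # Agreement: all processes decided on the same value.
    agreement = len(set(decisions.values())) == 1
    # Validity: every decided value was proposed by at least one process.
    validity = all(d in proposals.values() for d in decisions.values())
    return agreement and validity


# Hypothetical proposals from three processes.
proposals = {"p1": "blue", "p2": "red", "p3": "blue"}
decided = decide(proposals)
decisions = {process: decided for process in proposals}

print(decided, check_consensus(proposals, decisions))
```

A real consensus algorithm has to reach the same outcome even when messages are delayed or processes crash, which this sketch deliberately ignores.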
In the previous example of HappySpouse.com, we can prevent the consistency problem by hiring a run-around clerk who updates the other notebook whenever one of the notebooks is updated. The greatest benefit is that the clerk works in the background, so an update does not have to wait for the other person to record it. Formally speaking, in such a distributed system, one node updates itself locally, and a background process synchronizes all the other nodes accordingly. The drawback is that the nodes are inconsistent for some time; this model is known as eventual consistency.
For example, a customer's call reaches Kaushik's wife first, and before the clerk has had a chance to update Kaushik's notebook, the customer calls back and reaches Kaushik. The customer then gets an inconsistent reply. So, for this eventually consistent solution to work, we have to assume that a customer will not forget things so quickly that he calls back within a few minutes.
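The notebook scenario can be simulated in a few lines. This is a minimal sketch, assuming a single clerk, two notebooks, and no conflicting writes; the classes `Notebook` and `Clerk`, the customer key, and the stored text are all hypothetical.

```python
from collections import deque


class Notebook:
    """One replica of the shared state (a person's notebook)."""

    def __init__(self, owner):
        self.owner = owner
        self.entries = {}


class Clerk:
    """Background synchronizer that copies updates to the other notebooks."""

    def __init__(self, notebooks):
        self.notebooks = notebooks
        self.pending = deque()

    def write(self, notebook, key, value):
        # The local update completes immediately; propagation is deferred.
        notebook.entries[key] = value
        self.pending.append((notebook, key, value))

    def sync(self):
        # Apply every pending update to all notebooks except its source.
        while self.pending:
            source, key, value = self.pending.popleft()
            for notebook in self.notebooks:
                if notebook is not source:
                    notebook.entries[key] = value


wife = Notebook("wife")
kaushik = Notebook("Kaushik")
clerk = Clerk([wife, kaushik])

# The first call reaches Kaushik's wife, who records it locally.
clerk.write(wife, "customer-1", "anniversary reminder")

# The customer calls back before the clerk has run: Kaushik has no entry yet.
stale_read = kaushik.entries.get("customer-1")

# Once the clerk has synchronized, both notebooks agree.
clerk.sync()
fresh_read = kaushik.entries.get("customer-1")

print(stale_read, fresh_read)
```

The window between `write` and `sync` is exactly the period during which a repeat caller can receive an inconsistent reply.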
Also, if we look back at the winning strategy for the Byzantine Generals' Problem, we see that the captains need to reach a consensus in order to distinguish a true message from a lie.
Later in this chapter, we will return to this topic with the proof-of-work algorithm that Bitcoin employs on a blockchain. For now, it is enough to be aware of two facts about consensus:
- Paxos and Raft are two well-known algorithms for solving the consensus problem; Paxos was among the earliest, and Raft came later. Both solve the problem using majority votes in a cluster. They differ mostly in their focus: Raft aims to provide a complete, practical algorithm, whereas Paxos provides the building blocks of a consensus algorithm.
- There are two main ways of reaching consensus in a distributed ledger system: the Practical Byzantine Fault Tolerance (PBFT) algorithm and algorithms for blockchains. Blockchain algorithms can be further classified into proof-of-stake (PoS) and proof-of-work (PoW) algorithms. PoS also has a special form called delegated proof-of-stake (DPoS).
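The majority vote that Paxos and Raft rely on comes down to simple quorum arithmetic. The sketch below is not an implementation of either algorithm; it only illustrates, under the assumption of a fixed cluster of known size, why a majority is a useful threshold: any two majorities share at least one node, so two conflicting values cannot both collect a majority. The function names and the five-node cluster are hypothetical.

```python
from itertools import combinations


def majority(cluster_size):
    """Smallest number of nodes that forms a strict majority."""
    return cluster_size // 2 + 1


def is_accepted(acks, cluster_size):
    """A value counts as accepted once a majority has acknowledged it."""
    return acks >= majority(cluster_size)


def tolerated_failures(cluster_size):
    """Nodes that may crash while a majority can still be formed."""
    return cluster_size - majority(cluster_size)


# Hypothetical five-node cluster.
nodes = ["n1", "n2", "n3", "n4", "n5"]
quorum = majority(len(nodes))

# Exhaustively confirm that every pair of majority-sized groups overlaps.
all_overlap = all(
    set(a) & set(b)
    for a in combinations(nodes, quorum)
    for b in combinations(nodes, quorum)
)

print(quorum, tolerated_failures(len(nodes)), all_overlap)
```

Note that this arithmetic covers crashed nodes only; tolerating nodes that lie, as in the Byzantine setting, requires the stronger guarantees of algorithms such as PBFT.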