The CAP theorem
Before stating the CAP theorem, let's try to understand consistency, availability, and partition tolerance using a real-world problem.
As a married person, I know how miserable a person's life can become if they forget important dates such as their spouse's birthday or anniversary (in most cases, the husband is the culprit, but that is a separate discussion). One of my friends, let's call him Kaushik, saw an opportunity in this and founded a start-up, HappySpouse.com, to address the issue. During a typical business day, Kaushik (K) and his customer (C) would have the following conversation:
K: Hello from HappySpouse.com. How may I help you?
C: Hey, I want you to remember my wife's birthday.
K: Sure! When is it?
C: September 3.
K: (Writing it down on a page exclusive for C.) Stored. Call me any time to remind you of your spouse's birthday again!
C: Thank you!
K: No problem! Your credit card has been charged $0.05.
Kaushik's idea was so simple, needing nothing but a notebook and a phone, yet so effective that it took off like an avalanche. VC firms started pouring in funds, and he started getting hundreds of calls every day. That's where the problem started. More and more of his customers had to wait in a queue to speak to him, and many of them hung up, tired of the waiting tone. Besides, when he was sick for a day and could not come to work, he lost a whole day of business, not to mention all those unsatisfied customers who wanted information that day. So, Kaushik decided to scale up and bring in his wife to help him.
He started with a simple plan to make the business more available to customers:
- He and his wife both got an extension phone
- Customers still dialed the same number
- A PBX routed the customer calls to whoever was free at that moment
A few weeks went by smoothly. One fine morning, he got a call from one of his old customers, Joey (J):
J: Hello, am I speaking to Kaushik from HappySpouse.com?
K: Hi Joey, great you remembered us. What can I do for you?
J: Can you tell me when our anniversary was?
K: Sure. 1 sec, Joey (looking up in his notebook, there was no entry on Joey's page). Joey, I have only your spouse's birthday here.
J: Holy cow! I just called you guys yesterday! (Hangs up!)
How did that happen? Was Joey lying? Kaushik thought about it for a second, and the reason hit him! Had Joey's call reached his wife yesterday? He went to his wife's desk and checked her notebook. Sure enough, the entry was there. He told his wife, and she realized the problem too. What a terrible flaw in this distributed setup! It was not consistent!
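Joey's two calls can be replayed in a few lines of Python. This is only a minimal sketch of the story, not of any real database: the `Operator` class, its methods, and the anniversary date are made up for illustration.

```python
class Operator:
    """One person answering calls with a private notebook (hypothetical model)."""

    def __init__(self, name):
        self.name = name
        self.notebook = {}  # customer -> {event: date}

    def write(self, customer, event, date):
        # Store the date on the customer's own page.
        self.notebook.setdefault(customer, {})[event] = date

    def read(self, customer, event):
        # Returns None when this notebook has no such entry.
        return self.notebook.get(customer, {}).get(event)


kaushik = Operator("Kaushik")
wife = Operator("Wife")

# Yesterday the PBX happened to route Joey to Kaushik's wife.
# "June 12" is an invented date, used only for this example.
wife.write("Joey", "anniversary", "June 12")

# Today the PBX routes Joey to Kaushik, whose notebook never saw that write.
print(kaushik.read("Joey", "anniversary"))  # None
print(wife.read("Joey", "anniversary"))     # June 12
```

Both operators are up and answering, so the setup is available, but the answer depends on who picks up the phone.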
Now, they decided that whenever either of them took a call with something to note, they would also update the other's notebook. That way, they would both have the same up-to-date information. Even if one of them was off work, the other would email the updates so that the absent person could jot them down the next day. That way, they would be both consistent and available.
However, fate had its own plans. Due to this hectic schedule, Kaushik himself forgot his wife's birthday. Now his wife was angry with him and would not share any updates, creating a partition. To patch things up, Kaushik had to make himself unavailable to clients and make up with his wife.
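The new rule might be sketched as follows, under the assumption that every write goes to both notebooks immediately when both people are at work and is queued as an email otherwise. The names `record`, `inbox`, and `catch_up`, as well as the date, are hypothetical.

```python
class Operator:
    """A person with a notebook and an email inbox (hypothetical model)."""

    def __init__(self, name):
        self.name = name
        self.notebook = {}   # (customer, event) -> date
        self.at_work = True
        self.inbox = []      # emailed updates not yet written down

    def apply(self, customer, event, date):
        self.notebook[(customer, event)] = date

    def catch_up(self):
        # Back at work: jot down every emailed update, oldest first.
        for customer, event, date in self.inbox:
            self.apply(customer, event, date)
        self.inbox.clear()
        self.at_work = True


def record(receiver, peer, customer, event, date):
    """Whoever takes the call writes it down and makes sure the peer gets it."""
    receiver.apply(customer, event, date)
    if peer.at_work:
        peer.apply(customer, event, date)          # update the other notebook now
    else:
        peer.inbox.append((customer, event, date))  # email it for tomorrow


kaushik = Operator("Kaushik")
wife = Operator("Wife")

kaushik.at_work = False                                 # Kaushik is sick today
record(wife, kaushik, "Joey", "anniversary", "June 12")  # invented date

kaushik.catch_up()                                      # next morning
print(kaushik.notebook == wife.notebook)                # True
```

Note that the notebooks only agree again after `catch_up()` runs, and that the whole scheme relies on the updates actually being delivered.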
Let's look at the CAP theorem now. It states that a distributed system cannot guarantee all three of consistency, availability, and partition tolerance at the same time. We can pick only two of the three and sacrifice the third, that is, CA, AP, or CP, where:
- Consistency: Once a customer updates information with HappySpouse.com, they will always get the most up-to-date information when they call subsequently, no matter how quickly they call back
- Availability: HappySpouse.com will always be available for calls as long as at least one of them (Kaushik or his wife) reports to work
- Partition tolerance: HappySpouse.com will work even if there is a communication gap/couple-fight between Kaushik and his wife!
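To see why something has to give once a partition occurs, here is a deliberately simplified sketch of the two desks during the couple's fight. The `Desk` class, the `mode` values, and the `Unavailable` exception are assumptions made for illustration. In "CP" mode a desk refuses calls it cannot sync, while in "AP" mode it keeps answering from its own notebook.

```python
class Unavailable(Exception):
    """Raised when a desk refuses a call rather than risk a wrong answer."""


class Desk:
    def __init__(self, name, mode):
        self.name = name
        self.mode = mode          # "CP" or "AP" (hypothetical setting)
        self.notebook = {}
        self.peer = None
        self.partitioned = False  # True while the couple is not talking

    def write(self, key, value):
        if self.partitioned and self.mode == "CP":
            raise Unavailable(f"{self.name} cannot sync, so the call is refused")
        self.notebook[key] = value
        if not self.partitioned:
            self.peer.notebook[key] = value  # keep both notebooks in step

    def read(self, key):
        if self.partitioned and self.mode == "CP":
            raise Unavailable(f"{self.name} cannot confirm the latest entry")
        return self.notebook.get(key)


def open_office(mode):
    k, w = Desk("Kaushik", mode), Desk("Wife", mode)
    k.peer, w.peer = w, k
    return k, w


def start_fight(a, b):
    a.partitioned = b.partitioned = True


key = ("Joey", "anniversary")

# AP: both keep taking calls during the fight, so answers can be stale.
k, w = open_office("AP")
start_fight(k, w)
w.write(key, "June 12")   # invented date
ap_answer = k.read(key)   # None: available, but not consistent

# CP: calls are refused during the fight, so no wrong answer is ever given.
k, w = open_office("CP")
start_fight(k, w)
try:
    w.write(key, "June 12")
    cp_outcome = "stored"
except Unavailable:
    cp_outcome = "refused"

print(ap_answer, cp_outcome)  # None refused
```

A CA choice corresponds to the days when the couple never fights: both desks sync every write, and nobody is turned away.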