The CAP theorem describes the trade-off between three properties of a distributed system: Consistency, Availability and Partition tolerance.
Meaning of CAP
CAP stands for Consistency, Availability and Partition tolerance.
Consistency: Every node holds the same data, so a read returns the latest written value no matter which node serves it.
Availability: Every request made to the distributed system is guaranteed to receive a response with data, rather than an error.
Partition Tolerance: The distributed system continues to work even when a network or node failure prevents some nodes from communicating with each other.
Understanding CP, AP and CA
A distributed system can guarantee only two of the three properties at the same time.
CONSISTENCY + PARTITION TOLERANCE (CP):
The system either returns the latest data or returns an error such as "the latest data cannot be provided at the moment". [Because AVAILABILITY is given up, the client is not guaranteed to receive a response with data.]
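The CP behaviour can be sketched in a few lines of Python. This is only a toy illustration, not a real database: the CPReplica class, its partitioned flag and the UnavailableError exception are hypothetical names invented for this example.

```python
class UnavailableError(Exception):
    """Raised when the replica cannot prove its data is the latest."""


class CPReplica:
    """Toy replica that prefers consistency over availability (hypothetical)."""

    def __init__(self, value):
        self.value = value
        self.partitioned = False  # True while this replica cannot reach its peers

    def read(self):
        # During a partition the replica cannot confirm that its copy is the
        # latest, so it returns an error instead of possibly stale data.
        if self.partitioned:
            raise UnavailableError("latest data cannot be provided at the moment")
        return self.value


replica = CPReplica(value=100)
print(replica.read())  # 100

replica.partitioned = True  # simulate a network partition
try:
    replica.read()
except UnavailableError as err:
    print("error:", err)
```

The client either gets the latest value or an explicit error; it never gets a value that might be out of date.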
AVAILABILITY + PARTITION TOLERANCE (AP):
A client's query always completes with a data response, even during a network or node failure. [Because CONSISTENCY is given up, the value is not guaranteed to be the latest; the server might return stale data.]
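A rough Python sketch of the AP behaviour is shown here. The APReplica class, its partitioned flag and the replicate_from method are hypothetical names made up for illustration; real systems replicate data in far more involved ways.

```python
class APReplica:
    """Toy replica that prefers availability over consistency (hypothetical)."""

    def __init__(self, value):
        self.value = value
        self.partitioned = False  # True while this replica cannot reach its peers

    def replicate_from(self, peer):
        # Updates from the peer only arrive while the network is healthy.
        if not self.partitioned:
            self.value = peer.value

    def read(self):
        # Always answer from the local copy, even if it may be stale.
        return self.value


primary = APReplica(value=1)
follower = APReplica(value=1)

follower.partitioned = True       # the follower loses contact with the primary
primary.value = 2                 # a write reaches only the primary
follower.replicate_from(primary)  # dropped because of the partition

print(primary.read())   # 2
print(follower.read())  # 1 (stale, but the client still gets a response)
```

Both replicas answer every read, but a client that happens to reach the follower sees the old value.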
CONSISTENCY + AVAILABILITY (CA):
A client's query always receives the latest data from the server. [This combination assumes that no network or node failure will ever occur, which is considered unrealistic.]
Real world examples:
CONSISTENCY + PARTITION TOLERANCE:
Let’s say there are two ATMs, each with its own database, and the two databases are kept in sync. Assume there is a network partition between these two ATMs. If a customer now tries to withdraw money from one ATM, the ATM will not serve the request, because there is a partition and the ATMs have chosen CONSISTENCY.
This combination suits systems where returning the latest data is the top priority.
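The ATM scenario can be modelled with a small Python sketch. Everything here is an assumption made for illustration: the ATM class, the link_up flag that simulates the partition, and the helper functions connect and set_link. Real banking systems are far more complex.

```python
class ATM:
    """Toy ATM that chooses consistency over availability (hypothetical)."""

    def __init__(self, balance):
        self.balance = balance  # this ATM's copy of the account balance
        self.peer = None
        self.link_up = True     # False simulates a network partition

    def withdraw(self, amount):
        # Without a link to the peer, this ATM cannot know whether its copy of
        # the balance is current, so it refuses the transaction.
        if not self.link_up:
            return "refused: cannot reach the other ATM"
        if amount > self.balance:
            return "refused: insufficient funds"
        self.balance -= amount
        self.peer.balance = self.balance  # keep the peer's copy in sync
        return f"dispensed {amount}"


def connect(a, b):
    """Make the two ATMs each other's peer."""
    a.peer, b.peer = b, a


def set_link(a, b, up):
    """Bring the link between the two ATMs up or down."""
    a.link_up = b.link_up = up


atm1, atm2 = ATM(500), ATM(500)
connect(atm1, atm2)

print(atm1.withdraw(200))          # dispensed 200
print(atm2.balance)                # 300

set_link(atm1, atm2, up=False)     # network partition
print(atm2.withdraw(100))          # refused: cannot reach the other ATM
print(atm1.balance, atm2.balance)  # 300 300
```

Refusing the withdrawal is inconvenient for the customer, but the two copies of the balance never disagree.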
AVAILABILITY + PARTITION TOLERANCE:
Let’s say a YouTube creator with millions of viewers uploads a video. Assume there is a network partition. When two different users open the video, they may see different view and like counts. Each user receives data, but it is not guaranteed to be the latest; it may be stale.
This combination suits systems where always getting a response is the top priority.
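The view-count scenario can be imitated with two toy replicas in Python. The ViewCounterReplica class and the sync function are hypothetical, and the sync step that reconciles the counts once the partition heals is an assumption added for illustration; it does not describe how YouTube or any real service merges its counters.

```python
class ViewCounterReplica:
    """Toy replica of a video's view counter (hypothetical)."""

    def __init__(self):
        self.views = 0    # views this replica currently reports
        self.pending = 0  # local views not yet sent to the peer

    def record_view(self):
        self.views += 1
        self.pending += 1

    def read(self):
        # Always answers, even if the peer has counted views we have not seen.
        return self.views


def sync(a, b):
    """Exchange unsent views once the two replicas can talk again."""
    a_new, b_new = a.pending, b.pending
    a.views += b_new
    b.views += a_new
    a.pending = b.pending = 0


replica_a, replica_b = ViewCounterReplica(), ViewCounterReplica()

# Network partition: each replica only sees its own traffic.
for _ in range(3):
    replica_a.record_view()
for _ in range(5):
    replica_b.record_view()

print(replica_a.read(), replica_b.read())  # 3 5

sync(replica_a, replica_b)                 # partition heals
print(replica_a.read(), replica_b.read())  # 8 8
```

During the partition both replicas keep answering, so two users see 3 and 5 views for the same video; the counts only agree again after the replicas exchange their updates.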
In the real world, a system that never experiences a partition is hard to achieve. A node or network link can go down at some point for various reasons.
Hence one is left with two choices: CP or AP.
Recommended articles:
https://en.wikipedia.org/wiki/CAP_theorem
Recommended videos:
https://www.youtube.com/watch?v=BHqjEjzAicA