CAP Theorem

  1. CAP stands for Consistency, Availability and Partition Tolerance.
  2. What is availability? Every request sent to a non-failed node must receive a non-error response.
  3. What is consistency? Every read request must return either the most recent data or an error.
  4. What is network partition tolerance? The ability of the system to continue operating and serving requests when a network partition occurs.
  5. What does forfeiting partition tolerance mean? The system makes no guarantees once a partition occurs, so its behaviour during a partition is undefined.
  6. What does the CAP theorem state? In the event of a network partition, a distributed database can guarantee either consistency or availability, but not both. So you end up with either a CP or an AP system.
  7. It’s not possible to build a CA distributed system. A system that forfeits partition tolerance has undefined behaviour during a network partition, so it cannot provide any guarantee on availability (remember, every non-failed node must return a non-error response).
  8. How to achieve strong consistency?
  • In the leader-follower replication model, write requests are always sent to the leader, while read requests can go to either the leader or a follower. If writes are replicated to all of its followers synchronously, a read served by any follower returns the latest data, giving strong read consistency. If writes are replicated to the followers asynchronously, a read served by a follower might return stale data, resulting in an eventual consistency model. Note: Synchronous replication can hurt latency (response time) and availability, because each write has to wait for every follower and a single unreachable follower can block writes.
  • In the leaderless replication model, a coordinator sends each write request to all N nodes and waits for W of them to acknowledge. A read request reads from R nodes and reconciles their responses into the final result. If W + R > N, where N is the number of nodes in the cluster, every set of R nodes overlaps every set of W nodes in at least one node, so a read always sees the latest acknowledged write and the model provides strong consistency. This is called a quorum operation.
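The quorum condition can be checked with a small Python sketch. This is only an illustration under simplifying assumptions: each replica stores a single value tagged with a version number, reconciliation means "pick the highest version", and the names `Versioned`, `quorums_overlap` and `quorum_read` are hypothetical, not taken from any particular database.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass
class Versioned:
    version: int  # higher version means a more recent write (assumption)
    value: str


def quorums_overlap(n: int, w: int, r: int) -> bool:
    """True when every read set of size r shares a node with every write set of size w."""
    return w + r > n


def quorum_read(replicas, read_set):
    """Read from the chosen replicas and reconcile by keeping the highest version."""
    contacted = [replicas[i] for i in read_set]
    return max(contacted, key=lambda item: item.version)


# N = 3 replicas. A write of version 2 was acknowledged by W = 2 of them;
# the third replica still holds the older version 1.
replicas = [Versioned(2, "new"), Versioned(2, "new"), Versioned(1, "old")]

# R = 2: W + R = 4 > 3, so every possible read set contains an up-to-date replica.
for read_set in combinations(range(len(replicas)), 2):
    assert quorum_read(replicas, read_set).value == "new"

# R = 1: W + R = 3 is not greater than 3, so a read that happens to hit
# only the lagging replica returns stale data.
stale = quorum_read(replicas, (2,))
```

With N = 3, choosing W = 2 and R = 2 tolerates one slow or unreachable node for both reads and writes, which is why this combination is a common choice.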

  9. The theorem has a limited scope:

  • It considers a network partition as the only kind of failure. In reality there are other failures, such as node failures and transient network failures.
  • It applies to distributed databases, not single-node databases.
  • It considers only 2 levels of consistency (strongly consistent and eventually consistent), while in reality there are many levels of consistency.
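To make the CP versus AP choice from points 6 and 7 concrete, here is a toy Python simulation of two replicas that lose contact with each other. It is a sketch under heavy assumptions: the `Node` class, its `mode` flag and the `simulate` helper are hypothetical, and a real system would detect partitions and reconcile data in far more elaborate ways.

```python
class Node:
    def __init__(self, mode: str):
        self.mode = mode            # "CP" or "AP" (hypothetical switch)
        self.value = "v1"
        self.can_reach_peer = True  # set to False to simulate a partition

    def write(self, value: str, peer: "Node") -> None:
        if self.can_reach_peer:
            # Healthy network: replicate synchronously to the peer.
            self.value = value
            peer.value = value
        elif self.mode == "CP":
            # CP: refuse the write rather than let the replicas diverge.
            raise ConnectionError("write rejected: peer unreachable")
        else:
            # AP: accept the write locally; the peer is now stale.
            self.value = value

    def read(self) -> str:
        if self.mode == "CP" and not self.can_reach_peer:
            # CP: cannot prove this is the latest value, so return an error.
            raise ConnectionError("read rejected: peer unreachable")
        return self.value


def simulate(mode: str) -> list:
    """Partition two nodes, write to one, read from the other."""
    a, b = Node(mode), Node(mode)
    a.can_reach_peer = b.can_reach_peer = False
    results = []
    try:
        a.write("v2", b)
        results.append("write ok")
    except ConnectionError:
        results.append("write error")
    try:
        results.append(b.read())
    except ConnectionError:
        results.append("read error")
    return results


cp_result = simulate("CP")  # ["write error", "read error"]
ap_result = simulate("AP")  # ["write ok", "v1"]
```

In CP mode both requests fail, so consistency is kept at the cost of availability. In AP mode both requests succeed, but the read returns the stale value "v1".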


