Data Consistency in Distributed Systems
When designing a distributed software system, ask yourself: "Can my system tolerate eventual consistency, or is strong consistency necessary?"
How can you figure out which is right for your system? Also, why does it matter?
Strong Consistency vs. Eventual Consistency
Strong Consistency - The guarantee that after a write to some data, any subsequent read of the same data will reflect what was written.
Eventual Consistency - A write may take some time to be reflected in subsequent reads. Until it has propagated, reads can return stale data; once it has, all reads return the written value.
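The difference between the two definitions can be sketched with a toy store that keeps a primary copy and one follower copy. This is a minimal illustration, not how any particular database works; the `Store` class and its `sync` method are hypothetical names invented for this example.

```python
class Store:
    """Toy key-value store with one primary and one follower copy (hypothetical)."""

    def __init__(self, strong):
        self.strong = strong
        self.primary = {}
        self.follower = {}
        self.pending = []  # writes not yet applied to the follower

    def write(self, key, value):
        self.primary[key] = value
        if self.strong:
            # Strong: copy to the follower before the write is acknowledged.
            self.follower[key] = value
        else:
            # Eventual: acknowledge immediately and replicate later.
            self.pending.append((key, value))

    def read_from_follower(self, key):
        return self.follower.get(key)

    def sync(self):
        # Stand-in for background replication catching up.
        for key, value in self.pending:
            self.follower[key] = value
        self.pending.clear()


strong = Store(strong=True)
strong.write("balance", 100)
print(strong.read_from_follower("balance"))    # 100

eventual = Store(strong=False)
eventual.write("balance", 100)
print(eventual.read_from_follower("balance"))  # None (stale read)
eventual.sync()
print(eventual.read_from_follower("balance"))  # 100
```

In the eventual case, the read that lands between `write` and `sync` is the window in which an application can act on out-of-date data.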
Why Does it Matter?
Whether you use a strongly consistent or an eventually consistent database determines what can go wrong when a read does not reflect the latest write. Here are two examples:
In example 1, the consequences of using an eventually consistent system are far more serious than in example 2.
CAP Theorem
In software systems, the CAP theorem describes the tradeoff between two properties, consistency and availability:
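The idea is that a system must favor either consistency or availability, because the two are at odds with each other: when nodes cannot communicate, a node can either refuse requests to stay consistent or answer them and risk returning stale data.
Note that there is a third property (where the "P" comes from), which is partition tolerance. Network partitions cannot be ruled out in a distributed system, so partition tolerance is typically a given, and the practical tradeoff is between consistency and availability.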
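To make the tradeoff concrete, here is a small sketch of a two-node cluster that is told how to behave during a partition. The `TwoNodeCluster` class, its `mode` values, and the `Unavailable` exception are all hypothetical, chosen only to illustrate the choice; real systems are considerably more nuanced.

```python
class Unavailable(Exception):
    """Raised when the cluster refuses a request rather than risk inconsistency."""


class TwoNodeCluster:
    """Hypothetical two-node store; all writes arrive at node A."""

    def __init__(self, mode):
        if mode not in ("CP", "AP"):
            raise ValueError("mode must be 'CP' or 'AP'")
        self.mode = mode
        self.node_a = {}
        self.node_b = {}
        self.partitioned = False  # True while A and B cannot communicate

    def write(self, key, value):
        if not self.partitioned:
            # Healthy network: both nodes receive the write.
            self.node_a[key] = value
            self.node_b[key] = value
            return
        if self.mode == "CP":
            # Favor consistency: reject the write so the nodes never diverge.
            raise Unavailable("write rejected during partition")
        # Favor availability: accept the write on A; B is now stale.
        self.node_a[key] = value

    def read_from_b(self, key):
        return self.node_b.get(key)


for mode in ("CP", "AP"):
    cluster = TwoNodeCluster(mode)
    cluster.write("stock", 5)
    cluster.partitioned = True
    try:
        cluster.write("stock", 4)
        print(mode, "accepted write; node B reads", cluster.read_from_b("stock"))
    except Unavailable as err:
        print(mode, "refused write:", err)
```

The "CP" cluster stays correct but turns requests away, while the "AP" cluster keeps serving requests at the cost of node B returning an old value.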
Example Systems
Relational databases such as PostgreSQL can be configured to meet strong consistency requirements.
DynamoDB is optimized for availability over consistency: reads are eventually consistent by default, although strongly consistent reads can be requested per operation.
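As a rough sketch of what that configurable option looks like in practice, DynamoDB's `GetItem` operation accepts a `ConsistentRead` flag. The helper below takes the client as a parameter so it can run without AWS access; the `read_item` function, the `_StubClient` stand-in, the table name `Orders`, and the key shape are assumptions made for this example.

```python
def read_item(client, table_name, key, strongly_consistent=False):
    """Fetch one item, optionally requesting a strongly consistent read."""
    response = client.get_item(
        TableName=table_name,
        Key=key,
        ConsistentRead=strongly_consistent,  # False means eventually consistent
    )
    return response.get("Item")


class _StubClient:
    """Stand-in for a DynamoDB client so the sketch runs offline (hypothetical)."""

    def get_item(self, **kwargs):
        print("GetItem called with ConsistentRead =", kwargs["ConsistentRead"])
        return {"Item": {"order_id": {"S": "42"}, "status": {"S": "shipped"}}}


# Hypothetical table and key, for illustration only.
item = read_item(_StubClient(), "Orders", {"order_id": {"S": "42"}},
                 strongly_consistent=True)
print(item["status"]["S"])
```

With boto3 installed and credentials configured, the stub would be replaced by `boto3.client("dynamodb")`. Leaving `strongly_consistent` at its default keeps the eventually consistent behavior.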
Key Takeaways