Apr 5 • Posted 4 years ago
- Part 1 - Introduction and CAP
- Part 3 - Network Partitions
- Part 4 - Multi Data Center
- Part 5 - Multi Writer Eventual Consistency
- Part 6 - Consistency Chart
In Part 1 we discussed C-class and A-class behaviors. For A-class, we need to weaken consistency constraints. This does not mean the system need be completely inconsistent, but it does mean we will need to relax the consistency model to some extent.
Amazon popularized the concept of “Eventual Consistency”. Their definition is:
the storage system guarantees that if no new updates are made to the object, eventually all accesses will return the last updated value.
This is not new, but it is great to have the concept formalized/popularized. A few examples of eventually consistent systems:
- DNS (mentioned in the above paper)
- Asynchronous master/slave replication on an RDBMS (also on MongoDB)
- memcached in front of mysql, caching reads
Many (not all) traditional examples that come to mind have eventually consistent reads, but a single writer (by “single writer”, we mean a data server, not the clients). Things get more interesting — and complex — with when there are many writers. Amazon Dynamo is an example of a “many writer eventually consistent” system. All of the above are perhaps “single writer eventually consistent”.
One other traditional technology worth noting is message queues. It has properties reminiscent of eventual consistency.
Forms of Consistency
Let’s look at a particular example. Consider a system using MongoDB in the following configuration:
"master", "slave", and "slave" could be mongod instances for example — or other databases with asynchronous replication. Clients randomly read from any slave for a given query, and always write to the master. Two slaves and two clients are shown, but let’s assume each of those scale out.
This sort of system we term “single writer eventual consistency”. So what are its properties? (1) A client could read stale data. (2) The client could see out-of-order write operations.
Let’s suppose we are storing some entity x in the datastore. Let’s assume entities have an initial value of zero. There are a series of writes to x by clients:
W(x=3), W(x=7), W(x=5)
Because the system is eventually consistent, if writes to x stop at some point, we know we will eventually read 5 — that is, R(x==5). However in the short term a client might for example see:
R(x==7), R(x==0), R(x==5), R(x==3)
(Note more nodes than 2 slaves are needed for this example behavior.)
So this is our weakest form of consistency - eventually consistent with out of order reads in the short term.
We can make this stronger. Consider the SourceForge mongodb configuration (larger diagram here). This configuration is eventually consistent, but we will not see the result of writes out of order. It provides monotonic read consistency.
One possible eventual consistency property is read-your-own-writes consistency, meaning a process is guaranteed to see the writes it has made when it does reads. This is a very useful property that makes programming easier. Note that neither of the above examples provide read-your-own-writes consistency. Also worth considering with this model is the definition of “your”. On a web application, that might be the user. If the system’s load balancer sends requests to different app servers, having read-your-own-write consistency for a single app server might not solve the real world consistency need.
EC Use Case Checklist
Thus when using eventual consistency, it is good for the architect to ask:
- can my use case tolerate stale reads?
- can it tolerate reading values out of order? if not, is my configuration monotonic read consistent?
- can it tolerate not reading my own writes? if not, is my configuration read-your-own-write consistent?