On Distributed Consistency - Part 3 - Network Partitions

Apr 8


It’s fascinating that the formal theorem statement for CAP, in the first proof (that I know of), doesn’t use the word partition!

Theorem 1 It is impossible in the asynchronous network model to implement a read/write data object that guarantees the following properties:
• Availability
• Atomic consistency in all fair executions (including those in which messages are lost).

That said, let’s talk about partitions, since “messages lost…in the asynchronous network model” is directly analogous to a network partition.

Let’s look at an example:

In our diagram above, the network is partitioned.  The left and right halves (perhaps corresponding, say, to two continents) cannot communicate at all.  Four clients and four data server nodes are shown in the diagram.  So what are our options?

  1. Deny all writes.  If we deny all writes when the network is partitioned, we can still read fully consistent data on both sides.  So this is one option.  We give up write availability, and keep consistency.
  2. Allow writes on one side.  Via some sort of consensus mechanism, we could let one side of the partition “win” and have a master (as shown by the “M” in the diagram).  In this case, reads and writes could occur on that side.  On the other non-master partitions, we could either (a) be strict and allow no operations, or (b) allow eventually consistent reads, but no writes.  So in this situation we have full consistency in one partition, and partial operation in all others.
  3. Allow reads and writes in all partitions.  Here, we keep availability, but we must sacrifice strong consistency.  One partition will not see the operations and state from the other until the network is restored.  Once restored, we will need a method to merge operations that occurred while disconnected.
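The three options above can be sketched as a per-node policy.  This is a minimal sketch, not from the post itself; `Policy`, `allowed_ops`, and the operation names are all illustrative, and we assume each node knows whether it is on the master’s side of the partition:

```python
from enum import Enum

class Policy(Enum):
    DENY_ALL_WRITES = 1    # option 1: consistent reads everywhere, no writes
    MASTER_SIDE_ONLY = 2   # option 2: full operation only on the master's side
    ALWAYS_AVAILABLE = 3   # option 3: reads and writes everywhere, merge later

def allowed_ops(policy, on_master_side):
    """Return the operations a node may serve during a partition."""
    if policy is Policy.DENY_ALL_WRITES:
        return {"read"}  # keep consistency, give up write availability
    if policy is Policy.MASTER_SIDE_ONLY:
        # variant (b): non-master sides serve eventually consistent reads only
        return {"read", "write"} if on_master_side else {"stale-read"}
    return {"read", "write"}  # keep availability, give up strong consistency
```

For example, `allowed_ops(Policy.MASTER_SIDE_ONLY, False)` captures variant (b) of option 2: the non-master side can still answer reads, but with no freshness guarantee.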

A mitigation technique also comes to mind.  Suppose a particular client C has a much higher probability of needing an entity X than other clients.  If we store the master copy of X on a server close to C, we increase the probability that C can read and write X in option (2) above.  Let’s call this “intelligent homing”.  A real world example of this would be to “store master copies of data for east coast users on servers on the east coast”.  Intelligent homing doesn’t solve our problems, but would likely significantly decrease their frequency — that’s good, we just want more nines anyway.
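One simple way to implement intelligent homing is to place the master copy of an entity in whichever region accesses it most often.  A hedged sketch, where the access log format and the `choose_home` helper are assumptions for illustration:

```python
from collections import Counter

def choose_home(access_log, entity):
    """Home an entity's master copy in the region that touches it most.

    access_log is a hypothetical list of (client_region, entity) pairs.
    """
    counts = Counter(region for region, e in access_log if e == entity)
    return counts.most_common(1)[0][0] if counts else None

log = [("east", "X"), ("east", "X"), ("west", "X"), ("east", "Y")]
# "X" is touched mostly by east-coast clients, so we home it there.
```

In practice the placement decision would be periodic and weighted (writes matter more than reads, and migrations have a cost), but the idea is the same: bias master placement toward the clients most likely to need the data during a partition.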

Hopefully the above is a good informal “proof” of CAP.  It really is pretty simple.

Trivial Network Partitions

Many common network partitions are what we might term trivial.  Let’s consider this from the perspective of option (2) above.  We define a trivial network partition as one in which, on all non-master partitions, there are either

  • no live clients at all, or
  • no servers at all
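The definition above reduces to a small predicate.  A sketch, assuming we can enumerate the live clients and servers on each non-master side (the `(clients, servers)` pair representation is an assumption of this sketch):

```python
def is_trivial(non_master_sides):
    """A partition is trivial if every non-master side has no live
    clients or no servers, so no one can observe an inconsistency there.

    non_master_sides: list of (clients, servers) set pairs, one per side.
    """
    return all(len(clients) == 0 or len(servers) == 0
               for clients, servers in non_master_sides)
```

For instance, a dark data center is a side with servers but no live clients, and a client cut off from everything is a side with clients but no servers; both make the partition trivial by this test.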

For example, if we have many data centers and our clients are Internet web browsers, and one of our data centers goes completely dark (and we have more left), that is a trivial network partition (we assume here that we can fail over master status in such a situation).  Likewise, losing a single rack in its entirety is often a trivial network partition.

In these situations, we can still be consistent and available.  (Well, for the partitioned client, we are unavailable, but that is of course a certainty if it cannot reach any servers anywhere.)