[Bernstein09] 9.3. Synchronizing Updates to Replicated Data

Stefen 2010-06-24

展开全文

9.3. Synchronizing Updates to Replicated Data

One-Copy Serializability

On possible goal of replication is to have replicas behave functionally like nonreplicated servers. This goal can be stated precisely by the concept of one-copy serializability, which extends the concept of serializability to a system where multiple replicas are present. An execution is one-copy serializable if it has the same effect as a serial execution on a one-copy database. We would like a system to ensure that its executions are one-copy serializable. In such a system, the user is unaware that data is replicated.

In a system that produces serializable executions, what can go wrong that would cause it to violate one-copy serializability? The answer is simple, though perhaps not obvious: a transaction might read a copy of a data item, say x, that was not written by the last transaction that wrote other copies of x. For example, consider a system that has two copies of x, stored at locations A and B, denoted x_A and x_B. Suppose we express execution histories using the notation of Section 6.1, where r, w, and c represent read, write, and commit operations, respectively, and subscripts are transaction identifiers. Consider the following execution history:

H = r₁[x_A] w₁[x_A] w₁[x_B] c₁ r₂[x_B] w₂[x_B] c₂ r₃[x_A] w₃[x_A] w₃[x_B]c₁

This is a serial execution. Each transaction reads just one copy of x; since the copies are supposed to be identical, any copy will do. The only difficulty with it is that transaction T₂ did not write into copy x_A. This might have happened because copy x_A was unavailable when T₂ executed. Rather than delaying the execution of T₂ until after x_A recovered, the system allowed T₂ to finish and commit. Since we see r₃[x_A] executed after c₂, apparently x_A recovered before T₃ started. However, r₃[x_A] read a stale value of x_A, the one written by T₁, not T₂.

When x_A recovered, it should have been refreshed with the newly updated value of x that is stored in x_B. However, we do not see a write operation into x_A after T₂ committed and before r₃[x_A] executed. We therefore conclude that when r₃[x_A] executed, x_A still had the value that T₁ wrote.

Clearly, the behavior of H is not what we would expect in a one-copy database. In a one-copy database, T₃ would read the value of x written by T₂, not T₁. There is no other serial execution of T₁, T₂, and T₃ that has the same effect as H. Therefore, H does not have the same effect as any serial execution on a one-copy database. Thus, it is not one-copy serializable.

One obvious implication of one-copy serializability is that each transaction that writes into a data item x should write into all copies of x. However, when replication is used for improved availability, this isn’t always possible. The whole point is to be able to continue to operate even when some copies are unavailable. Therefore, the not-so-obvious implication of one-copy serializability is that each transaction that reads a data item x must read a copy of x that was written by the most recent transaction before it that wrote into any copy of x. This sometimes requires careful synchronization.

Still, during normal operation, each transaction’s updates should be applied to all replicas. There are two ways to arrange this: replicate update operations or replicate requests. In the first case, each request causes one transaction to execute. That transaction generates update operations, each of which is applied to all replicas. In the second case, the request message is sent to all replicas and causes a separate transaction to execute at each replica. We discuss each case, in turn.

Replicating Updates

There are two approaches to sending a transaction’s updates to replicas: synchronous and asynchronous. In the synchronous approach, when a transaction updates a data item, say x, the update is sent to all replicas of x. These updates of the replicas execute within the context of the transaction. This is called synchronous because all replicas are, in effect, updated at the same time (see Figure 9.4a ). Although sometimes this is feasible, often it is not, because it produces a heavy distributed transaction load. In particular, it implies that all transactions that update replicated data have to use two-phase commit, which entails significant communications cost.

Figure 9.4. Synchronous vs. Asynchronous Replication. In synchronous replication, each transaction updates all copies at the same time. In asynchronous replication, a transaction updates only one replica immediately. Its updates are propagated to the other replicas later.

Fortunately, looser synchronization can be used, which allows replicas to be updated independently. This is called asynchronous replication, where a transaction directly updates one replica and the update is propagated to other replicas later (see Figure 9.4b).

Asynchronous updates from different transactions can conflict. If they are applied to replicas in arbitrary orders, then the replicas will not be identical. For example, suppose transactions T₁ and T₂ update x, which has copies x_A and x_B. If T₁ updates x_A before T₂, but T₁ updates x_B after T₂, then x_A and x_B end up with different values. The usual way to avoid this problem is to ensure that the updates are applied in the same order to all replicas. By executing updates in the same order, all replicas go through the same sequence of states. Thus, each query (i.e., read-only transaction) at any replica sees a state that could have been seen at any other replica. And if new updates were shut off and all in-flight updates were applied to all replicas, the replicas would be identical. Therefore, users working with one replica see the same behavior that they would see with any other replica. In this sense, all replicas behave exactly the same way.

Applying updates in the same order to all replicas requires some synchronization. This synchronization can degrade performance, because some operations are delayed until other operations have time to complete. Much of the complexity in replication comes from clever synchronization techniques that minimize this performance degradation.

Whether synchronous or asynchronous replication is used, applying updates to all replicas is sometimes impossible, because some replicas are down. The system could stop accepting updates when this happens, but this is rarely acceptable since it decreases availability. If some replicas do continue processing updates while other replicas are down, then when the down replicas recover, some additional work is needed to recover the failed replicas to a satisfactory state. Some of the complexity in replication comes from ways of coping with unavailable servers and handling their recovery.

Replicas can be down either because a system has failed or because communication has failed (see Figure 9.5 ). The latter is more dangerous, because it may lead to two or more independently functioning partitions of the network, each of which allows updates to the replicas it knows about. If a resource has replicas in both partitions, those replicas can be independently updated. When the partitions are reunited, they may discover they have processed incompatible updates. For example, they might both have sold the last item from inventory. Such executions are not one-copy serializable, since it could not be the result of a serial execution on a one-copy database. There are two solutions to this problem. One is to ensure that if a partition occurs, only one partition is allowed to process updates. The other is to allow multiple partitions to process updates and to reconcile the inconsistencies after the partitions are reunited—something that often requires human intervention.

Figure 9.5. Node and Communications Failures. Replica 1, Replica 2, and Replica 3 are connected by a network. In (a), Replica 3 fails. In (b), the connection to Replica 3 fails. Replica 1 and Replica 2 cannot distinguish these two situations, yet the system’s behavior is quite different.

Circumventing these performance and availability problems usually involves compromises. To configure a system with replicated servers, one must understand the behavior of the algorithms used for update propagation and synchronization. These algorithms are the main subject of this chapter.

Replicating Requests

An alternative to sending updates to all replicas is to send the requests to run the original transactions to all replicas (see Figure 9.6 ). To ensure that all the replicas end up as exact copies of each other, the transactions should execute in the same order at all replicas. Depending on the approach selected, this is either slow or tricky. A slow approach is to run the requests serially at one replica and then force the requests to run in the same order as the other replicas. This ensures they run in the same order at all replicas, but it allows no concurrency at each replica and therefore would be an inefficient use of each replica’s resources.

Figure 9.6. Replicating Requests. Each transaction runs independently against each replica. In both cases, conflicting updates must be applied in the same order against all replicas.

The trickier approach is to allow concurrency within each replica and use some fancy synchronization across replicas to ensure that timing differences at the different replicas don’t lead to different execution orders at different replicas. For example, in HP’s Reliable Transaction Router (RTR), a replicated request can be executed at two or more replicas concurrently as a single distributed transaction. Since it runs as a transaction, it is serialized with respect to other replicated requests (which also run as transactions). It therefore can execute concurrently with other requests. Transaction synchronization (e.g., locking) ensures that the requests are processed in the same order at all replicas. As usual, transaction termination is synchronized using two-phase commit. However, unlike ordinary two-phase commit, if one of the replicas fails while a transaction is being committed, the other continues running and commits the transaction. This is useful in certain applications, such as securities trading (e.g., stock markets), where the legal definition of fairness dictates that transactions must execute in the order they were submitted, so it is undesirable to abort a transaction due to the failure of a replica.

Replicating updates is a more popular approach than replicating requests, by far. Therefore, we will focus on that approach for the rest of this chapter.