Principles

We have to build a multi-master replication system. That means the if we have N servers, the total number of connections between all those servers will be equals to : N x ( N - 1 ) / 2. If we have 4 servers, we will have 6 possibles connections. If we have 10 servers, we will have 45 possibles relationsetc. For 100 servers, this is 4500 possibles relations !!! Even if this is a polynomial growth, the reality s totally different. We won't allow such a scheme, which is totally out of control. In real world, having only two servers seems quite common, for small to large organization. If more servers are needed - for scalibility of to build a failover system, then we N to N communication network will not be implemented. We will generally select one or two "master" servers which will communicate with a few other servers, like in a chain.

Connection between servers

Between 2 servers

We have two cases : the servers are connected or they are disconnected :

or

This is the simplest possible case, put aside a single server !

We will have to deal with two disconnected servers, and especially with retries when they get connected again.

The main considerations will be to replicate the data which are valid, as some of them may have been updated on both servers.

Another scenario is when an entry is modified on both server and are replicated. Here is a schema which describes the different cases :

At T1, the entry is sent to the other server, and if it's the very same entry, we will have to do something :

If both entries attributes are equals, just send an ack to the other server will be enough.
If at least one attribute is different, we will have to check the CSN of each entry and select the oldest.

Between 3 servers

We will have4 differents cases :

All servers are connected together
One server is disconnected with one other
One server is totally disconnected
All servers are disconnected

Case 1 : All servers are connected

In this scenario, the main problem

Inclusion into the Interceptors chain

Every modification done are sent to all connected replicas. This is done when the modifications are validated locally, and only if everything went right.

Interceptor chain description

The following schema shows how the interceptor chain is working.

Here, a request is injected in the topmost interceptor, which will handle it and if needed, do an associated action, before passing the request to the next interecptor, and so on until it reach the nexus, where this object is managed.

If we have an exception or an error condition in ant interceptor, then we go back to the chain either by throwing and exception, or simply by returning.

An interceptor can be a pass-through for an incoming request (doing nothing with it)

Existing interceptors

We have more than one chain of interceptors, to cover all the different operations

Using an interceptor for Mitosis

The idea is to use an interceptor when a request (add, mod, del, ...) is sent to the server. It should only send a message to other replicas only if everything went fine for all other interceptor. This will then be a passthrough interceptor for the incoming request, which will do all the work at the end.

Potential problems

The main problem is that we should guarantee an ACID transaction , which means that we must set a lock on the object until it has been managed by the nexus.

Add Entry operation

Suppose that we add an entry E_i in the serveur S_i at time T_i, and that we have to propagate this change to server S_k.

This operation will have to deal with those followong cases :

The server S_k is not responding
The entry E_i already exist in serveur S_k
The entry E_i does not exist in serveur S_k
The entry E_i has already been created on another server S_n at T_x where T_x < T_i
The entry E_i has been created on another server S_n at T_x where T_x > T_i

The third case is the most general case.

Case 1

If the remote server is not responding, we should store the entry somwhere, and mark it as "pending for server K". We will have to build a mechanism to resend the entry when the server is up again.

We also have to take in consideration the fact that we are not dispatching the entry to a single remote server, but to a set of servers, and that more than one server can be unresponsive.

The best approach here would be to keep track of all the remaining remote server within the entry : let's assume that for en entry E_i, we have to update N servers. We will store a list of remote server associated to this entry, and when a server send an Ack, then we will remove its name from the list. When the list is empty, we can consider that the entry has been replicated on all servers.

If one or more server cannot be contacted, this list will keep the non responsive server's name, and a thread will try regularly to send the data to those servers, until either they respond, or we reach a timeout. In this last case, we will have to mark the server as unavailable, and inform the administrator. When this server will be reconnected to the grid of Ldap servers, it will have to be synchronized with any server of the grid.

Child pages

Mitosis Development Guide