Xdbm fast modifyDn proposal

Introduction

We will introduce some alterations to the Xdbm index structure to turn expensive, non-atomic and error prone 0(n) modifyDn operations into cheap constant time atomic operations.

Background

Xdbm is a generalized partition design based on Table and Index data structures. These can be backed by any kind of key value based data structure in memory or on disk with support for key sorting: b+trees on disk and AVL trees in memory are often used. Some background on the organization before this change has/was made might be needed. Please consult the following page for more info: Xdbm Partition Design.

The Problem

This optimization dramatically improves the time taken to rename, move or rename and move a subtree of the DIT when performing a modifyDn operation. Currently this operation is O(n) where n is the number of descendants under the node being altered. Besides requiring n updates to the NDN and UPDN tables for each descendant, the entry for each descendant must be retrieved from the master table in order to change the DN stored with it. The massive number of master table accesses churns the cache erasing cache memory and resulting in a lot of object creation with subsequent garbage collection, deserialization, and IO to disk. This has a massive negative impact on any Java based LDAP server's ability to perform. In all likelihood a modifyDn operation on a parent with a large number of descendants will fail before completion, most likely with an OutOfMemoryException.

Besides the performance issues associated with this problem, there are several other issues associated with LDAP operation semantics namely in the areas of atomicity, consistency and isolation. If the number of descendants are huge, this operation could take a long time to complete. The probability of search requests issued by other clients during this period increases with n.

At any point in time before the modifyDn operation completes the DIT will be inconsistent. Entire branches under the modified entry will disappear as their parents or ancestors are being modified before they are. ModifyDN operations in progress and in the scope of a concurrent search operation will produce an inconsistent view of the DIT. Not only does this present a consistency issue but there is a greater exposure to a lack of isolation.

A modifyDn operation must alter all descendants to properly complete and be consistent. If at any point a failure causes this operation to halt the DIT is left inconsistent. There is little that can be done to rollback the change at this point in our design (this should change in the future). If however the change was to a single entry, then any failure during the course of the operation would effectively make the operation follow LDAP's atomicity rules. The operation will have failed and the change would not hold.

The Solution

We propose removing the DN from the entries stored in the master table and doing away with both dn indices: the normalized ndn index and the user provided updn index. In place of these a new rdn index is added. This index will store the rdn of an entry mapped to the ID in both directions as all indices do.

The DN of an entry can be constructed using this rdn index in concert with the one level hierarchy index. The rdn stored is a serialized form of Rdn objects which contains both the normalized rdn and the user provided rdn. When a candidate needs it's DN constructed the candidate's ID is used to extract an Rdn, then the process is repeated with the parent which is discovered via the one level index. Eventually all the Rdn values in the entire DN are retrieved and the LdapDN is constructed.

This optimization now allows entry DN modifications to be constant time, requiring only a single change on the rdn index. The operation is constant time and atomic with full isolation.

Both the rdn and the one level indices should have ample amounts of cache enabled since they will be aggressively accessed in this new design.

Let's take a look at an example DIT structure and the contents of the indices. Here's a quick and dirty picture of what the new organization looks like:

Here you see the rdn and one level indices as well as the master table which stores the entries. An example DIT has been drawn to the right of the image with annotations showing the ID of the entry. The Rdn values are shown in the rdn index mapped to the integer IDs of the entries. The one level index has all the parent to child mappings for this example directory structure.

DN Resolution

DN resolution from entry ID values takes n-1 reverse lookups into the rdn index and n-1 reverse lookups into the one level index where n is the depth of the entry. An example resolution is performed on entry 5, uid=ayyagari,ou=Users,ou=system. Here are the steps taken:

reverse lookup on rdn index with ID=5 to get uid=ayyagari
reverse lookup on one level index with ID=5 to get parent ID 3
reverse lookup on rdn index with ID=3 to get ou=Users
reverse lookup on one level index with ID=3 to get parent ID 1
ID 1 is special and tracked for the context entry of the partition so a reverse lookup is not required to get the suffix ou=system

During this process the LdapDN for ID 5 entry is assembled.

ModifyDN is Fast

A modifyDn operation in this case is a constant time operation reduced to a simple update on the rdn table. For example if we modify ou=Users,ou=system to ou=People,ou=system, then this operation will update the index entry for (ou=Users, 3) to now be (ou=People, 3). The change is atomic and occurs rapidly without making changes to the children or requiring access to the master table.

ID Resolution

Resolving the ID of an entry requires lookups on ...

Pros and Cons

Slower Search?

The problem with this approach is that it will slow down search to a degree with more accesses to the rdn and one level index. However most directories are flat or less deep then they were years ago. Because of this reason the average depth of an entry can be expected to be small. Also an optimization is possible with search since it provides a base. The base can be used as a starter to construct the DN of the entries returned by search. This way the lookup process to construct the DN starts with the rdn of the returning search result entry and stops with the base. If the base has a depth of b then n-b-1 lookups are performed on both the one level and the rdn index.

Master is not enough

All the information needed to reconstruct the database is not present in the master table. The rdn index and one level index are needed.

DN duplicates information and costs more to extract

DN lookups are more costly since more data is serialized back and forth. Plus there's a great amount of redundant information stored in the dn indices. If there are 1 million people under the ou=Users,ou=system entry then 1 million copies of ou=User,ou=system are started in the DN indices. With the new configuration there is no duplication, and the size of the data is minimized with just the Rdn.

A related problem are attributes that store a distinguished names, for example the groupOfName's member attribute or the modifiersName attribute. When modifying the DN of an entry we also need to update all attributes of all entries that have a reference to the modified entry. A solution would to not store the DN of the referenced entry, but its ID or entryUUID.

Child pages