Work in progress

This page is in the process of being reviewed and updated.

Introduction

The idea is simple yet extremely powerful. Enable an LDAP server to version itself by tracking changes in a change log. Allow tagging of specific revisions to provide a mechanism for taking snapshots of the server state to potentially be reverted. Thanks to the design of ApacheDS this is something that is not incredibly difficult to implement.

Use Cases

There are several ways in which this feature can be leveraged by our users:

  • [OT] Interesting impact on replication implementation ideas

  • Record a series of changes and play them back.
  • Revert test servers after using them without a costly reinstall and restart
  • To audit changes to answer simple questions:
    • who made the change
    • what was changed (entries, attributes, values)
    • when was the change made
    • how was the change made (add, delete, modify, moddn, modrdn)
  • To inquire about the change history at several levels
    • entire server
    • by a user
    • on a region of the tree
    • by dates
    • by change types
    • per entry
    • per attribute
  • Quickly assess change patterns in the DIT
  • Tagging revisions for specific states
  • Rolling back/forward server state
  • Requesting attributes and entries by revision
  • Searching the server for entries by revision

These usage scenarios arise in many settings, but nothing characterizes the need better than managing environment configurations within an infrastructure. The configuration of several LDAP-enabled applications may need to be rolled back to an earlier state, or auditors may need thorough reports on who changed what, and when.

What about LDAP transactions?

Some people may ask: why do I need this versioning and snapshotting feature if I can get LDAP transactions? Transactions are great, but they don't stop mistakes; they just make sure the mistakes occur transactionally. Users can still delete or change things in the server incorrectly, and once a transaction has been committed it cannot be rolled back when the error is finally noticed. So no, transactions don't save you in this case.

History

A couple years ago I came across the idea of versioning changes inside ApacheDS. At first the idea seemed amazing yet insurmountable to implement with pre-1.0 versions of the server. The best we could do at the time was to test and demo the idea with a simple change log interceptor example in a presentation: Embedding ApacheDS. Ersin and I discussed the feature and Ersin took a crack at implementing the log as a text based LDIF log file.

Just recently the idea came up again while Emmanuel and I were discussing how to speed up our integration tests. They were taking a lot of time because for each test we were effectively installing the server, starting it, stopping it and removing it. Instead, Emmanuel recommended cleaning the server without restarts to bring it back to its original state. Reverting the state using a change log would help fix this problem while also being useful as a feature in several other situations. See the log of our IRC conversation that kicked off this effort.

Protocol Considerations

As far as the protocol is concerned we have some advantages and good synergy.

revision Operational Attribute

We can use operational attributes to track the revision numbers of changes that took place on an entry. This allows us to use the directory itself to request version specific information. Let's for now presume we created a special revision operational attribute that can only be modified by the directory. The attribute would be multivalued and would be of the INTEGER syntax. Presume revisions are just numbers representing states of the entire directory tree like in Subversion.

With the proper matchingRules we can ask the server about which entries were changed, present, or deleted using filters on this attribute as well as others.
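As a rough illustration of the idea above, here is a minimal in-memory sketch in Python. It assumes a single directory-wide revision counter (as in Subversion) and a multivalued integer `revisions` list on each entry that only the directory itself updates. The `Directory` and `Entry` classes and the `changed_in` method are hypothetical names used for illustration; they are not ApacheDS APIs.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Entry:
    dn: str
    attributes: Dict[str, List[str]]
    # Hypothetical operational attribute: multivalued, integer syntax,
    # maintained by the directory and never by clients.
    revisions: List[int] = field(default_factory=list)


class Directory:
    def __init__(self) -> None:
        self.revision = 0  # one counter for the whole tree
        self.entries: Dict[str, Entry] = {}

    def add(self, dn: str, attributes: Dict[str, List[str]]) -> int:
        self.revision += 1
        self.entries[dn] = Entry(dn, dict(attributes), [self.revision])
        return self.revision

    def modify(self, dn: str, changes: Dict[str, List[str]]) -> int:
        entry = self.entries[dn]  # raises KeyError for unknown entries
        self.revision += 1
        entry.attributes.update(changes)
        entry.revisions.append(self.revision)
        return self.revision

    def changed_in(self, low: int, high: int) -> List[str]:
        """DNs of entries touched by any revision in [low, high]."""
        return sorted(
            dn
            for dn, entry in self.entries.items()
            if any(low <= rev <= high for rev in entry.revisions)
        )


directory = Directory()
directory.add("cn=app1,ou=config", {"cn": ["app1"]})
directory.add("cn=app2,ou=config", {"cn": ["app2"]})
directory.modify("cn=app1,ou=config", {"description": ["updated"]})
print(directory.changed_in(2, 3))
```

In a real server the `changed_in` lookup would be expressed as a search filter over the revision attribute, which is where the matching rules come in.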

Using LDAP Revision Tags

We must qualify what we mean by a tag, since LDAP has its own notion of tags and the concept of a tag also exists in versioning with respect to snapshots. These two concepts are not related, so when discussing tags on our mailing list or in documentation, please qualify them as either LDAP Revision Tag or Changelog Snapshot Tag.

LDAP has an interesting attribute tagging feature. It allows one to ask for a value of an attribute in a specific context. For example, language tags are used to request different values of an attribute. I can ask the server to return only the English version of an attribute using the following identifier: commonName;lang-en. Emmanuel can ask for the French version: commonName;lang-fr. Our Stefans can ask for the German version: commonName;lang-de.

The LDAP tagging mechanism is generalized but this language tag is a specific usage example. We could add an LDAP tag for revisions to be used with attribute identifiers like so:

commonName;rev-23
commonName;rev-9374
commonName;rev-231

This way, using the standard protocol, users can request the values of an attribute associated with a specific revision and the server should comply. Calculating the value would be implementation specific; however, it can still be made extremely performant.
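To make the request side concrete, the following Python sketch splits an attribute description into its type and options and resolves the value as of a requested revision. The `rev-` option is the one proposed on this page, not a standardized LDAP option, and the per-attribute history list of `(revision, values)` pairs is only an assumption about how a server might store past values.

```python
import bisect
from typing import List, Optional, Tuple

History = List[Tuple[int, List[str]]]  # (revision, values), sorted by revision


def parse_attribute_description(description: str) -> Tuple[str, Optional[int], List[str]]:
    """Split 'type;option;...' and pull out the proposed rev-N option."""
    attr_type, *options = description.split(";")
    revision = None
    other_options = []
    for option in options:
        if option.lower().startswith("rev-"):
            revision = int(option[4:])
        else:
            other_options.append(option)
    return attr_type, revision, other_options


def values_at(history: History, revision: int) -> Optional[List[str]]:
    """Values written by the latest change at or before `revision`."""
    revisions = [rev for rev, _ in history]
    index = bisect.bisect_right(revisions, revision)
    return None if index == 0 else history[index - 1][1]


# Hypothetical history of commonName on one entry.
common_name_history = [(5, ["Alex"]), (20, ["Alexander"]), (40, ["Al"])]

attr_type, revision, _ = parse_attribute_description("commonName;rev-23")
print(attr_type, revision, values_at(common_name_history, revision))
```

The binary search keeps the lookup cheap even when an attribute has changed many times, which is one way the value calculation could stay fast.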

LDAP Request/Session Controls

We could also ask for entire entries to be returned on search, or for attribute comparisons to be made, based on some revision, date, or range of revisions. Controls can allow us to affect the behavior of the LDAP operation to incorporate versioning semantics. It would be interesting to explore the possibilities with each operation.

LDAP Extended Operations

Taking snapshots is easy, but what about reverting the server back to its original state? Tools might be able to do this if they can access the changelog (where would it be?), especially with reverse LDIFs to revert changes. However, issues arise, such as managing operational attributes like modifiersName. When applying a change log, the tool cannot easily and securely apply it with the credentials of each user who made a modification, which is what would be needed for the server's operational attribute housekeeping to do its thing. You could get around this if the server supported the LDAP Proxied Authorization Control. Even with the tools, access to the logs, and the proxied authorization control available, the tool could still fail while applying the changes, or the server could go down, leaving the server's state inconsistent with respect to your application data. If transactions actually make it into the specifications, we could solve this problem with tooling too.

The best bet is to create an extended operation to roll back the server state to a specific revision. If change log information is not available in the server (say the server sends it to some remote log), then the anti-changes can optionally be supplied within the operation's payload. Regardless of how the changes to revert are made available, the server can wait until all current operations complete, then reject new requests with a busy response code and start applying the changes in a local transaction, with the ability to roll back if the revert fails. Furthermore, because the revert runs entirely inside the server, it does not depend on the client connection staying up, so a network failure cannot leave a partial rollback behind.
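The revert logic described above can be sketched in Python as follows. This is a toy in-memory model, not the ApacheDS implementation: the `VersionedStore` class, the dictionary-based change records and the truncating `revert_to` are all assumptions made for illustration. Each change stores its anti-change, and a revert applies the anti-changes to a working copy that replaces the live state only if every step succeeds.

```python
import copy


def apply_change(state, change):
    """Apply one change record to a {dn: {attr: values}} state."""
    op, dn = change["op"], change["dn"]
    if op == "add":
        if dn in state:
            raise KeyError("entry already exists: " + dn)
        state[dn] = copy.deepcopy(change["attrs"])
    elif op == "delete":
        del state[dn]
    elif op == "modify":
        entry = state[dn]
        for name, values in change["attrs"].items():
            if values is None:  # None means "attribute absent"
                entry.pop(name, None)
            else:
                entry[name] = copy.deepcopy(values)
    else:
        raise ValueError("unsupported operation: " + op)


def reverse_of(change, before):
    """Build the anti-change from the entry state before the change."""
    op, dn = change["op"], change["dn"]
    if op == "add":
        return {"op": "delete", "dn": dn}
    if op == "delete":
        return {"op": "add", "dn": dn, "attrs": before}
    # modify: restore the previous values of every touched attribute
    previous = {name: before.get(name) for name in change["attrs"]}
    return {"op": "modify", "dn": dn, "attrs": previous}


class VersionedStore:
    def __init__(self):
        self.state = {}
        self.reverse_log = []  # reverse_log[i] undoes revision i + 1

    @property
    def revision(self):
        return len(self.reverse_log)

    def change(self, change):
        before = copy.deepcopy(self.state.get(change["dn"]))
        apply_change(self.state, change)  # raises before anything is logged
        self.reverse_log.append(reverse_of(change, before))
        return self.revision

    def revert_to(self, revision):
        # Work on a copy so a failure leaves the live state untouched.
        working = copy.deepcopy(self.state)
        for anti_change in reversed(self.reverse_log[revision:]):
            apply_change(working, anti_change)
        self.state = working
        del self.reverse_log[revision:]


store = VersionedStore()
store.change({"op": "add", "dn": "cn=app1", "attrs": {"cn": ["app1"]}})
store.change({"op": "modify", "dn": "cn=app1", "attrs": {"port": ["389"]}})
store.change({"op": "delete", "dn": "cn=app1"})
store.revert_to(1)
print(store.revision, store.state)
```

Note that this sketch discards the reverted revisions, so it cannot roll forward again; a fuller design might record the revert itself as new revisions instead.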

Exposing the ChangeLog via LDAP

Change events and the log can be exposed via LDAP to enable sophisticated searches. Furthermore doing this would enable tooling support to better manage server snapshotting and change reversion.

This can easily be implemented in ApacheDS as a custom partition exposed by the ChangeLog service to view the log information. Of course implementation and backing stores will dictate the quality of the search experience but this is up to implementations to remedy.
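As a sketch of the kind of searches such a partition could answer, the Python below filters change events by who, how, where and when. The `ChangeEvent` fields and the `search_log` function are hypothetical; an actual partition would map these criteria onto LDAP search filters, and the subtree test here is a naive string comparison rather than proper DN normalization.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass(frozen=True)
class ChangeEvent:
    revision: int
    principal: str      # who made the change
    timestamp: datetime  # when it was made
    operation: str      # add, delete, modify, moddn, modrdn
    dn: str             # what was changed


def in_subtree(dn: str, base_dn: str) -> bool:
    # Naive check; real DN comparison needs schema-aware normalization.
    dn, base_dn = dn.lower(), base_dn.lower()
    return dn == base_dn or dn.endswith("," + base_dn)


def search_log(
    log: List[ChangeEvent],
    principal: Optional[str] = None,
    operation: Optional[str] = None,
    base_dn: Optional[str] = None,
    since: Optional[datetime] = None,
    until: Optional[datetime] = None,
) -> List[ChangeEvent]:
    """Return events matching every criterion that was supplied."""
    matches = []
    for event in log:
        if principal is not None and event.principal != principal:
            continue
        if operation is not None and event.operation != operation:
            continue
        if base_dn is not None and not in_subtree(event.dn, base_dn):
            continue
        if since is not None and event.timestamp < since:
            continue
        if until is not None and event.timestamp > until:
            continue
        matches.append(event)
    return matches


log = [
    ChangeEvent(1, "uid=admin,ou=system", datetime(2007, 10, 1, 9, 0),
                "add", "cn=app1,ou=config,dc=example,dc=com"),
    ChangeEvent(2, "uid=alice,ou=users,dc=example,dc=com", datetime(2007, 10, 2, 14, 30),
                "modify", "cn=app1,ou=config,dc=example,dc=com"),
    ChangeEvent(3, "uid=admin,ou=system", datetime(2007, 10, 3, 8, 15),
                "delete", "cn=bob,ou=people,dc=example,dc=com"),
]
for event in search_log(log, base_dn="ou=config,dc=example,dc=com"):
    print(event.revision, event.principal, event.operation)
```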

Design Proposal

Every change in an LDAP server can be logged and tracked using an LDIF. A special log of LDIF entries can track each new revision of the server. This can be used to audit changes on the DIT, on subtrees of the DIT, their entries and even on attributes. Each entry can track revisions in the change log using

More interesting however is the ability

We don't have any way to define snapshots in the server. This is a very interesting functionality we may want to have. What will it be good for?

  • being able to log all the requests for debug purpose or for post-mortem analysis
  • create a journal we can replay from a specific point if the server has crashed and we want to restore the backend to a valid state
  • being able to rollback the backend up to a previous state (like SVN/CVS does)
  • use the journal to update other instances remotely and asynchronously
  • many more usages I don't have off the top of my head right now

To implement all of this functionality, we need some basic building blocks:

  • a new interceptor to store all the requests
  • some extended operations to set a tag, to roll back to a tag, or to discard a tag
  • a definition of the exact semantics of each new functionality
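The interceptor building block can be pictured with the short Python sketch below. It only illustrates the pattern of wrapping the next handler in a chain and recording each request together with its outcome; the class name, the dictionary-shaped requests and the `backend` function are hypothetical and do not reflect the ApacheDS interceptor API.

```python
class RequestLogInterceptor:
    """Records every request and its outcome, then defers to the next handler."""

    def __init__(self, next_handler, log):
        self.next_handler = next_handler
        self.log = log

    def __call__(self, request):
        try:
            result = self.next_handler(request)
        except Exception as error:
            # Failed requests are kept too, which helps post-mortem analysis.
            self.log.append({"request": request, "outcome": "error: " + str(error)})
            raise
        self.log.append({"request": request, "outcome": "success"})
        return result


def backend(request):
    # Stand-in for the rest of the chain and the backing store.
    if request["op"] not in ("add", "delete", "modify"):
        raise ValueError("unsupported operation " + request["op"])
    return "done"


request_log = []
chain = RequestLogInterceptor(backend, request_log)
chain({"op": "add", "dn": "cn=app1,ou=config"})
try:
    chain({"op": "rename", "dn": "cn=app1,ou=config"})
except ValueError:
    pass
for record in request_log:
    print(record["request"]["op"], record["outcome"])
```

A change log used for reverting state would replay only the successful records, while a debug log would keep everything.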

Simple Logger

The first very basic functionality is to implement a basic logger. It has already been added, but some more elements need to be defined, like the way to start the logger, either via an LDAP extended request, or programmatically.

There is an existing page started by Ersin where you have a description of the existing ChangeLog interceptor:
Logging Subsystem

Semantics

We need to be able to start the logger, to stop it, to configure it, to select the requests, to select the users, etc. Here is a list of parameters we may want to set:

| Parameter | Description | Mandatory |
|---|---|---|
| Logger name | The logger's name | Yes |
| Active | Activate or deactivate the log. Defaults to TRUE; when set to FALSE, no other parameters should be given | No |
| Max Size | The maximum size of the log file | No |
| Saved Requests | The list of requests we want to save (AddRequest, DelRequest, for instance). Defaults to ALL_REQUESTS | No |
| Saved Attribute | A list of attributes we want to store in the file. It may be associated with the previous selection. '*' and '+' will be used for 'all user attributes' and 'all operational attributes'. Defaults to '*' | No |
| Subentry | Defines the subentry we want the logger to apply when logging. The user may want to allow some users to log or not, or may define a set of entries which may be logged, etc. Defaults to null: everything is loggable by default | No |
| Rotate | Defines the way the file will be rotated: daily, when exceeding a given size, or after a certain period of time | No |

Once those parameters are defined, we will have to describe them using ASN.1 and implement the ExtendedOperation.
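Before committing to an ASN.1 description, the parameters and their defaults can be captured as a plain data structure. The Python sketch below is one possible reading of the parameter list: the `LoggerConfig` class and its field names are hypothetical, and the rule that no other parameter may accompany `active=False` is taken from the description of the Active parameter.

```python
from dataclasses import dataclass, fields
from typing import Optional, Tuple

ALL_REQUESTS: Tuple[str, ...] = ("ALL_REQUESTS",)


@dataclass(frozen=True)
class LoggerConfig:
    name: str                                      # mandatory
    active: bool = True
    max_size: Optional[int] = None                 # bytes; None means unlimited
    saved_requests: Tuple[str, ...] = ALL_REQUESTS
    saved_attributes: Tuple[str, ...] = ("*",)     # '*' user, '+' operational
    subentry: Optional[str] = None                 # None: everything is loggable
    rotate: Optional[str] = None                   # e.g. "daily" (assumed spelling)

    def __post_init__(self):
        if not self.name:
            raise ValueError("the logger name is mandatory")
        if not self.active:
            # Deactivation must not carry any other parameter.
            for spec in fields(self):
                if spec.name in ("name", "active"):
                    continue
                if getattr(self, spec.name) != spec.default:
                    raise ValueError(
                        "'" + spec.name + "' must not be set when active is FALSE"
                    )


audit = LoggerConfig(
    name="audit",
    saved_requests=("AddRequest", "DelRequest"),
    saved_attributes=("*", "+"),
    rotate="daily",
)
stop_audit = LoggerConfig(name="audit", active=False)
print(audit)
print(stop_audit)
```

Keeping the configuration immutable makes it safe to hand the same object to the interceptor and to whatever encodes the extended request.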

 
