...

This document describes design choices made regarding the ChangeLogService.

Glossary

Before describing this service, we need to define some terms precisely to avoid semantic confusion.

  • Transaction : <to be done>
  • Revision : <to be done>
  • Event Log : <to be done>
  • Transaction Log : <to be done>
  • Journal : <to be done>

Protocol Changes

  1. Add the revisions attribute to the Apache schema and allow this attribute to be added to entries
  2. Add the Proxied Authorization Control (needed for replay)
  3. Add extended operation to take a snapshot
  4. Add extended operation to rollback to a previous revision
  5. Add capability to request specific versions of an attribute using the LDAP rev tag

Representing Changes (deltas)

...


The whole point of this exercise was to define clearly what we need to track for a change. Obviously we need a unique, auto-incrementing, synchronized sequence from which to pull new revision numbers. Each number pulled from it is used as the revision of a change, and each revision has a change associated with it. Here's a list of what needs to be tracked for each change:

  • Revision Number: number assigned to the new state of the server once the change is applied
  • Forward LDIF: the LDIF applied to switch from S0->S1 (rev0->rev1)
  • Reverse LDIF: the LDIF applied to revert from S1->S0 (rev1->rev0)
  • Change Timestamp: the time the change occurred (GMT)
  • Principal: the distinguished name of the authorized user that made the change
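
As a rough illustration of the list above, a change record and the sequence that feeds its revision number might look like the following Java sketch. The names `RevisionSequence` and `ChangeRecord`, and the choice to hold LDIF and the principal as plain strings, are assumptions made for this example; they are not the actual server API.

```java
import java.time.Instant;
import java.util.concurrent.atomic.AtomicLong;

/** Hands out strictly increasing revision numbers; safe to call from several threads. */
final class RevisionSequence {
    private final AtomicLong current;

    RevisionSequence(long startRevision) {
        this.current = new AtomicLong(startRevision);
    }

    /** Reserves and returns the next revision number. */
    long next() {
        return current.incrementAndGet();
    }

    /** Returns the most recently assigned revision number. */
    long current() {
        return current.get();
    }
}

/** One tracked change, carrying the five items listed above (field names are illustrative). */
final class ChangeRecord {
    final long revision;       // revision of the server state once the change is applied
    final String forwardLdif;  // LDIF that moves the server from S0 to S1
    final String reverseLdif;  // LDIF that moves the server from S1 back to S0
    final Instant timestamp;   // time of the change (an Instant is always UTC)
    final String principalDn;  // DN of the authorized user that made the change

    ChangeRecord(long revision, String forwardLdif, String reverseLdif,
                 Instant timestamp, String principalDn) {
        this.revision = revision;
        this.forwardLdif = forwardLdif;
        this.reverseLdif = reverseLdif;
        this.timestamp = timestamp;
        this.principalDn = principalDn;
    }
}
```

An `AtomicLong` is only one way to get a synchronized sequence; a persistent implementation would also need to restore the last revision on startup.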

The following page presents the way reverse LDIFs are generated: Generating reverse LDIF

Change Stores


There are different levels at which we can implement this feature. I think we should allow pluggable implementations of the change log service that expose different levels of functionality. To enable this we have to design a few different interfaces for the service and its subcomponents. The following levels of functionality should be possible:

  • Basic Change Log: logs changes only (no snapshots)
  • Searchable Change Log: logs changes and allows searching on changes
  • Taggable Change Log: allows tagging for snapshots
  • Searchable and Taggable Change Log: allows tagging and taking snapshots with search capabilities

The same change log logic can be used with any of these stores: different components of a log store can be swapped out to provide varying capabilities, while the logic still applies tags and changes through the store interface. Let's take a look at some of the store interfaces, which really act as an SPI for this subsystem.

There are four kinds of stores represented:

  • ChangeLogStore: the simplest kind of change log store that can be implemented (just for logging)
    • Primary method is
      long log( Principal dn, Entry forward, Entry reverse )
    • Another method exists to get the current revision number (perhaps this should be published in the RootDSE)
  • TaggableChangeLogStore: this store allows for tagging for snapshots
    • Two tag() methods exist: one generates a tag on the current revision, the other on a revision in the past
  • SearchableChangeLogStore: a simple store which exposes access to a search engine over change log events
  • SearchableTaggableLogStore: a taggable log store which enables searching over changes and tags
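
To make the layering of the four stores concrete, here is a minimal Java sketch of how the interfaces could relate to one another. Only `log(...)` and the interface names come from the text above; plain strings stand in for the `Principal` and `Entry` types, and `getCurrentRevision()`, the `tag(...)` signatures, `findRevisionsByPrincipal(...)` and the in-memory class are hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Simplest store: logs changes only. */
interface ChangeLogStore {
    /** Records a change and returns the revision assigned to it. */
    long log(String principalDn, String forwardLdif, String reverseLdif);

    long getCurrentRevision();
}

/** Adds tagging for snapshots. */
interface TaggableChangeLogStore extends ChangeLogStore {
    long tag(String description);                 // tags the current revision
    long tag(long revision, String description);  // tags a revision in the past
}

/** Adds search over logged changes (the search method here is a placeholder). */
interface SearchableChangeLogStore extends ChangeLogStore {
    List<Long> findRevisionsByPrincipal(String principalDn);
}

/** Combines tagging and searching. */
interface SearchableTaggableLogStore extends TaggableChangeLogStore, SearchableChangeLogStore {
}

/** Toy in-memory store, only meant to show the contract in action. */
final class InMemoryChangeLogStore implements TaggableChangeLogStore {
    private final List<String[]> changes = new ArrayList<>();
    private final Map<Long, String> tags = new TreeMap<>();

    @Override
    public synchronized long log(String principalDn, String forwardLdif, String reverseLdif) {
        changes.add(new String[] { principalDn, forwardLdif, reverseLdif });
        return changes.size(); // revision N is the state after the N-th change
    }

    @Override
    public synchronized long getCurrentRevision() {
        return changes.size();
    }

    @Override
    public synchronized long tag(String description) {
        return tag(getCurrentRevision(), description);
    }

    @Override
    public synchronized long tag(long revision, String description) {
        if (revision < 0 || revision > getCurrentRevision()) {
            throw new IllegalArgumentException("unknown revision " + revision);
        }
        tags.put(revision, description);
        return revision;
    }

    synchronized String getTag(long revision) {
        return tags.get(revision);
    }
}
```

Because every richer interface extends `ChangeLogStore`, the change log logic can be written once against the base interface and check for the optional capabilities at runtime.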

...

# REVISION: 1234
# PRINCIPAL: uid=admin,ou=system
# TIMESTAMP: 200706202342343Z
dn: cn=jane doe,ou=users,ou=system
objectClass: top
objectClass: person
cn: jane doe
sn: doe
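
A record in this shape could be produced and read back with something like the Java sketch below. The class `ChangeLogEntryFormat` and its methods are hypothetical; the sketch copies only the three comment headers shown in the sample and takes the timestamp as an already formatted string rather than guessing its format.

```java
/** Writes and reads the comment headers used in the sample record above. */
final class ChangeLogEntryFormat {
    private static final String REVISION_PREFIX = "# REVISION: ";

    /** Prepends the three comment headers to an LDIF body and terminates the record. */
    static String format(long revision, String principalDn, String timestamp, String ldif) {
        StringBuilder sb = new StringBuilder();
        sb.append(REVISION_PREFIX).append(revision).append('\n');
        sb.append("# PRINCIPAL: ").append(principalDn).append('\n');
        sb.append("# TIMESTAMP: ").append(timestamp).append('\n');
        sb.append(ldif);
        if (!ldif.endsWith("\n")) {
            sb.append('\n');
        }
        sb.append('\n'); // a blank line separates consecutive LDIF records
        return sb.toString();
    }

    /** Extracts the revision number from the REVISION comment of a record. */
    static long parseRevision(String entry) {
        for (String line : entry.split("\n")) {
            if (line.startsWith(REVISION_PREFIX)) {
                return Long.parseLong(line.substring(REVISION_PREFIX.length()).trim());
            }
        }
        throw new IllegalArgumentException("no REVISION comment found");
    }
}
```

Keeping the metadata in LDIF comments means the files stay valid LDIF that any LDIF reader can consume, while a purpose-built reader can still recover the revision.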

The revision numbers, together with each record's relative position from the tail or head of the file, tie the forward and reverse LDIF together. This can be used to selectively extract the changes needed to revert the server to some state other than the start state (revision = 0). Obviously this is not the highest-performance implementation one can have, but we are not going to perform any complex search operations over this data.
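
The selective extraction described above can be sketched as follows, assuming the reverse LDIF records have already been read into a map keyed by revision. The class `RevertPlanner` and its method are hypothetical names; the point is only the ordering: to go from the current revision back to a target revision, the reverse LDIFs are applied newest first, stopping just above the target.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Picks the reverse LDIFs needed to roll the server back to an earlier revision. */
final class RevertPlanner {

    /**
     * Returns the reverse LDIFs for revisions current, current-1, ..., target+1,
     * in the order they must be applied.
     */
    static List<String> reverseLdifsToApply(Map<Long, String> reverseByRevision,
                                            long current, long target) {
        if (target < 0 || target > current) {
            throw new IllegalArgumentException("target must be between 0 and " + current);
        }
        List<String> plan = new ArrayList<>();
        for (long rev = current; rev > target; rev--) {
            String reverse = reverseByRevision.get(rev);
            if (reverse == null) {
                throw new IllegalStateException("missing reverse LDIF for revision " + rev);
            }
            plan.add(reverse);
        }
        return plan;
    }
}
```

For example, reverting from revision 3 to revision 1 applies the reverse LDIF of revision 3 and then that of revision 2; the record for revision 1 is left alone because revision 1 is the state being restored.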

Another thing this ChangeLogStore can do is back up the log files once they reach a certain size. I don't recommend doing this since it does not really serve a purpose. All the records are needed for history, so you don't want to delete the files anyway. Keeping backups of smaller files just creates the problem of figuring out the application order across separate files. That can be solved with prefixes in the file name listing the start and stop revision numbers, but it is not worth the hassle.

Semantics

...

Also note that if the history is cleared you should be able to start from zero again presuming the start

...

you're in to be the start state: this might be useful for staging changes that do not require the entire history.

Tip
Appending to Compressed Log Files

The best feature one can add to such an implementation is to store the log files in zipped form and insert new entries into them without having to expand the entire file. This, however, is the only feature worth adding to such a simple implementation.
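
One possible way to get this behaviour, assuming the gzip format is acceptable as the "zipped form", is to write every appended entry as its own gzip member at the end of the file. The gzip format allows members to be concatenated, and `java.util.zip.GZIPInputStream` reads such a file as one continuous stream. The class `CompressedLogFile` below is a hypothetical sketch of that idea, not an existing component.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

/** Appends entries to a gzip file without decompressing what is already there. */
final class CompressedLogFile {

    /** Writes the entry as a new gzip member at the end of the file. */
    static void append(Path file, String entry) throws IOException {
        try (OutputStream raw = Files.newOutputStream(file,
                     StandardOpenOption.CREATE, StandardOpenOption.APPEND);
             GZIPOutputStream gzip = new GZIPOutputStream(raw)) {
            gzip.write(entry.getBytes(StandardCharsets.UTF_8));
        }
    }

    /** Reads all lines back; concatenated gzip members are read as one stream. */
    static List<String> readLines(Path file) throws IOException {
        List<String> lines = new ArrayList<>();
        if (Files.notExists(file) || Files.size(file) == 0) {
            return lines; // nothing logged yet
        }
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(
                new GZIPInputStream(Files.newInputStream(file)), StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        }
        return lines;
    }
}
```

The trade-off is compression ratio: each member is compressed on its own, so many very small appends compress worse than one large stream. Batching several entries per append would reduce that cost.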

A separate file can be used to track snapshots. A simple properties file is enough for this: the key is the revision number of the snapshot tag and the value is the description of the snapshot. This file will probably not grow very large at all. Another file can be used to persist the current revision, or a pointer could be kept on the head or tail to quickly read the REVISION information in the comments of the last entry added to either the forward or reverse LDIF file.
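
The snapshot file described above maps naturally onto `java.util.Properties`. The following is a minimal sketch under that assumption; the class `SnapshotTagFile` and its method names are hypothetical, and the file is rewritten in full on every tag, which should be acceptable for a file that stays small.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Properties;

/** Keeps snapshot tags in a properties file: key = revision number, value = description. */
final class SnapshotTagFile {
    private final Path file;

    SnapshotTagFile(Path file) {
        this.file = file;
    }

    /** Adds or replaces the tag for a revision and rewrites the file. */
    void tag(long revision, String description) throws IOException {
        Properties tags = load();
        tags.setProperty(Long.toString(revision), description);
        try (Writer out = Files.newBufferedWriter(file, StandardCharsets.UTF_8)) {
            tags.store(out, "snapshot tags: revision = description");
        }
    }

    /** Returns the description for a tagged revision, or null if it is not tagged. */
    String describe(long revision) throws IOException {
        return load().getProperty(Long.toString(revision));
    }

    private Properties load() throws IOException {
        Properties tags = new Properties();
        if (Files.exists(file)) {
            try (Reader in = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
                tags.load(in);
            }
        }
        return tags;
    }
}
```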

For the time being the change log should be a simple interceptor that is added to the interceptor chain, either through Spring configuration or programmatically. It should be disabled by default. Users who want server versioning can enable it by uncommenting the interceptor.

Configuration information may be needed for the following possible settings:

  • changeLogDirectory [path url] - where to put the changelog files
  • compressChangeLog [boolean] - whether or not to keep the change log LDIF files compressed