ApacheDS Bootstrapping

Introduction

This document describes the process where ApacheDS is first started up, starts to initialize various components, to finally reach a solid state of operation where it can service LDAP request along with other directory backed protocol requests.

We break up this process into several different categories as phases in the bootstrapping process, since each phase involves a different set of facilities and components. The phases are:

Daemon Bootstrapping Phase
Initial Bootstrap Schema Load Phase
Full Schema Load Phase
Configuration Load Phase
Wiring and DI Phase

Daemon Bootstrapping Phase

Several daemon frameworks have been used in the past to make ApacheDS' process act like a daemon on *NIX platforms and as a service on Windows platforms. These frameworks were Procrun, Jsvc, and the Tanuki wrapper.

Each framework has a means to start up a Java program and set it up so it detaches from the console and becomes a daemon on UNIX and/or a service on Windows. How this is done is abstracted away by some minimal code in ApacheDS used to bootstrap it. Some of this code is very specific to the wrapper and some is quit general.

Bootstrappers

In ApacheDS' repository there's a daemon-bootstrappers project which contains a base Bootstrapper implementation and some helper classes along with various framework subtypes of the Bootstrapper.

The goal here was to adapt the set of events handled by various frameworks to startup a Java based daemon, to a generalized interface. In this case the generalized interface is really a class called Bootstrapper.

Tanuki Bootstrapper

The Tanuki framework has a TanukiBootstrapper implementation extending the Bootstrapper class. On startup the Tanuki framework calls the main(String[]) method of this TanukiBootstrapper. The apacheds.conf file contains configuration entries telling Tanuki to start ApacheDS by calling this class' main method with various arguments.

The static main() method in the TanukiBootstrapper simply instantiates a new TanukiBootstrapper instance and asks the wrapper framework (WrapperManager) to start up the TanukiBootstrapper. Here are the String[] args handed off to the main() method:

/usr/local/apacheds-1.5.5-SNAPSHOT, 
org.apache.directory.server.Service, 
/usr/local/apacheds-1.5.5-SNAPSHOT/instances/default/conf/server.xml

Future Work

We envision a configuration free of Maven plugins to generate the installers. We also would like to modify the InstallationLayout bean and the InstanceLayout bean so that the InstanceLayout bean has a handle on the InstallationLayout. The InstanceLayout bean will be provided to higher layers for the server to properly conduct it's bootstrap process in later phases.

Initial Bootstrap Schema Load Phase

We store the schema in a partition. We also store configuration state information regarding which schema is to be enabled and which schema are disabled in this partition. Everything that is needed for loading and setting up the schema subsystem is located in this partition.

Also this partition can use any kind of backing store, or Partition interface implementation so those wishing to store schema information in various backing stores can do so.

Pre-packaged Schema LDIF Files

The ldap-schema project in shared produces a jar file which contains LDIF files within in a specific layout. These LDIF files encode schema information for various published schema as well as a minimal set of critical schema specifically required for ApacheDS operation.

The layout in the jar file contains a top level schema directory. Under this there is a schema.ldif file and yet another schema directory. This ldif and the directory is analogous to the ou=schema context in ApacheDS.

Under this second schema directory you'll find a set of directories with the names of various schema in lower case:

apache
apachedns
apachemeta
autofs
collective
corba
core
cosine
dhcp
inetorgperson
java
krb5kdc
mozilla
nis
other
samba
system

Along side each of these schema directories rests an LDIF file with the same name plus the .ldif extension. So next to the java directory you'll find java.ldif.

Under each schema directory there is a standard structure for storing the various kinds of schema entities: syntaxCheckers, syntaxes, comparators, normalizers, matchingRules, attributeTypes, objectClasses, nameForms, matchingRuleUses, ditStructureRules, and ditContentRules.

For each kind of schema entity you have a directory with the name of the entities above. Like for example there is a schema/schema/java/attributeTypes directory in the java schema for storing LDIF files of each attributeType in this schema. For the same name of the this directory you have an LDIF file beside it: schema/schema/java/attributeTypes.ldif. Inside each schema entity directory you'll have LDIF files. The LDIF file will use the OID of the schema object as the base file name and the standard LDIF file extension. So for example in the java schema you will have this file: schema/schema/java/attributeTypes/1.3.6.1.4.1.42.2.27.4.1.10.ldif. The file for example contains the following LDIF:

dn: m-oid=1.3.6.1.4.1.42.2.27.4.1.10,ou=attributeTypes,cn=java,ou=schema
m-equality: caseExactMatch
m-usage: USER_APPLICATIONS
objectclass: metaAttributeType
objectclass: metaTop
objectclass: top
createtimestamp: 20090818022732Z
creatorsname: uid=admin,ou=system
m-name: javaFactory
m-oid: 1.3.6.1.4.1.42.2.27.4.1.10
m-singlevalue: TRUE
m-description: Fully qualified Java class name of a JNDI object factory
entryuuid:: bVjvv73vv70Q77+9Q8ySWmNj77+977+9dwM=
m-collective: FALSE
m-obsolete: FALSE
m-nousermodification: FALSE
entrycsn: 20090818052732.339000Z#000000#000#000000
m-syntax: 1.3.6.1.4.1.1466.115.121.1.15

The schema used to encode the schema itself is called the apachemeta schema. These m- attributes and the metaXXX values you see for objectClass all come from the apachemeta schema. So as you can see to properly load and make sense of this data you at first need to know the apachemeta schema and this presents a slight chicken and egg problem for us that we must overcome.

So this apacheds-ldap-schema-[version].jar file has all these LDIF files zipped up inside it. In addition it contains a special extractor which will extract out this zipped schema repository onto disk.

Solving the Chicken and Egg Problem

As we stated earlier we have a slight chicken and egg issue to overcome. To load the schema into registries from a partition containing this information we have to be aware at least of the apachemeta schema and the other schemas it depends on. This dependencies pull in the following schema as the critical set of schemas required to read and bootstrap all schema:

apache
apachemeta
core
system

Let's call this critical schema set, the bootstrap schema set. We need to load a set of registries with this schema set in order to initialize a partition containing entries storing all the schema information to be loaded. Remember our server is designed to store schema in a partition which is accessible via LDAP in this well organized structure in addition to using the standard LDAP mechanism which uses a schemaSubentry.

So how do we do this? These 4 schema are so critical that they are read only. No administrator should ever be allowed to change the elements in the bootstrap schema set. For this reason we are comfortable loading these critical elements from the ldap-schema jar without extracting them to disk. We also read them without having active registries.

Without active registries schema checking is not possible. Nor is schema integrity checks possible which make sure the schema is right before loading it. Instead we just load and hope all goes well in the end with a post load integrity check. This however is not a major issue if no one is allowed to change the bootstrap schema set.

So the initial bootstrap schema loading stage loads these 4 critical schema into the registries. This is done by a special Partition implementation called SchemaPartition. The SchemaPartition uses a special JarLdifSchemaLoader class to load these four schema into a Registries object.

The SchemaPartition itself does not store schema data in some persistent store. Instead this Partition delegates persistence to a wrapped Partition that it contains. The idea here is to allow anyone to replace the underlying storage used to store schema data in ApacheDS. Hence any Partition implementation such as the, OraclePartition, JdbmPartition, or LdifPartition can be plugged in.

Regardless of the Partition implementation used, the partition must be self configuring. Meaning it must not require an external configuration to come online and provide access to the schema data it stores. This is because when this partition is brought online, the configuration partition of ApacheDS is not yet available.

Full Schema Load Phase

The SchemaPartition uses the initial critical set of bootstrap schema to initialize it's wrapped Partition, whatever implementation it may be. The SchemaPartition delegates Partition interface operations to this wrapped Partition while also updating the registries to reflect the changes to schema during solid state operation of the server.

Once the wrapped Partition with our schema content in the expected DIT namespace is initialized the second full schema load phase can begin. In this phase, a second schema loader is incorporated to load the schema from within a Partition. This loader uses the Partition interface methods to search the Partition and load the Registries with all the schema enabled in this partition. For example in the ou=schema,cn=java entry you'll find the same content that was bundled into the ldap-schema jar LDIF schema/schema/java.ldif:

dn: cn=java,ou=schema
entryuuid:: 77+9Fxdi77+9e07vv73vv73vv73vv73vv70g77+9aFQ=
creatorsname: uid=admin,ou=system
createtimestamp: 20090818022726Z
cn: java
objectclass: metaSchema
objectclass: top
entrycsn: 20090818052726.390000Z#000000#000#000000
m-dependencies: system
m-dependencies: core
m-disabled: FALSE

NOTE the m-disable attribute is set to FALSE. This LDIF or entry in the wrapped Partition tells the PartitionSchemaLoader that it must load the Java schema into the registries in this full schema load phase. The first 4 schema in the critical bootstrap schema set do not need to be reloaded since they are already enabled and present in the Registries object due to the initial bootstrap schema load phase.

Child pages