In Proposal for Geode Modularization, high-level goals were proposed. This section explains in greater detail how these goals will be realized. For each goal, it proposes technologies and describes how they would affect the implementation.
Well-defined modules provide the following benefits:
A single module will contain all code to complete its function
Can be composed of many smaller modules to create a single larger module
Has a well-defined public interface
Internal implementation is “hidden” from external modules
Can be tested in seclusion as a black box
Behavior can be clearly defined and tested
Reduces the domino effect that tightly coupled applications experience
Easier to improve/maintain/upgrade single modules without affecting other modules
Defines and contains all required dependencies
Separation of Concerns
With each module defining all of its dependencies (both external module and library) it is highly beneficial to not have one module’s dependencies “spill over” affecting another module. To avoid dependencies affecting one another one has two approaches:
Make sure that all modules use the same library versions
Make sure that each module runs in its own “world”
At the beginning of an application’s life, it is simple to control the versions of all libraries, with each module using the same library version. Inevitably, as modules start evolving at their own pace, newer versions of libraries or different libraries will be required. Strict controls would then need to be put in place to avoid library version mismatches, and module evolution would inevitably suffocate under these controls.
The other option is ClassLoader isolation/separation. This is the more viable option, as it allows each module to load only the classes that are related to it. Modules can then define library version dependencies independently of any other module’s library dependencies.
In most applications ClassLoader isolation is not required, because all dependencies are well contained. In Geode’s case, ClassLoader isolation would be a good fit, as each module can be a self-contained component. In addition, application code that is deployed into Geode can include different versions of libraries used by Geode without conflict.
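The visibility rule behind ClassLoader isolation can be illustrated with plain JDK classes. The following is a minimal sketch, not Geode or JBoss-Modules code: `IsolationSketch` and `canSee` are hypothetical names, and the loader is given no JARs at all so that the example stays self-contained. A loader created this way can resolve only what its parent (here the bootstrap loader) and its own URLs provide.

```java
import java.io.IOException;
import java.net.URL;
import java.net.URLClassLoader;

// Hypothetical demo class; not part of Geode or JBoss-Modules.
final class IsolationSketch {

    // Returns true when the given loader can resolve the named class.
    static boolean canSee(ClassLoader loader, String className) {
        try {
            loader.loadClass(className);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        // A loader with no JARs of its own and the bootstrap loader as parent
        // (parent == null). A real module would list its own JARs in the URL array.
        try (URLClassLoader moduleLoader = new URLClassLoader(new URL[0], null)) {
            // Core JDK classes are visible through the bootstrap loader.
            System.out.println(canSee(moduleLoader, "java.util.List"));
            // Application classes are not visible, because the application
            // class path is not part of this loader's world.
            System.out.println(canSee(moduleLoader, IsolationSketch.class.getName()));
        }
    }
}
```

Two such loaders, each pointed at a different version of the same library JAR, would load that library independently of each other, which is the property the proposal relies on.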
After some research, it was found that JBoss-Modules addresses all the ClassLoader isolation and dependency isolation concerns.
“JBoss-Modules is a standalone implementation of a modular (non-hierarchical) class loading and execution environment for Java. In other words, rather than a single class loader which loads all JARs into a flat classpath, each library becomes a module which only links against the exact modules it depends on, and nothing more. It implements a thread-safe, fast, and highly concurrent delegating class loader model, coupled to an extensible module resolution system, which combines to form a unique, simple and powerful system for application execution and distribution.
JBoss Modules is designed to work with any existing library or application without changes. Its simple naming and resolution strategy are what makes that possible. Unlike OSGi, JBoss Modules does not implement a container; rather, it is a thin bootstrap wrapper for executing an application in a modular environment. The moment your application takes control, the modular environment is ready to load and link modules as needed. Furthermore, modules are never loaded (not even for resolution purposes) until required by a dependency, meaning that the performance of a modular application depends only on the number of modules actually used (and when they are used), rather than the total number of modules in the system. And, they may be unloaded by the user at any time.” -- GitHub JBoss-module description
With the introduction of modularity, the notion of modular dependency management has been introduced. As each module can depend on other modules, the management of loading these modules becomes critical. The system needs to be able to determine when to load modules, based on their dependency graphs. Without this capability, it is left to the user to determine the dependency graph.
Maven and Gradle are well-known tools in the library dependency management space. Both tools allow applications to specify required libraries without having to list all dependent libraries for the required libraries. This allows developers to use/upgrade libraries without any knowledge of the dependencies of libraries in question.
Modules need to behave in the same manner. A user/application needs to be able to list a module and version without the concern of loading and managing any dependencies other than their own.
JBoss-Modules will not only provide classloader isolation, but also the capability to define modules and their dependencies. It will also manage the loading and resolution of the dependent libraries or modules as required.
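To make the idea of transitive module resolution concrete, here is a minimal sketch in plain Java. It is not the JBoss-Modules API: `ModuleResolver` and the module names are hypothetical, and the dependency graph is supplied as a simple map. The caller names one module and receives everything that module needs, directly or indirectly.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical resolver; module names below are made up for illustration.
final class ModuleResolver {
    // module -> modules it directly depends on
    private final Map<String, List<String>> directDependencies;

    ModuleResolver(Map<String, List<String>> directDependencies) {
        this.directDependencies = directDependencies;
    }

    // Returns the requested module plus all of its transitive dependencies.
    Set<String> resolve(String requested) {
        Set<String> resolved = new TreeSet<>();
        Deque<String> pending = new ArrayDeque<>();
        pending.push(requested);
        while (!pending.isEmpty()) {
            String module = pending.pop();
            List<String> dependencies = directDependencies.get(module);
            if (dependencies == null) {
                throw new IllegalArgumentException("Unknown module: " + module);
            }
            // Only expand a module the first time it is seen; this also
            // keeps the loop from running forever on a dependency cycle.
            if (resolved.add(module)) {
                dependencies.forEach(pending::push);
            }
        }
        return resolved;
    }

    public static void main(String[] args) {
        Map<String, List<String>> graph = Map.of(
            "module-a", List.of("module-b", "module-c"),
            "module-b", List.of("module-c"),
            "module-c", List.of());
        // The caller lists only module-a; module-b and module-c follow automatically.
        System.out.println(new ModuleResolver(graph).resolve("module-a"));
    }
}
```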
Dependency Injection (DI) is the most ubiquitous form of Inversion of Control. The most commonly known DI implementations in the Java world are Spring, Google Guice and CDI (Java’s own specification). Although Spring and Google Guice are very capable, they are bespoke and do not allow other frameworks to be used without significant refactoring. CDI, on the other hand, is a Java specification with two common implementations: Weld and Apache OpenWebBeans.
To keep this short, all of the DI/CDI technologies provide dependency injection. Given the goal of ClassLoader isolation, some of the frameworks will not work. Both Spring and Google Guice, by design, neither implement nor support ClassLoader isolation. This leaves only the CDI implementations, which work well with JBoss-Modules and its ClassLoader isolation.
The main CDI implementations are Weld and Apache OpenWebBeans. At this stage, it is unclear which of these two implementations will be used, as both are ASF2 licensed and both are actively developed. Weld is close to releasing v3.0, which will include the CDI 2.0 specification, whereas Apache OpenWebBeans only supports CDI v1.2.
CDI provides a mechanism to intercept method invocations using either an Interceptor or a Decorator. In essence, the two constructs are similar, as both intercept method invocations and run an action. The difference is that Interceptors are used for cross-cutting concerns that span many classes, such as logging and security, whereas Decorators are type-safe and specific to one type, and are usually used to enhance some business logic for that type.
Interceptors are powerful constructs: functionality can be added to run before any method, without adding intrusive code to every component/class.
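The interception idea can be sketched without a CDI container. The example below deliberately swaps in the JDK's `java.lang.reflect.Proxy` as a stand-in for a CDI Interceptor, so it shows the principle rather than the CDI annotations themselves. `RegionOperations`, `InterceptorSketch` and `withCallLog` are hypothetical names.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

// Hypothetical component interface, used only for this illustration.
interface RegionOperations {
    String get(String key);
}

final class InterceptorSketch {

    // Wraps the target so that every call is recorded in the log before it runs.
    // The target itself contains no logging code.
    static <T> T withCallLog(T target, Class<T> type, List<String> log) {
        InvocationHandler handler = (proxy, method, methodArgs) -> {
            log.add("before " + method.getName()); // the cross-cutting action
            try {
                return method.invoke(target, methodArgs);
            } catch (InvocationTargetException e) {
                throw e.getCause(); // rethrow the component's own exception
            }
        };
        Object wrapped = Proxy.newProxyInstance(
            type.getClassLoader(), new Class<?>[] {type}, handler);
        return type.cast(wrapped);
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        RegionOperations plain = key -> "value-of-" + key;
        RegionOperations intercepted = withCallLog(plain, RegionOperations.class, log);
        System.out.println(intercepted.get("k1")); // value-of-k1
        System.out.println(log);                   // [before get]
    }
}
```

With CDI, the same effect would be declared through an interceptor binding rather than by wrapping objects by hand.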
Using Interceptors will allow for the simple addition of:
The module dependency manager (MDM) is responsible for the management of dependencies between modules. When managers start, they register themselves with the MDM, providing their contextual path and the modules they depend on.
The MDM will build up a dependency model that is used when starting and stopping instances. The MDM will use the dependency model to prevent dependencies from starting or stopping in the wrong order.
The manager is an MBean and comprises four pieces:
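One way to picture the dependency model is as a registry that can produce a safe start order and a safe stop order. This is a minimal sketch under assumptions: the proposal does not define an MDM API, so `ModuleDependencyManager`, `register`, `startOrder` and `stopOrder` are hypothetical, and contextual paths are plain strings.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical MDM model; the proposal does not define this API.
final class ModuleDependencyManager {
    // contextual path of a manager -> contextual paths it depends on
    private final Map<String, List<String>> dependsOn = new LinkedHashMap<>();

    void register(String contextualPath, List<String> dependencies) {
        dependsOn.put(contextualPath, new ArrayList<>(dependencies));
    }

    // Dependencies come first, so nothing starts before what it depends on.
    List<String> startOrder() {
        Set<String> ordered = new LinkedHashSet<>();
        for (String path : dependsOn.keySet()) {
            visit(path, new LinkedHashSet<>(), ordered);
        }
        return new ArrayList<>(ordered);
    }

    // Reverse of the start order, so nothing stops while something still depends on it.
    List<String> stopOrder() {
        List<String> order = startOrder();
        Collections.reverse(order);
        return order;
    }

    private void visit(String path, Set<String> inProgress, Set<String> ordered) {
        if (ordered.contains(path)) {
            return;
        }
        if (!inProgress.add(path)) {
            throw new IllegalStateException("Dependency cycle at " + path);
        }
        // A dependency that never registered is treated as having no dependencies.
        for (String dependency : dependsOn.getOrDefault(path, List.of())) {
            visit(dependency, inProgress, ordered);
        }
        inProgress.remove(path);
        ordered.add(path);
    }

    public static void main(String[] args) {
        ModuleDependencyManager mdm = new ModuleDependencyManager();
        mdm.register("managerA", List.of("managerB"));
        mdm.register("managerB", List.of());
        System.out.println(mdm.startOrder()); // [managerB, managerA]
        System.out.println(mdm.stopOrder());  // [managerA, managerB]
    }
}
```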
A list of parent managers with which it has a “depends-on” relationship, “injected” as part of the configuration lifecycle.
Configuration used to create an Instance
The reference to the “managed” instance
Operations, metrics and configuration methods
A manager will provide a single API, via JMX, for the configuration and management of an instance. The configuration service will interact with the manager to add/change/update the configuration of the instance.
Each manager is responsible for the starting of its referenced instance (creating an instance) based upon the stored configuration. The manager will interact with the MDM to get a list of its direct dependents and request that they start themselves. Once all dependent modules have successfully started, it will create its managed instance based upon the stored configuration.
This proposal adds to Geode the concept of a manager. A manager should be seen as the owner of an instance. Managers and instances have a 1:1 relationship: every instance will have a manager associated with it. For example, RegionA will have a RegionManager containing all of its configuration, and RegionB will have its own RegionManager containing all of the configuration needed to instantiate RegionB. There are two managers, not one manager for two regions.
The Manager is responsible for the configuration, starting and stopping of the Instance. It will be the single path to maintain an instance.
The instance is merely a reflection of the configuration. All component dependency injections will be handled by the manager. The manager will then decide how to register a dependent component. This is true for both the configuration stage (pre-startup) and any runtime changes.
The manager will also expose all metrics for its underlying instance, via JMX. This provides a single location from which metrics can be collected for any component.
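The manager/instance relationship described above can be sketched as follows. This is an illustration under assumptions, not a proposed API: `InstanceManager` is a hypothetical name, the instance is created by a caller-supplied factory, and the JMX exposure mentioned in the text is left out to keep the example short.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical manager; T is the type of the managed instance.
final class InstanceManager<T> {
    private final Map<String, String> configuration = new LinkedHashMap<>();
    private final Function<Map<String, String>, T> factory;
    private T instance; // null while no instance exists

    InstanceManager(Function<Map<String, String>, T> factory) {
        this.factory = factory;
    }

    // The single path for configuration changes.
    void configure(String key, String value) {
        configuration.put(key, value);
    }

    // "Starting" creates the instance from the stored configuration.
    T start() {
        if (instance == null) {
            instance = factory.apply(new LinkedHashMap<>(configuration));
        }
        return instance;
    }

    // "Stopping" destroys the instance; the configuration is kept.
    void stop() {
        instance = null;
    }

    boolean hasInstance() {
        return instance != null;
    }

    public static void main(String[] args) {
        // One manager per region: two regions means two managers.
        InstanceManager<String> regionA = new InstanceManager<>(cfg -> "region:" + cfg.get("name"));
        InstanceManager<String> regionB = new InstanceManager<>(cfg -> "region:" + cfg.get("name"));
        regionA.configure("name", "RegionA");
        regionB.configure("name", "RegionB");
        System.out.println(regionA.start()); // region:RegionA
        System.out.println(regionB.start()); // region:RegionB
    }
}
```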
The configuration service is responsible for processing configuration and then creating the Manager that represents the relevant configuration.
The configuration service has a single API that allows for the configuration of the system. The configuration service provides the ability to plug in different ConfigHandlers. These ConfigHandlers provide the ability to configure the system using different mechanisms, like XML, REST, Spring or any other custom handler.
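A pluggable handler arrangement of this kind could look roughly like the sketch below. All names and signatures here (`ConfigHandler.supports`, `ConfigHandler.parse`, `KeyValueConfigHandler`, `ConfigurationService.configure`) are assumptions made for illustration. A trivial `key=value` format stands in for XML, REST or Spring, and the creation of Managers from the parsed settings is omitted.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical handler contract: turns one configuration format into key/value settings.
interface ConfigHandler {
    boolean supports(String format);

    Map<String, String> parse(String source);
}

// Hypothetical handler for a trivial "key=value;key=value" format.
final class KeyValueConfigHandler implements ConfigHandler {
    @Override
    public boolean supports(String format) {
        return "kv".equals(format);
    }

    @Override
    public Map<String, String> parse(String source) {
        Map<String, String> settings = new LinkedHashMap<>();
        for (String pair : source.split(";")) {
            String[] parts = pair.split("=", 2);
            if (parts.length == 2) {
                settings.put(parts[0].trim(), parts[1].trim());
            }
        }
        return settings;
    }
}

final class ConfigurationService {
    private final List<ConfigHandler> handlers = new ArrayList<>();

    void addHandler(ConfigHandler handler) {
        handlers.add(handler);
    }

    // Single entry point: the first handler that understands the format is used.
    Map<String, String> configure(String format, String source) {
        for (ConfigHandler handler : handlers) {
            if (handler.supports(format)) {
                return handler.parse(source);
            }
        }
        throw new IllegalArgumentException("No ConfigHandler for format: " + format);
    }

    public static void main(String[] args) {
        ConfigurationService service = new ConfigurationService();
        service.addHandler(new KeyValueConfigHandler());
        System.out.println(service.configure("kv", "name=RegionA; type=PARTITION"));
    }
}
```

Supporting another mechanism would then mean adding another `ConfigHandler`, without changing the service itself.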
Bootstrapping is an automated process through which a system is started without any external intervention. The bootstrapping component will be responsible for:
Coordinating the component configuration
Starting/stopping components and their dependents
The image below depicts the lifecycle that the configuration of a component will follow.
Managers are configured via the configuration service. (This can be in the form of XML, Spring, GFSH, REST, ...)
In the case that a manager requires more managers to be created, they will be created recursively at this step. E.g. the Lucene module requires its own regions to be created for internal management.
Once all configuration has been completed, a manager can be started. Starting a manager means that an instance is created from the configuration stored within the manager. As a manager does not itself have a started/stopped state, starting or stopping a manager means that an instance is created or destroyed.
Typically the “root manager” is asked to start, which means that, recursively, all managers are eventually requested to start. It is also possible to start any individual manager, in which case all dependent managers for that manager will be started to fulfill the request.
As part of the start phase, the manager will ask each of its dependent managers to start. This is a recursive step until a manager is found that can be started, as it has no more dependents. Once the instance is created, the manager can add the newly created instance to the parent manager’s configuration (for example, a CacheListener being added to the region configuration).
The creating of instances and configuration of instances into parent configurations will “unwind” and eventually the root manager has no more dependent managers that need to be started. The root manager will then start and the system is available.
The starting of a Manager is explained in the following steps:
Get the component Manager
Determine if the Manager has dependent managers
If there are dependents, iterate over dependent managers and recursively ask them to start themselves
If no dependents exist, create an instance of the component using the configuration stored on the manager
*Optional: After creating the instance, the instance can then be used and injected into the configuration of a “parent” manager, to be used for its creation.
A component is considered “started” when it has an instance and all its dependents have instances.
ModuleA → ModuleB, ModuleC
ModuleB → ModuleC
ModuleC → no dependencies
We'll assume that each module contains a single component, corresponding to the module name.
The AppManager will create 3 Component Managers:
The Bootstrap lifecycle will now ask each manager to start.
Manager A is asked to start, but it has 2 dependencies, Manager B and Manager C. Iterating over the dependencies, Manager B is asked to start first. It too has a dependency, Manager C, which is then asked to start. Manager C does not have any dependencies, so it starts C`. After starting C`, Manager C will request Manager B to add C` to its configuration.
With Manager C having completed, Manager B is now able to start. It creates B` and Manager B will request Manager A to add B` to its configuration.
With Manager B having completed, Manager A can now iterate to Manager C and ask it to start. As it has already started, Manager C will simply request Manager A to add C` to its configuration.
Manager A can now start, given that its 2 dependents have started. Manager A then starts A`.
A` will contain references to both B` and C`, as per its configuration. B` will contain a reference to C`, as per its configuration.
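The walkthrough above can be replayed in a few lines of Java. This is a minimal sketch: `ComponentManager` is a hypothetical class, an "instance" is reduced to a label such as C`, and a dependency hands its instance back as a return value rather than calling the parent manager, which is a simplification of the request described in the text. Dependency cycles are not handled.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical manager for the A/B/C example; an "instance" is just a label here.
final class ComponentManager {
    private final String name;
    private final List<ComponentManager> dependencies = new ArrayList<>();
    // Instances that dependencies have contributed to this manager's configuration.
    private final List<String> configuredInstances = new ArrayList<>();
    private String instance; // null until started

    ComponentManager(String name) {
        this.name = name;
    }

    void dependsOn(ComponentManager other) {
        dependencies.add(other);
    }

    // Starts all dependencies first, then creates this manager's instance.
    String start(List<String> startLog) {
        for (ComponentManager dependency : dependencies) {
            String created = dependency.start(startLog);
            // The dependency's instance is added to this manager's configuration.
            if (!configuredInstances.contains(created)) {
                configuredInstances.add(created);
            }
        }
        if (instance == null) {
            instance = name + "`";
            startLog.add(instance);
        }
        return instance;
    }

    List<String> configuredInstances() {
        return configuredInstances;
    }

    public static void main(String[] args) {
        ComponentManager a = new ComponentManager("A");
        ComponentManager b = new ComponentManager("B");
        ComponentManager c = new ComponentManager("C");
        a.dependsOn(b);
        a.dependsOn(c);
        b.dependsOn(c);

        List<String> startLog = new ArrayList<>();
        a.start(startLog);
        System.out.println(startLog);                // [C`, B`, A`]
        System.out.println(a.configuredInstances()); // [B`, C`]
        System.out.println(b.configuredInstances()); // [C`]
    }
}
```

Manager C is asked to start twice but creates C` only once, which matches the behavior described for the second request.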
With a more modular Geode, it now becomes possible to add/extend functionality. There are two types of additions to Geode that this extension mechanism can support.
A stand-alone module without any dependency on Geode
A module with dependency on events from Geode
The first option is the simplest. All that is required is to define the module, its components, and its configuration. This new module will then be added to Geode by adding it to the classpath. The AppManager will process the relevant configuration and start the corresponding Managers and Instances. A good example for this option would be the Redis adapter for Geode, which is currently experimental. The Redis adapter module is completely stand-alone and requires no events from the Geode system to function.
The second option is a little more complicated. A module or component needs to “hook into” the eventing that Geode currently provides. Current Geode events include all events received by CacheWriters, CacheListeners or AsyncQueues. This is best described using the Bootstrapping Lifecycle example: upon the creation of the instance, the manager will request the dependent manager to register/add the current instance to the configuration of the dependent manager.
This way any EventHandler, like CacheWriter/Listeners or AsyncQueues, will be able to register themselves to the region that is sending the events. A good example of this would be the integration of Lucene in Geode.
Using the image below, the steps are as follows:
The configuration is parsed by the configuration service
The ConfigurationService creates a CacheManager
The ConfigurationService creates a RegionManager with a dependency on the CacheManager
The ConfigurationService creates a LuceneManager with a dependency on the RegionManager created in step 3
As part of the creation of the LuceneManager, the LuceneManager creates a LuceneAsyncEventQueueManager
After creating the LuceneAsyncEventQueueManager (step 5), the LuceneManager creates a LuceneRegionManager
The system is requested to start. Using the steps described for bootstrapping, the LuceneRegionManager is requested to start, thereby creating the LuceneRegion(s)
The LuceneAsyncEventQueueManager then starts the LuceneAsyncEventQueue
The LuceneAsyncEventQueue is registered/added into the configuration of the RegionManager
The RegionManager creates the Region with the LuceneAsyncEventQueue configured as an AsyncEventQueue to the Region
The CacheManager creates a Cache and completes the start-up cycle.
Region sends events to LuceneAsyncEventQueue
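The registration part of these steps (a queue is added to the region manager's configuration, the region is then created with that queue attached, and events flow to it) can be sketched as below. This is a simplified illustration with hypothetical classes: `SketchRegion`, `SketchRegionManager` and `SketchLuceneManager` are not Geode types, the queue is modeled as a plain callback, and delivery is synchronous, unlike a real asynchronous event queue.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiConsumer;

// Hypothetical stand-in for a region that forwards every put to its queues.
final class SketchRegion {
    private final List<BiConsumer<String, String>> queues;

    SketchRegion(List<BiConsumer<String, String>> queues) {
        this.queues = new ArrayList<>(queues);
    }

    void put(String key, String value) {
        for (BiConsumer<String, String> queue : queues) {
            queue.accept(key, value); // synchronous here, for simplicity
        }
    }
}

// Hypothetical region manager: collects queues as configuration, creates the region on start.
final class SketchRegionManager {
    private final List<BiConsumer<String, String>> configuredQueues = new ArrayList<>();

    void addAsyncEventQueue(BiConsumer<String, String> queue) {
        configuredQueues.add(queue);
    }

    SketchRegion start() {
        return new SketchRegion(configuredQueues);
    }
}

// Hypothetical Lucene-side manager: starts its queue and registers it with the region manager.
final class SketchLuceneManager {
    private final List<String> indexedKeys = new ArrayList<>();

    void start(SketchRegionManager regionManager) {
        regionManager.addAsyncEventQueue((key, value) -> indexedKeys.add(key));
    }

    List<String> indexedKeys() {
        return indexedKeys;
    }

    public static void main(String[] args) {
        SketchRegionManager regionManager = new SketchRegionManager();
        SketchLuceneManager luceneManager = new SketchLuceneManager();

        luceneManager.start(regionManager);          // queue registered into the region configuration
        SketchRegion region = regionManager.start(); // region created with the queue attached
        region.put("k1", "v1");                      // region sends the event to the queue

        System.out.println(luceneManager.indexedKeys()); // [k1]
    }
}
```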
Service Provider Interfaces (SPIs) provide the ability for 3rd party vendors to implement services that extend functionality or replace components. Geode will also have an SPI that will be used to define and extend/implement the Geode services.
The SPI will comprise the following:
All classes relating to the service implementation
A service descriptor
*Optional* ConfigHandlers for different supported configuration mechanisms
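As an illustration of how a service descriptor can drive discovery, the sketch below uses the JDK's standard `java.util.ServiceLoader`. The proposal does not state which discovery mechanism Geode's SPI would use, so this choice is an assumption, and `GeodeServiceSketch` and `ServiceDiscoverySketch` are hypothetical names. With `ServiceLoader`, the descriptor is a text file under `META-INF/services/` named after the service interface and listing the implementing classes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.ServiceLoader;

// Hypothetical service contract that a third-party module would implement.
interface GeodeServiceSketch {
    String name();
}

final class ServiceDiscoverySketch {

    // Lists every implementation named in a META-INF/services descriptor
    // that is visible on the class path.
    static List<String> discoverServiceNames() {
        List<String> names = new ArrayList<>();
        for (GeodeServiceSketch service : ServiceLoader.load(GeodeServiceSketch.class)) {
            names.add(service.name());
        }
        return names;
    }

    public static void main(String[] args) {
        // Without a descriptor file on the class path, no providers are found.
        System.out.println(discoverServiceNames());
    }
}
```

A vendor module would then ship its implementation classes together with the descriptor, and no code in the consuming module needs to name the implementation directly.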