Overview
The goal of this proposal is to define clear lines between the modules of Geode. Each module should
- Be mockable - so that other modules can write tests that don't depend on that module
- Be tested within the module. In other words, if WAN depends on the Region module, I should not need to run any WAN tests to test all of the code in LocalRegion.
In order to accomplish these goals, each module needs to have a clear API for the rest of the system to use, and tests within that module that cover all of the features of the API. Each module should have at least its own package(s), if not be in a separate Gradle module. Anything that is not part of the API for that module should not be accessible outside of the module.
Here's a start at listing the proposed modules and their dependencies. These are proposed dependencies; there are currently many cyclic dependencies. Part of the required changes will be breaking some of these cycles by replacing hard-coded references to other modules with plugins and callbacks that are part of the well-defined API for each module.
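As a rough illustration of the "mockable" goal, the sketch below shows a consumer module that depends only on another module's API interface, plus a test double that stands in for the real implementation. All names here (SketchRegionApi, SketchReplicator, FakeRegion) are hypothetical and are not existing Geode types; this is a minimal sketch of the idea, not a proposed design.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical API of a "region" module: the only type other modules may reference.
interface SketchRegionApi {
    void put(String key, String value);

    String get(String key);
}

// Hypothetical consumer in another module. It depends on the interface only.
final class SketchReplicator {
    private final SketchRegionApi region;

    SketchReplicator(SketchRegionApi region) {
        this.region = region;
    }

    // Applies a batch of {key, value} pairs through the region API and returns the count.
    int apply(List<String[]> batch) {
        for (String[] pair : batch) {
            region.put(pair[0], pair[1]);
        }
        return batch.size();
    }
}

// In-memory test double owned by the consumer's tests; no real region is started.
final class FakeRegion implements SketchRegionApi {
    private final Map<String, String> data = new HashMap<>();
    final List<String> putKeys = new ArrayList<>();

    @Override
    public void put(String key, String value) {
        putKeys.add(key);
        data.put(key, value);
    }

    @Override
    public String get(String key) {
        return data.get(key);
    }
}

final class ModuleApiSketch {
    public static void main(String[] args) {
        FakeRegion fake = new FakeRegion();
        SketchReplicator replicator = new SketchReplicator(fake);
        int applied = replicator.apply(Arrays.asList(
            new String[] {"k1", "v1"}, new String[] {"k2", "v2"}));
        System.out.println(applied + " entries applied, keys = " + fake.putKeys);
    }
}
```

With this shape, the consumer's tests never touch the real region implementation, and the region module's own tests cover everything behind the interface.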
Packages
Client
Package: internal.cache.client
API interfaces/classes: AbstractOp, ExecutablePool
Required Changes: Operations (the client-side code for client -> server messages) for other modules should be moved to their respective packages.
ClientSubscriptions
Package: internal.cache.ha (change this?)
API interface/classes: ?
Required changes: ?
DLock
Package: internal.locks
API interface/classes: InternalDistributedLockService (new interface)
Required changes: ?
Dunit
Package: dunit
API interface/classes: DistributedTestCase, CacheTestCase
Required changes: This code should be moved into its own Gradle module.
Events
Package: internal.cache.event (new package)
API interface/classes: InternalEntryEvent, RegionEntry
Required changes: ? These events are passed everywhere, which is why it would be nice to refactor this code into a separate package that other packages can depend on.
Eviction
Package: internal.cache.lru
API interface/classes: EnableLRU, LRUClockHand (new interface)
Required changes: ?
FunctionService
Package: internal.cache.execute
API interfaces/classes: FunctionService, Execution, Function
Required changes: Region code should be refactored to not have direct dependencies on this package. For example, LocalRegion should not have function execution code in it.
Indexing
Package: query.internal.index
API interface/classes: ?
Required changes: ?
Logging
Package: internal.logging
API interface/classes: ?
Required changes: ?
Locator
Package: distributed.internal.tcpserver
API interface/classes: TcpHandler, ?
Required changes: ?
Messaging
Package: distributed.internal, distributed.internal.advisor
API interfaces/classes: InternalDistributedMember, DM, InternalDistributedSystem, MembershipListener, DistributionAdvisor, DistributionAdvisee
Required Changes: InternalDistributedMember and InternalDistributedSystem should be interfaces, not concrete classes. They should only have the methods that are required by the rest of the system. The concrete classes like DistributionManager, the old InternalDistributedSystem class, etc. should not be referenced outside this package.
TBD - The proposal here is that membership is another module that is hidden behind the messaging layer as far as the rest of the system is concerned. The membership layer has its own interface that should hide its internals from the messaging layer. Advisors are lumped in here to reduce the complexity of the high-level graph.
OffHeap
Package: internal.offheap
API interface/classes: ?
Required changes: ?
PDX
Package: pdx.internal
API interface/classes: ?
Required changes: ?
Persistence
Package: internal.cache.persistence
API interfaces/classes: InternalDiskStore, DiskStore, DiskRegionView, DiskId
Required Changes: InternalDiskStore is new interface for DiskStoreImpl.
Querying
Package: query.internal
API interface/classes: QueryService, ?
Required changes: ?
Regions
Package: internal.cache.region
API interfaces/classes: InternalRegionService, InternalRegion
Required Changes:
InternalRegion is a new interface for LocalRegion. InternalRegionService is a new interface that has the functionality from GemfireCacheImpl to manage regions. LocalRegion, etc. should not be used outside this package.
Direct references to other modules, for example LocalRegion.notifyGatewaySender, should be turned into callbacks that other modules plug into the Region interface. Those callbacks should be tested within the region module.
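One way to picture the callback approach: the region exposes a small callback interface, and a WAN-like module registers an implementation instead of being called directly by name. The types below (SketchRegionCallback, SketchRegion, SketchGatewaySender) are made up for illustration and do not mirror the real LocalRegion or GatewaySender APIs.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical callback interface that would live in the region package.
interface SketchRegionCallback {
    void afterUpdate(String key, Object newValue);
}

// Hypothetical region: it knows about callbacks, not about WAN.
final class SketchRegion {
    private final Map<String, Object> entries = new ConcurrentHashMap<>();
    private final List<SketchRegionCallback> callbacks = new CopyOnWriteArrayList<>();

    void addCallback(SketchRegionCallback callback) {
        callbacks.add(callback);
    }

    // Stores the value, then notifies every registered callback in registration order.
    void put(String key, Object value) {
        entries.put(key, value);
        for (SketchRegionCallback callback : callbacks) {
            callback.afterUpdate(key, value);
        }
    }

    Object get(String key) {
        return entries.get(key);
    }
}

// Hypothetical WAN-side class: it plugs into the region through the callback.
final class SketchGatewaySender implements SketchRegionCallback {
    final List<String> queued = new CopyOnWriteArrayList<>();

    @Override
    public void afterUpdate(String key, Object newValue) {
        queued.add(key + "=" + newValue);
    }
}

final class RegionCallbackSketch {
    public static void main(String[] args) {
        SketchRegion region = new SketchRegion();
        SketchGatewaySender sender = new SketchGatewaySender();
        region.addCallback(sender);
        region.put("a", 1);
        System.out.println("queued for WAN: " + sender.queued);
    }
}
```

The region module can then test callback delivery with any trivial callback, without loading WAN code.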
This is still the big ball of string that needs to get untangled further. We need to split out expiration, conflict detection, GII, transactions, partitioning, etc.
ResourceManager
Package: internal.cache.control
API interface/classes: InternalResourceManager, MemoryEvent, ResourceEvent, ResourceListener
Required changes: ? Move rebalancing related classes to the region package?
Serialization
Package: internal.serialization (new package)
API interfaces/classes: InternalDataSerializer (interface?), DataSerializableFixedID
Required Changes: ?
Server
Package: internal.cache.tier.sockets (change this?)
API interfaces/classes: CacheServer, CommandInitializer, BaseCommand
Required Changes: Commands (the server-side code for a client -> server message) for other modules (e.g. WAN) should be moved to their respective packages and registered with CommandInitializer.
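The registration idea could look roughly like the sketch below. SketchCommandRegistry is a hypothetical stand-in and does not reproduce CommandInitializer's actual signature; the message type constant is also made up. The point is only that the server package owns the registry, while each module owns and registers its own commands.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical command abstraction owned by the server module.
interface SketchCommand {
    String execute(String payload);
}

// Hypothetical registry standing in for the role CommandInitializer would play.
final class SketchCommandRegistry {
    private final Map<Integer, SketchCommand> commands = new ConcurrentHashMap<>();

    // Registers a command for a message type; duplicate registrations are rejected.
    void register(int messageType, SketchCommand command) {
        SketchCommand previous = commands.putIfAbsent(messageType, command);
        if (previous != null) {
            throw new IllegalStateException("message type already registered: " + messageType);
        }
    }

    // Looks up the command for the message type and runs it.
    String dispatch(int messageType, String payload) {
        SketchCommand command = commands.get(messageType);
        if (command == null) {
            throw new IllegalArgumentException("no command for message type " + messageType);
        }
        return command.execute(payload);
    }
}

// Hypothetical class in the WAN package: WAN owns its commands and registers them itself.
final class SketchWanCommands {
    static final int GATEWAY_BATCH = 1001; // made-up message type, for illustration only

    static void registerAll(SketchCommandRegistry registry) {
        registry.register(GATEWAY_BATCH, payload -> "wan-batch:" + payload);
    }
}

final class CommandRegistrySketch {
    public static void main(String[] args) {
        SketchCommandRegistry registry = new SketchCommandRegistry();
        SketchWanCommands.registerAll(registry);
        System.out.println(registry.dispatch(SketchWanCommands.GATEWAY_BATCH, "events"));
    }
}
```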
Snapshots
Package: internal.cache.snapshot
API interface/classes: SnapshotService
Required changes:
Statistics
Package: internal.statistics (new package)
API interfaces/classes: Statistics, StatisticsFactory, StatisticsManager
Required changes: Move into a separate package. Pull the code out of InternalDistributedSystem (it currently implements StatisticsFactory) into a separate class.
Versioning
Package: internal.cache.versions
API interface/classes: RegionVersionVector (new interface), VersionTag, VersionStamp
Required changes: ?
WAN
Package: internal.cache.wan
API interfaces/classes: AsyncEventQueue, GatewaySender
Required Changes: Region code should be refactored to not have direct dependencies on this package. For example, AsyncEventQueues should be notified through a listener installed on the region. The listener interface will be part of the region package.
Questions/Issues
What to do with GemfireCacheImpl, Cache?
The Cache interface currently has dependencies on almost all of the modules of Geode because it has methods like getQueryService and getGatewaySenders (WAN). Unfortunately, Cache, InternalCache, or GemfireCacheImpl is used as a context object that is also passed to almost all modules of Geode.
We need to rework how we inject dependencies into all of these modules. If we want a context object, it should be something generic that does not pull in dependencies on all other services, something like Spring's BeanFactory. But it might be better if the specific dependencies for each module were passed into that module.
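To make the second option concrete, here is a minimal sketch of passing a module only the specific collaborators it needs, via its constructor, rather than handing it a cache-wide context object. The types (SketchLockService, SketchStats, SketchBackupService) are hypothetical and chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical narrow interfaces exposed by two other modules.
interface SketchLockService {
    boolean lock(String name);
}

interface SketchStats {
    void increment(String statName);
}

// Hypothetical module class: its constructor lists exactly what it depends on,
// so it never needs a reference to the whole cache.
final class SketchBackupService {
    private final SketchLockService locks;
    private final SketchStats stats;

    SketchBackupService(SketchLockService locks, SketchStats stats) {
        this.locks = locks;
        this.stats = stats;
    }

    // Starts a backup only if the named lock can be taken; records a stat on success.
    boolean backup(String storeName) {
        if (!locks.lock("backup:" + storeName)) {
            return false;
        }
        stats.increment("backupsStarted");
        return true;
    }
}

final class ConstructorInjectionSketch {
    public static void main(String[] args) {
        List<String> recorded = new ArrayList<>();
        SketchBackupService service = new SketchBackupService(name -> true, recorded::add);
        System.out.println("started = " + service.backup("disk1") + ", stats = " + recorded);
    }
}
```

Because both dependencies are small interfaces, a test can supply lambdas for them, as the main method does, and a DI container could wire the same constructor in production.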
Modularity in the public API
We've already started creating a few separate modules at the external level - for example the lucene integration or the auto rebalancer. We need to nail down how these extensions are accessed by the user. One option might be to add a method to Cache like Cache.getService(Class<T> serviceInterface). Maybe we should remove methods like Cache.getQueryService, Cache.getGatewaySenders, etc. in favor of not hardcoding all of the services on Cache?
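A lookup of the Cache.getService(Class<T> serviceInterface) kind could be backed by something like the registry sketched below. SketchServiceRegistry and SketchLuceneService are hypothetical names; how services would actually be discovered and registered (for example at startup, or by the extension itself) is left open here.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical type-safe registry that a Cache implementation could delegate to.
final class SketchServiceRegistry {
    private final Map<Class<?>, Object> services = new ConcurrentHashMap<>();

    // Registers an implementation under its service interface.
    <T> void register(Class<T> serviceInterface, T implementation) {
        services.put(serviceInterface, serviceInterface.cast(implementation));
    }

    // Returns the registered service, or null if none was registered for the interface.
    <T> T getService(Class<T> serviceInterface) {
        return serviceInterface.cast(services.get(serviceInterface));
    }
}

// Hypothetical extension service; the registry has no compile-time knowledge of it.
interface SketchLuceneService {
    String describe();
}

final class ServiceLookupSketch {
    public static void main(String[] args) {
        SketchServiceRegistry registry = new SketchServiceRegistry();
        registry.register(SketchLuceneService.class, () -> "lucene integration");
        SketchLuceneService lucene = registry.getService(SketchLuceneService.class);
        System.out.println(lucene.describe());
    }
}
```

The cache itself then needs no hard-coded accessor per service, which is the trade-off the question above is weighing against methods like Cache.getQueryService.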
How to enforce dependencies/interface
Probably not every package listed here should be its own Gradle module. How will we enforce the dependencies and the use of the package interfaces?
Package naming scheme
We still have a mix of two different conventions for where to put internal classes. Some things are in gemfire.internal.cache, and some things are in packages like cache.asyncqueue.internal. We should settle on one convention.
Interface and concrete class naming scheme
We seem to have a few conventions that are sometimes used. We should agree on what conventions we want to stick to.
- Having a public interface Cache and an internal interface InternalCache.
- Naming the implementation of an interface *Impl
2 Comments
Avinash
+1 for this effort.
For injection, something like https://github.com/google/guice makes sense.
Are there any plans to make Region and its internal storage pluggable? Currently Geode stores everything in CustomEntryConcurrentHashMap.
Maybe I want to store the data in a bucket sorted, which I cannot do right now. In order to change this map to, say, ConcurrentSkipListMap, I would have to go through a lot of code changes in the core.
Dan Smith
Hi Avinash Lakshman
This proposal is talking about somewhat bigger pieces than that level. But having an extension point to override the underlying map used by the region sounds like a good idea.
Using a DI container also sounds like a good idea for coupling all of these components together.