API Changes
Jags proposed a new no-arg createClientRegionFactory method on ClientCache.
This was to allow easy creation of proxy regions on a client like so:
Region r = c.createClientRegionFactory().create("customers");
We could do this, but an even easier way to create a proxy region would be:
Region r = c.getRegion("customers");
Currently getRegion on a ClientCache returns null if the region has not already been created on the client. We would change it to return null only if the region does not exist on any of the default pool's servers. To support multiple pools we would add a getRegion(String, Pool) method to ClientCache.
If a path is given that names multiple regions, for example "root/a/b", then any of those regions that do not exist on the client will be created as proxy regions on the client, as long as they exist on the server.
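The path handling described above can be sketched with a small in-memory model. This is only an illustration under stated assumptions: ProxyLookupSketch, its string-valued "regions", and the set of server paths are hypothetical stand-ins, not the real ClientCache or Region types.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for a client cache; this is not the real ClientCache API.
class ProxyLookupSketch {
    // Region paths assumed to exist on the default pool's servers.
    private final Set<String> serverRegions;
    // Regions known to the client: path -> "local" or "proxy".
    private final Map<String, String> clientRegions = new LinkedHashMap<>();

    ProxyLookupSketch(Set<String> serverRegions) {
        this.serverRegions = new HashSet<>(serverRegions);
    }

    // Registers a region that was defined locally on the client.
    void defineLocal(String path) {
        clientRegions.put(path, "local");
    }

    // Walks the path one element at a time. Elements already on the client are
    // reused; elements that exist only on the server become proxies. Returns
    // null as soon as an element exists in neither place (proxies created for
    // earlier elements are kept in this sketch).
    String getRegion(String path) {
        String current = "";
        for (String name : path.split("/")) {
            if (name.isEmpty()) {
                continue; // tolerate a leading or doubled separator
            }
            current = current.isEmpty() ? name : current + "/" + name;
            if (clientRegions.containsKey(current)) {
                continue;
            }
            if (!serverRegions.contains(current)) {
                return null;
            }
            clientRegions.put(current, "proxy");
        }
        return clientRegions.get(current);
    }

    Set<String> clientPaths() {
        return clientRegions.keySet();
    }

    public static void main(String[] args) {
        ProxyLookupSketch cache = new ProxyLookupSketch(
                new HashSet<>(Arrays.asList("root", "root/a", "root/a/b")));
        System.out.println(cache.getRegion("root/a/b"));
        System.out.println(cache.clientPaths());
    }
}
```

Note that a locally defined region is returned as-is, which is the ambiguity between "return a local reference" and "implicitly create proxies" that this proposal has to resolve.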
Since getRegion(String) is an existing method, if we felt we needed to maintain strict backwards compatibility we could instead add new methods named findRegion(String) and findRegion(String, Pool). But I think changing getRegion to check the server side state is consistent with our other changes.
However, another thing to think about is that getRegion will also find locally defined regions and return those. So you end up with it either returning a reference to a local region OR implicitly creating one or more proxy regions. So I think it might be better to leave getRegion alone and add a new createProxy(String) and createProxy(String, Pool).
It might be better to allow this method to create multiple proxies with the same name from the same client. This would give multi-threaded clients isolation from each other. But maybe they should just be able to each have their own cache. Also, it wouldn't need to implicitly create all the parent regions. It could instead just create an orphan (i.e. a proxy that has no parent in the client), and if you ever ask for the parent region it would at that time create a new proxy for the server's parent region. This would also solve an old problem of how to allow clients that do have regions that cache state locally to bypass that state and do something on the server. They could just call createProxy, do the op, and then throw that proxy away.
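A minimal sketch of the orphan idea, assuming a hypothetical Proxy class that only remembers its path (no server communication is modeled). Each createProxy call returns a fresh, independent object, and the parent proxy is built only when it is first asked for.

```java
// Hypothetical model of createProxy(String); none of these types are real GemFire classes.
class OrphanProxySketch {
    static final class Proxy {
        final String path;
        private Proxy parent; // stays null until getParentRegion() is first called

        Proxy(String path) {
            this.path = path;
        }

        // Lazily creates a proxy for the parent path; root regions have no parent.
        Proxy getParentRegion() {
            int slash = path.lastIndexOf('/');
            if (slash <= 0) {
                return null;
            }
            if (parent == null) {
                parent = new Proxy(path.substring(0, slash));
            }
            return parent;
        }
    }

    // Always returns a new proxy, so two threads never share one by accident.
    static Proxy createProxy(String path) {
        return new Proxy(path);
    }

    public static void main(String[] args) {
        Proxy first = createProxy("root/a");
        Proxy second = createProxy("root/a");
        System.out.println(first != second);
        System.out.println(first.getParentRegion().path);
    }
}
```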
Jags also proposed adding a get method to ClientRegionFactory. I think it would be better to just change the already existing create method to do what he wants the get method to do which is:
This operation will always succeed if the region "customers" is defined on at least one of the servers in the distributed system. It will not matter if the region is available on the server that the client connects to. The server will automatically proxy all requests to the appropriate server(s).
This will be a change for our current clients. They can currently create a client region that does not exist on the server (yet). This change would require that the servers are up and their regions are available before the client starts creating its regions.
However, Jags also proposed that a create on a client would automatically create the region on the server. I think this is going too far, and we should provide a different way to create regions from a client on the server side.
We should also consider what "rootRegions" should do. Should we have a rootProxies method? This would give a way for clients to discover what regions exist on the server. Once they get the root proxies they can just call getSubregions and it would create proxies. We need to discuss how many (if any) of our existing methods should implicitly create a proxy on a client.
The methods that could do this are:
- RegionService#getRegion
- RegionService#rootRegions
- Region#getParentRegion
- Region#getSubregion
- Region#subregions
We will deprecate:
- Region#keySetOnServer
- Region#containsKeyOnServer
subregion creation
Currently no API exists that lets you create a subregion using ClientRegionFactory or RegionFactory.
The createSubregion methods on Region all take the old RegionAttributes which is created using the deprecated AttributesFactory.
Perhaps the best way to support this is by enhancing the create method on ClientRegionFactory and RegionFactory. Currently it only supports a simple name, but this could be extended to also support a region path. Internally we would use that path (minus the last element) to look up the existing parent region and then call createSubregion on it. Another alternative is to have a new flavor of create that takes both a Region and a String and creates a subregion on the given Region. This second alternative is best since it makes clear that the parent region must exist.
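Both alternatives can be compared in a small sketch. PathCreateSketch and its Node type are hypothetical stand-ins for a region factory and Region, and IllegalStateException stands in for whatever exception the real API would throw.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of a factory whose create method accepts a region path.
class PathCreateSketch {
    static final class Node {
        final String name;
        final Map<String, Node> children = new HashMap<>();

        Node(String name) {
            this.name = name;
        }

        Node createSubregion(String childName) {
            if (children.containsKey(childName)) {
                throw new IllegalStateException("region already exists: " + childName);
            }
            Node child = new Node(childName);
            children.put(childName, child);
            return child;
        }
    }

    // Unnamed holder for root regions so roots and subregions share one code path.
    private final Node top = new Node("");

    // First alternative: create(String) takes a path. Everything except the
    // last element must name an existing region; the last element is created.
    Node create(String path) {
        String[] parts = path.split("/");
        Node parent = top;
        for (int i = 0; i < parts.length - 1; i++) {
            parent = parent.children.get(parts[i]);
            if (parent == null) {
                throw new IllegalStateException("parent region does not exist: " + parts[i]);
            }
        }
        return parent.createSubregion(parts[parts.length - 1]);
    }

    // Second alternative: the caller must already hold the parent region.
    Node create(Node parent, String name) {
        return parent.createSubregion(name);
    }

    public static void main(String[] args) {
        PathCreateSketch factory = new PathCreateSketch();
        Node root = factory.create("root");
        Node a = factory.create("root/a");
        Node b = factory.create(a, "b");
        System.out.println(root.name + "/" + a.name + "/" + b.name);
    }
}
```

With the second form a missing parent cannot happen at call time, because the caller has to obtain the parent region first; with the path form it surfaces as a runtime failure.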
creating regions on a server
GFSH and the JMX APIs will support creating regions on the server. We could also support this feature from a client using some non-management APIs. However it might be better not to; we could instead force users to do this type of thing from the command line.
If we do support it using client APIs then we could add a createServerRegionFactory(RegionShortcut) method to ClientCache. It would just return a RegionFactory. We should have another flavor of this method that takes the name of the server group(s) to create the region on. The only tricky thing I can think of here is how we would handle plug-ins like CacheListener. The plug-ins will need to be serializable in this context. We might even need to be able to "serialize" them to gfsh format if we want to store our configuration in gfsh syntax.
Also, will clients want to be able to create other things on the server, such as a disk-store, a gateway, or a function? The nice thing about supporting regions is that we could deprecate DynamicRegionFactory (but gfsh might also allow us to do that). Perhaps we could give a client an API that executes a gfsh command by sending it to the server over the client's connection.
DynamicRegionFactory allows the region creation to go from a client to a server back to another client and over a gateway to another ds. Do we need similar support? Can gfsh region creates (and config changes) flow across a gateway? Can they flow to clients?
Behavior Changes
Proxy regions currently perform operations locally. This will be changed to always send the operations to the server. This will change the following operations:
- Region#getParentRegion maybe
- Region#getAttributes maybe
- Region#getAttributesMutator maybe
- Region#getStatistics maybe
- Region#localInvalidateRegion noop
- Region#localDestroyRegion noop
- Region#close maybe
- Region#saveSnapshot this might already work
- Region#loadSnapshot this might already work
- Region#getSubregion maybe
- Region#createSubregion maybe
- Region#subregions maybe
- Region#getEntry
- Region#create (will now fail if it exists on server)
- Region#put (will now return old value)
- Region#localInvalidate noop
- Region#localDestroy noop
- Region#keySet
- Region#values
- Region#entrySet
- Region#getUserAttribute maybe
- Region#setUserAttribute maybe
- Region#isDestroyed
- Region#containsValueForKey
- Region#containsKey
- Region#getRegionDistributedLock maybe
- Region#getDistributedLock maybe
- Region#becomeLockGrantor
- Region#localClear noop
- Region#containsValue
- Region#isEmpty
- Region#size
- Region#putIfAbsent
- Region#remove(key, value)
- Region#replace
We should discuss if the "local" methods are of any use to a client since it is unclear
what server they will be performed on.
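A few of the listed changes can be illustrated with a proxy that holds no local state and delegates every operation. This is a sketch under assumptions: ServerBackedProxySketch is hypothetical, a plain Map stands in for the server-side region, and IllegalStateException stands in for the real "entry exists" failure.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical proxy region: every operation goes to the "server" map.
class ServerBackedProxySketch<K, V> {
    private final Map<K, V> server; // stands in for the server-side region

    ServerBackedProxySketch(Map<K, V> server) {
        this.server = server;
    }

    // put now returns the old value held on the server.
    V put(K key, V value) {
        return server.put(key, value);
    }

    // create now fails if the entry already exists on the server.
    void create(K key, V value) {
        if (server.putIfAbsent(key, value) != null) {
            throw new IllegalStateException("entry already exists on server: " + key);
        }
    }

    // Read-style operations reflect server state instead of empty local state.
    boolean containsKey(K key) {
        return server.containsKey(key);
    }

    int size() {
        return server.size();
    }

    // "local" operations are no-ops because the proxy keeps nothing locally.
    void localInvalidate(K key) {
    }

    void localDestroy(K key) {
    }

    public static void main(String[] args) {
        Map<String, String> serverRegion = new ConcurrentHashMap<>();
        ServerBackedProxySketch<String, String> customers =
                new ServerBackedProxySketch<>(serverRegion);
        customers.put("k1", "v1");
        System.out.println(customers.put("k1", "v2"));
        System.out.println(customers.size());
    }
}
```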
Currently operations that are sent to a server must go to the correct server that owns a resource. This will be changed so that the server that a client sends an operation to will automatically forward it to its peer (who may not even have a cache server running) that owns the resource. This will be done for all region ops and queries.
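The forwarding behavior can be sketched as follows. Everything here is hypothetical: Member models a distributed system member, hosted models the regions it owns, and a direct method call stands in for the peer-to-peer message.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of a server forwarding a client operation to the owning peer.
class ForwardingSketch {
    static final class Member {
        final String name;
        final boolean hasCacheServer; // only such members accept client connections
        final Map<String, Map<String, String>> hosted = new HashMap<>();
        final List<Member> peers = new ArrayList<>();

        Member(String name, boolean hasCacheServer) {
            this.name = name;
            this.hasCacheServer = hasCacheServer;
        }

        void host(String region) {
            hosted.put(region, new HashMap<>());
        }

        // Entry point for a client put: handle it locally if this member owns
        // the region, otherwise forward it to the peer that does.
        String put(String region, String key, String value) {
            if (!hasCacheServer) {
                throw new IllegalStateException(name + " does not accept client connections");
            }
            Member owner = findOwner(region);
            if (owner == null) {
                throw new IllegalStateException("no member hosts region " + region);
            }
            return owner.hosted.get(region).put(key, value);
        }

        private Member findOwner(String region) {
            if (hosted.containsKey(region)) {
                return this;
            }
            for (Member peer : peers) {
                if (peer.hosted.containsKey(region)) {
                    return peer;
                }
            }
            return null;
        }
    }

    public static void main(String[] args) {
        Member server = new Member("server1", true);
        Member peer = new Member("peer1", false); // owns the data, no cache server
        peer.host("customers");
        server.peers.add(peer);
        server.put("customers", "k", "v1");
        System.out.println(peer.hosted.get("customers").get("k"));
    }
}
```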
Notes
DynamicRegionFactory allows the region creation to go from a client to a server back to another client and over a gateway to another ds. Do we need similar support? Can gfsh region creates (and config changes) flow across a gateway? Can they flow to clients?
We should never automatically create regions on clients and across gateways. What we need is an API to create a region on a server group from a client.
Additionally, we ought to consider having an API that allows a client or a gateway endpoint to subscribe to "control events" happening in a server.
A control event could be a membership change in the system, it could be a schema update (region creation message for example or a region destruction message for that matter). Subscribers will receive those events and be able to attach a callback that allows them to process the event. Control events that I can think of off the top of my head include
1> New WAN site has come online/WAN site is unreachable
2> Region creation/ region destruction
3> Peer joined/ Peer left
4> Client joined/ Client left
5> Required role available/ Required role unavailable
6> Locators available/ System running without any locators.
7> User configurable elements (like exchange connections or feeds being available in the system or not)
8> WBCL endpoint available/unavailable
A client or a peer can subscribe to these changes for a single DS or multiple DSes, where each DS would be identified by a set of locators/DS name.
Having a framework for these kinds of control messages would allow customers to programmatically switch to alternate mechanisms when they are using GemFire as a look-aside cache, activate backup modes, or throttle their clients to keep the system running at degraded QoS.
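One possible shape for such a subscription API is sketched below. All names are hypothetical (ControlEventSketch, Type, Event), only a subset of the event kinds listed above is modeled, and delivery is a synchronous in-process call rather than a message from a server.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical control-event bus; not an existing GemFire API.
class ControlEventSketch {
    enum Type {
        WAN_SITE_ONLINE, WAN_SITE_UNREACHABLE,
        REGION_CREATED, REGION_DESTROYED,
        PEER_JOINED, PEER_LEFT,
        CLIENT_JOINED, CLIENT_LEFT
    }

    static final class Event {
        final Type type;
        final String dsName; // identifies the distributed system the event came from
        final String detail; // e.g. a region name or member id

        Event(Type type, String dsName, String detail) {
            this.type = type;
            this.dsName = dsName;
            this.detail = detail;
        }
    }

    private final Map<Type, List<Consumer<Event>>> subscribers = new EnumMap<>(Type.class);

    // Attaches a callback that will be invoked for every event of the given type.
    void subscribe(Type type, Consumer<Event> callback) {
        subscribers.computeIfAbsent(type, t -> new ArrayList<>()).add(callback);
    }

    // Delivers the event to the callbacks registered for its type.
    void publish(Event event) {
        List<Consumer<Event>> callbacks =
                subscribers.getOrDefault(event.type, Collections.<Consumer<Event>>emptyList());
        for (Consumer<Event> callback : callbacks) {
            callback.accept(event);
        }
    }

    public static void main(String[] args) {
        ControlEventSketch bus = new ControlEventSketch();
        bus.subscribe(Type.REGION_CREATED,
                e -> System.out.println(e.dsName + " created region " + e.detail));
        bus.publish(new Event(Type.REGION_CREATED, "ds1", "customers"));
        bus.publish(new Event(Type.PEER_LEFT, "ds1", "peer7")); // no subscriber, ignored
    }
}
```

A callback registered this way is where a client could switch to a backup mode or throttle itself when, for example, a WAN site becomes unreachable.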
When you consider making changes to Proxy regions so that events are sent to the server, please work with Vishal and Hitesh to understand how the C#/C++ API currently does this, because we have the ability to provide a local view of the region, which then makes it very clear that the operations do not leave the process. By default, we operate in "tethered" mode. The local view gives us the ability to run the product untethered from the rest of the system.