
Overview

Apache Geode provides the following two persistence models:

  1. Persistence on local disk (shared-nothing architecture: each member writes to its own disk).
  2. Persistence on HDFS (removed from the current develop branch).

However, the current Apache Geode architecture has no support for plugging in other custom storage to persist cold data while keeping operational data in memory.
There are no interfaces you can implement so that your store is used for persisting and retrieving data.
We are proposing a pluggable storage framework in which you can attach any kind of storage and read/write in any format suited to that storage.
Consistency and availability are managed by the attached store, not by Apache Geode.

Proposal Details:

We are proposing a pluggable storage interface that any application can implement to push data to, and retrieve data from, its store.
The major interfaces we are adding for pluggability are:

  1. CacheStoreManager: Manages all cache stores. Creation, removal, and modification of cache stores is done through the CacheStoreManager.
  2. CacheStore: The store implementation itself, which persists and retrieves region data.
  3. CacheStoreFactory: Used for creating a cache store.
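The code listings for these interfaces were attached as code macros that are missing from this page. Purely as an illustration of the shape such an API might take, here is a hypothetical Java sketch; every type name, method, and signature below is an assumption, not the proposed interface:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

// Illustrative sketch only: the real proposal's signatures were in code macros
// missing from this page. All names and signatures below are assumptions.
public class CacheStoreSketch {
    interface CacheStoreConfig {                  // store-specific settings,
        Properties getProperties();               // e.g. loaded from hbase-site.xml
    }

    interface CacheStore {
        void initialize(CacheStoreConfig config); // called on each member at startup
        void persist(Object key, Object value);   // write-behind path (via the AEQ)
        Object get(Object key);                   // read-through on a cache miss
        void destroy();                           // called when the store is removed
    }

    interface CacheStoreFactory {                 // used for creating a cache store
        CacheStore create(String name, CacheStoreConfig config);
    }

    interface CacheStoreManager {                 // create/alter/remove cache stores
        CacheStore create(String name, CacheStoreFactory factory, CacheStoreConfig config);
        void alter(String name, CacheStoreConfig newConfig);
        void remove(String name);
    }

    // Tiny in-memory demo implementation to show the lifecycle:
    public static void main(String[] args) {
        Map<Object, Object> backing = new HashMap<>();
        CacheStore store = new CacheStore() {
            public void initialize(CacheStoreConfig c) { }
            public void persist(Object k, Object v) { backing.put(k, v); }
            public Object get(Object k) { return backing.get(k); }
            public void destroy() { backing.clear(); }
        };
        store.persist("key1", "value1");
        System.out.println(store.get("key1")); // value1
    }
}
```

A store implementation would implement CacheStore (plus a matching CacheStoreFactory), while the CacheStoreManager owns the store lifecycle (create/alter/remove).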

 

Usage:

A cache store is created and then attached to a region.
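The original "CacheStore creation" example did not survive on this page. The following hypothetical Java sketch shows what creating a store and attaching it to a region might look like; DemoStoreFactory, RegionFactory, and setCacheStore are invented stand-ins for illustration, not actual Geode API:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical usage sketch. The real attachment API is not shown on this page;
// the factory and region types here are minimal stand-ins for illustration.
public class CacheStoreUsageSketch {
    interface CacheStore { void persist(Object k, Object v); Object get(Object k); }

    static class DemoStoreFactory {               // stands in for CacheStoreFactory
        CacheStore create(String name) {
            Map<Object, Object> backing = new HashMap<>();
            return new CacheStore() {
                public void persist(Object k, Object v) { backing.put(k, v); }
                public Object get(Object k) { return backing.get(k); }
            };
        }
    }

    static class RegionFactory {                  // stands in for Geode's RegionFactory
        CacheStore store;
        RegionFactory setCacheStore(CacheStore s) { // invented attachment method
            store = s;
            return this;
        }
    }

    public static void main(String[] args) {
        CacheStore store = new DemoStoreFactory().create("demoStore");
        RegionFactory rf = new RegionFactory().setCacheStore(store);
        rf.store.persist("k", "v");
        System.out.println(rf.store.get("k")); // v
    }
}
```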

 

Implementation details:

CacheStore pluggability is achieved using Geode's read-through and write-behind concepts.
In Geode you can create an AsyncEventListener and attach it to a region; we use the same AsyncEventQueue framework to persist data to the cache store.
For retrieval, if there is a miss in the cache, the value is fetched from the cache store.
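The put/get flow described above can be simulated in plain Java. Here the queue and region are simplified stand-ins for Geode's AsyncEventQueue and partitioned region, and CacheStore is the hypothetical interface from this proposal:

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Plain-Java simulation of write-behind + read-through; all types here are
// simplified stand-ins, not Geode classes.
public class ReadThroughWriteBehind {
    interface CacheStore { void persist(Object k, Object v); Object fetch(Object k); }

    static class MapStore implements CacheStore {
        final Map<Object, Object> backing = new HashMap<>();
        public void persist(Object k, Object v) { backing.put(k, v); }
        public Object fetch(Object k) { return backing.get(k); }
    }

    static class Region {
        final Map<Object, Object> memory = new HashMap<>();
        final Deque<Object[]> asyncQueue = new ArrayDeque<>(); // stands in for the AEQ
        final CacheStore store;
        Region(CacheStore s) { store = s; }

        void put(Object k, Object v) {
            memory.put(k, v);
            asyncQueue.add(new Object[]{k, v}); // enqueued for write-behind
        }
        void drainQueue() { // batch delivered to the listener -> cache store
            for (Object[] e; (e = asyncQueue.poll()) != null; ) store.persist(e[0], e[1]);
        }
        Object get(Object k) {
            Object v = memory.get(k);
            if (v == null) v = store.fetch(k); // read-through on a cache miss
            return v;
        }
    }

    public static void main(String[] args) {
        Region r = new Region(new MapStore());
        r.put("k", "v");
        r.drainQueue();
        r.memory.remove("k");           // simulate eviction of the in-memory copy
        System.out.println(r.get("k")); // served from the cache store: v
    }
}
```

On put, the entry lands in memory and is queued; draining the queue delivers it to the store (write-behind). On get, a miss in memory falls through to the store (read-through).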

Sequence diagram for put/get with an attached CacheStore

Limitations:

  1. CacheStores are supported only for partitioned regions.

 

 


12 Comments

  1. For some reason, I can't see the sequence diagram... Does anyone have similar issues...

    1. Seems like the sequence diagram is linking to a file on the amlpool wiki?

  2. I am assuming that one could configure a region with both disk-store (persistence/overflow) plus cache-store; data in Geode for real-time analytics and in cache-store for batch analytics (or for analytics for large data). Is that right?

    If the region is configured with a cache-store, will the data (key/value) be stored both in the region (memory) and the cache-store, or only in the cache-store? If it's stored in both the region and the cache-store, why is CacheStore.get() needed?

    Can multiple cache-stores be attached to a Region (data replication to both HBase, Cassandra, others...)?

    When is CacheStore.alter() called/invoked? I am thinking in line with CacheStore.destroy(), which I am assuming is called when the Region is destroyed...

    Is region operation blocked when CacheStore.initialize() is in progress?

    What happens to the events in the AsyncEventListener when a bucket-region fails over (secondary bucket becoming primary); does the cache-store know about duplicate events?

    What happens to the region entry when CacheStore.persist() throws an exception...

    Do you see any requirement to support data co-location between GemFire nodes and CacheStore nodes (to be on same VM)...

    Looking at the API, I am assuming OQL query and indexing are not supported with cache-store regions...

    If data changes in cache-store (e.g. hbase), how is it reflected in Geode region (not a requirement?)...

    Instead of create/destroy cache-store, will it be more meaningful to have attach/detach cache-store? As cache-stores (HBase) are already setup and running...

     

     

     

  3. > I am assuming that one could configure a region with both disk-store (persistence/overflow) plus cache-store; data in Geode for real-time analytics and in cache-store for batch analytics (or for analytics for large data). Is that right?

    If we consider the possibility of write-only cache stores (stores used not for recovery but for export of data, for batch analytics or some other purpose), then yes, that mode of operation would make sense. In this case changes would have to be made to support multiple simultaneous stores for a region. Write-only cache stores and multiple stores per region are questions we should answer in scoping the requirements (will add them above).


    > If region is configured with cache-store; will the data (key/value) be stored both in region (memory) and cache-store or only in cache-store? If its stored both in region and cache-store, why is CacheStore.get() needed?

    The cache store operates as a persistent store in Geode today and was intended to expand the possible store types (as your question above indicates, this does open up possible new use cases). Given this, the key-value was assumed to always be stored in the persistent store in the configured manner (synchronous or asynchronous). This was based on the requirement that the persistent store hold the data necessary to re-create the region in the base use case. However, if multiple stores or write-only stores are accepted as requirements, it opens up the more general question of cache store filters that could filter not only on keys but also on some criteria over the key and/or value.

     

    > Can multiple cache-store be attached to Region (Data replication to both HBase, Cassandra, others...)?

    We did not consider this requirement initially but will add it to the open questions.

     

    > When is CacheStore.alter() called/invoked? I am thinking in line with CacheStore.destroy(), which I am assuming is called when the Region is destroyed...

    Shankar Hundekar would you answer this question?

     

    > Is region operation blocked when CacheStore.initialize() is in progress?

    Generally we would expect this to be set during region creation, and hence yes; if adding cache stores during region operation becomes a requirement, the behavior would have to be considered, but we are currently assuming cache store addition requires a restart. If we decide to allow multiple stores, adding stores (or at least write-only stores) dynamically may be desirable, but we did not consider it an initial requirement.

     

    > What happens to the events in the AsyncEventListener when a bucket-region fails over (secondary bucket becoming primary); does the cache-store know about duplicate events?

    In the typical Geode use case, non-primary buckets also write to persistence to maintain copies on disk, so this works currently; there are no duplicate events per se. The case where this becomes a consideration is when you are writing to a distributed cache store, say HDFS. In that case only the primary writes, as the distributed store takes care of replication, and non-primaries do not write. There are two assumptions here for the transition to a new primary: (1) events should not be missed; (2) a few extra copies (due to multiple writes to the distributed store for an event) during transition are tolerable. In this case a last-N-entries queue could be maintained and used to initialize the async write queue on transition.

     

    > What happens to the region entry when CacheStore.persist() throws an exception...

    Good point; we were assuming basically the same behavior as in Geode when a write to the persistent store fails, but in some cases the behavior may depend on the purpose of the cache store. It makes sense to possibly make this behavior configurable (AFAIK it is not currently, but I did not check the docs, and Geode always surprises me with its flexibility (wink)).

     

    > Do you see any requirement to support data co-location between GemFire nodes and CacheStore nodes (to be on same VM)...

    Originally, no. How the actual cache store is deployed was considered beyond scope and dependent on user performance, scalability, and system requirements. One basic requirement is that for a multi-copy distributed type cache store (e.g. HDFS) only one copy is written to the store (no N X M duplication). Distributed cache stores should expose information on their configured replication factor. 

     

    > Looking at the API, i am assuming OQL query and indexing is not supported with Cache-store regions..

    This was not assumed for the initial implementation, though we are now considering scan-with-filter support that could provide a base for such functionality. In addition, some cache stores may natively support a query interface and could support OQL. This is desirable to support, though perhaps not in the first cut.

     

    > If data changes in cache-store (e.g. hbase), how is it reflected in Geode region (not a requirement?)...

    Some cache stores may support data change notifications, and an interface to register for these could be provided. Will add to the requirements open questions.

     

    > Instead of create/destroy cache-store, will it be more meaningful to have attach/detach cache-store? As cache-stores (HBase) are already setup and running...

    Attach/detach makes sense, as long as it is consistent with other Geode naming practices.

     

    Great comments, thanks!

    We can keep an open-questions list on the requirements and then, when it has stabilized, get a sense from the community on what the scope of the initial implementation should be.

     

  4. Thanks anilkumar for the review comments and Robert Geiger for detailed explanation.

    > When is CacheStore.alter() called/invoked? I am thinking in line with CacheStore.destroy(), which I am assuming is called when the Region is destroyed...

    Cache store alter is provided to change the cache store configuration, either through a gfsh command or programmatically via the CacheStoreManager.
  5. I have created a ticket to support eviction/expiration operations with AsyncEventQueue... This may be needed for the pluggable cache-store.

    GEODE-1209 - Support/Propagating eviction and expiration operation with AsycEventQueue. Resolved

     

  6. We should also consider supporting batch gets/updates; if the external cache-store supports batch operations, this will help take advantage of that...

    Similar to what we have:

    getAll();

    putAll();

    removeAll();

    The APIs take Region as a parameter; is there any reason to pass Region, or can we just pass the region name? This would reduce contention from taking region locks in cache-store APIs.

     

    1. Yes, we are also thinking of adding batching at the cache store level. We are passing the region instance because we may need to access some region attributes in the CacheStore.

  7. This seems like a simpler API for configuring Geode as a cache wrapping a separate data source, compared to using a combination of CacheWriter/AEQ and CacheLoader. Is this API trying to satisfy any use cases that cannot be solved using a combination of writer and loader? It sounds like maybe not for the first cut, but further down the line the API may add support for passing queries through? Any thoughts on how to extend this API in the future in a backwards-compatible way? By adding more methods to the CacheStore interface with default implementations?

    It seems like the CacheStoreConfig is probably specific to different types of CacheStores. Should that just be part of the factory? I'm assuming in the example someone would have to create an instance of a specific factory, e.g. new HBaseCacheStoreFactory or something like that.

    How will this look in cache.xml and gfsh?

    Will methods like initialize/alter/destroyCacheStore be called on every node every time a node starts up, or just once? It would probably be good to clarify that in the javadocs.

    Should the configuration of the underlying AEQ be exposed? Should there be a synchronous option to use a CacheWriter instead of an async event queue?

    Should the persist method receive the callback argument as well? What about the other things on AsyncEvent, like the sequence id (used to detect duplicate events)?

    It seems slightly problematic that the CacheStoreManager.create returns a CacheStore. Methods like persist are supposed to be called by geode, not by the user. So maybe the instance of the CacheStore itself shouldn't be accessible to the geode user?

    1. Thanks Dan Smith for review.

      >>Is this API trying to satisfy any use cases that cannot be solved using a combination writer and loader?
      No, not in the initial cut, but we can add that in future versions.

      >>Any thoughts on how to extend this API in the future a backwards compatible way? By adding more methods to the CacheStore interface with default implementations?
      We can add more interfaces that extend the CacheStore interface; e.g., if we add batching support to the cache store, we can add a BatchCacheStore.
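      As a hypothetical sketch of the two compatible extension routes mentioned here (a BatchCacheStore sub-interface, or default methods on CacheStore; all names are illustrative, not proposed API):

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative only: shows extending the API without breaking existing stores.
public class BatchExtensionSketch {
    interface CacheStore {
        void persist(Object k, Object v);
        // A default method keeps older implementations source/binary compatible;
        // non-batch stores inherit this per-entry fallback:
        default void persistAll(Map<Object, Object> entries) {
            entries.forEach(this::persist);
        }
    }

    // Alternatively, the sub-interface approach from the reply above:
    interface BatchCacheStore extends CacheStore {
        void persistAll(Map<Object, Object> entries); // batch-aware stores override
    }

    public static void main(String[] args) {
        Map<Object, Object> seen = new HashMap<>();
        CacheStore legacy = seen::put;   // old-style store implementing only persist()
        Map<Object, Object> batch = new HashMap<>();
        batch.put("a", 1);
        batch.put("b", 2);
        legacy.persistAll(batch);        // falls back to per-entry writes
        System.out.println(seen.size()); // 2
    }
}
```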

      >>It seems like the CacheStoreConfig is probably specific to different types of CacheStores. Should that just be part of the factory? I'm assuming the example someone would have to create an instance of a specific factory, eg new HBaseCacheStoreFactory or something like that.

      We have added CacheStoreConfig, which every store can implement to add store-specific configuration. These configurations can also be initialized from resource files such as hbase-site.xml.

      >>How will this look in cache.xml and gfsh?

      We will also add the cache store to cache.xml. In gfsh we are adding commands like create-cachestore, list-cachestores, remove-cachestore, etc.

      >>Will methods like initialize/alter/destroyCacheStore be called on every node every time a node starts up, or just once? It would probably be good to clarify that in the javadocs.

      These methods will be called on each node. Yes, will clarify more in the javadocs.

       

      >>Should the configuration of the underlying AEQ be exposed? Should there be a synchronous option to use a CacheWriter instead of an async event queue?

      Yes, these configurations are exposed via the default CacheStoreConfig implementation. In the initial cut we support only the AEQ, but we can add a synchronous option using CacheWriter in a later implementation.

      >>Should the persist method receive the callback argument as well? What about the other things on AsyncEvent, like the sequence id (used to detect duplicate events)?

      No, we have not considered a callback argument. If there is a use case, we can add support for the callback.

      >>It seems slightly problematic that the CacheStoreManager.create returns a CacheStore. Methods like persist are supposed to be called by geode, not by the user. So maybe the instance of the CacheStore itself shouldn't be accessible to the geode user?

      Yes. I agree we will revisit this.

       

  8. (fixed the image; sorry about the oversight)

    If the interface supports a basic set of common functions, then it can be extended to support new use cases without breaking existing services. The idea is also for it to expose capabilities and stats in an extensible way, so that newer code can look for what it needs while older code sees the same services.

    It is possible we can group stores with certain characteristics into groups of a given type, then define minimal functionality for each type. That will ensure that code can expect a store of a given type to support at least a minimal set of functionality.

    > It seems slightly problematic that the CacheStoreManager.create returns a CacheStore. Methods like persist are supposed to be called by geode, not by the user. So maybe the instance of the CacheStore itself shouldn't be accessible to the geode user?

    Yes, this concern was raised here by Milind as well; we will consider the issue when revising the proposal based on the input so far. We will also add cache.xml and gfsh examples.