To be Reviewed By: 22 April 2022
Authors: Mario Ivanac
Status: Draft | Discussion | Active | Dropped | Superseded
Superseded by: N/A
Related: N/A
Problem
When persistent regions use asynchronous disk writes, Geode currently provides no way to monitor possible problems with those writes.
We have observed that when disk writes slow down or halt, events accumulate in the async queue without any indication to the user.
If the problem persists, the queue can fill up until an out-of-memory (OOM) error occurs.
Anti-Goals
NA
Solution
The solution is to extend the existing thread monitoring to cover the async writer thread, and to report a warning-level alert if that thread is stuck for more than 15 seconds.
New solution: https://github.com/apache/geode/pull/7667
Old draft proposal of implementation: https://github.com/apache/geode/pull/7574
Changes and Additions to Public Interfaces
NA
Performance Impact
NA
Backwards Compatibility and Upgrade Path
No impacts.
Prior Art
NA
FAQ
Answers to questions you’ve commonly been asked after requesting comments for this proposal.
Errata
What are minor adjustments that had to be made to the proposal since it was approved?
6 Comments
Anthony Baker
Is there any performance impact to the async write thread?
If not, this seems reasonable. However, I was under the impression that the async queue was limited and that if the queue filled up, the behavior would be the same as if the default sync mode was unable to write: the update would be blocked. Have you seen OOM occur due to slow disk behavior? If so, perhaps this is an implementation bug to be corrected (or maybe I am wrong).
Mario Ivanac
Minor or no performance impact is expected. For more details, see the proposal in the attached PR.
Regarding the async queue limit: if it is defined (by specifying queue-size when creating the disk store), the behavior is as you described. But if the limit is not defined (the default value is 0), the queue will keep filling until OOM.
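The difference between a defined and an undefined queue-size can be illustrated with a minimal, self-contained sketch. It does not use Geode's classes: AsyncQueueSketch, newAsyncQueue and tryEnqueue are hypothetical names, and a plain java.util.concurrent queue stands in for the disk store's async queue. The sketch also assumes a non-blocking offer so that it terminates, whereas a real writer would block when the bounded queue is full.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

final class AsyncQueueSketch {

  /** Hypothetical helper: a queueSize of 0 means "no limit", mirroring the default described above. */
  static BlockingQueue<String> newAsyncQueue(int queueSize) {
    if (queueSize > 0) {
      return new LinkedBlockingQueue<>(queueSize);
    }
    return new LinkedBlockingQueue<>();
  }

  /** Returns false when the queue is full; a real writer would block here instead. */
  static boolean tryEnqueue(BlockingQueue<String> queue, String event) {
    return queue.offer(event);
  }

  public static void main(String[] args) {
    // Nothing drains these queues, which simulates a halted disk writer.
    BlockingQueue<String> bounded = newAsyncQueue(2);
    BlockingQueue<String> unbounded = newAsyncQueue(0);

    for (int i = 0; i < 5; i++) {
      String event = "event-" + i;
      boolean acceptedBounded = tryEnqueue(bounded, event);
      boolean acceptedUnbounded = tryEnqueue(unbounded, event);
      System.out.println(event + " bounded=" + acceptedBounded + " unbounded=" + acceptedUnbounded);
    }

    // The bounded queue stops growing at its limit; the unbounded one keeps growing.
    System.out.println("bounded size=" + bounded.size() + ", unbounded size=" + unbounded.size());
  }
}
```

With the limit set, producers feel back-pressure once the queue is full. Without it, the queue is bounded only by heap size, which is the OOM scenario this proposal is meant to surface earlier.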
Darrel Schneider
Geode already has a service that will monitor its threads and report them as stuck. It seems like you should use this service instead of implementing a new way of doing it. For the async disk writer threads, I think you should follow the pattern of the long-lived p2p reader threads. At times these threads can be waiting for requests to do things, and at those "idle" times you don't want to report them as stuck. So you want to have an org.apache.geode.internal.monitoring.executor.SuspendableExecutor that you can call "suspendMonitoring" on while the thread is idle, and then "resumeMonitoring" once it receives some work to do (I think in your case this is when waitUntilFlushIsReady() returns true and the "drainCount" > 0). I think you can just follow the pattern of P2PReaderExecutorGroup, but in your case it would be DiskStoreFlusherExecutorGroup.
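The suspend/resume idea can be sketched in isolation as follows. This is not Geode's SuspendableExecutor or its thread monitor: FlusherMonitoringSketch and SuspendableMonitor are hypothetical names, timestamps are passed in explicitly to keep the example deterministic, and the 15-second threshold is taken from the Solution section of this proposal.

```java
import java.util.concurrent.atomic.AtomicLong;

final class FlusherMonitoringSketch {

  /** Hypothetical stand-in for a suspendable monitored executor; not Geode's actual class. */
  static final class SuspendableMonitor {
    private static final long SUSPENDED = -1L;

    private final long stuckThresholdMillis;
    // Time at which the current unit of work started, or SUSPENDED while the thread is idle.
    private final AtomicLong workStartMillis = new AtomicLong(SUSPENDED);

    SuspendableMonitor(long stuckThresholdMillis) {
      this.stuckThresholdMillis = stuckThresholdMillis;
    }

    /** Called when the thread goes idle, so that waiting for work is never reported as stuck. */
    void suspendMonitoring() {
      workStartMillis.set(SUSPENDED);
    }

    /** Called when the thread picks up work; the stuck timer starts at nowMillis. */
    void resumeMonitoring(long nowMillis) {
      workStartMillis.set(nowMillis);
    }

    /** True only if work is in progress and has been running longer than the threshold. */
    boolean isStuck(long nowMillis) {
      long start = workStartMillis.get();
      return start != SUSPENDED && nowMillis - start > stuckThresholdMillis;
    }
  }

  public static void main(String[] args) {
    SuspendableMonitor monitor = new SuspendableMonitor(15_000);

    // Flusher is idle, waiting for entries to write.
    monitor.suspendMonitoring();
    System.out.println("idle for 60s, stuck=" + monitor.isStuck(60_000));

    // Entries arrive at t=60s and the flush begins.
    monitor.resumeMonitoring(60_000);
    System.out.println("5s into flush, stuck=" + monitor.isStuck(65_000));
    System.out.println("20s into flush, stuck=" + monitor.isStuck(80_000));

    // Flush completes and the thread goes idle again.
    monitor.suspendMonitoring();
    System.out.println("after flush, stuck=" + monitor.isStuck(120_000));
  }
}
```

Only time spent between resume and suspend counts toward the threshold, so a flusher that waits a long time for new entries is not flagged, while one that hangs in the middle of a write is.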
Darrel Schneider
Something else worth mentioning is that Geode does have a DiskStore.queueSize statistic that can be used to monitor how many items are waiting to be written asynchronously to disk.
Mario Ivanac
Thanks, I will continue with your suggestion. Should we keep the same alert level as used in thread monitoring (warning), or could we raise it to fatal?
Darrel Schneider
Basically you should just be using the existing thread monitor. The only change is to make it aware of the disk store flusher thread. So I think you should let the thread monitor do the alerting. I think warning is correct in this case because it may just be a transient slow thread as opposed to a truly hung thread. Fatal should be reserved for cases in which we know the member is going to be terminated for some reason.