When an application becomes CPU-bound, it no longer processes interrupts efficiently.
One way to determine whether a machine or process is CPU-bound is to use an operating system command such as vmstat or top while the application is running.
The vmstat output below shows that the CPUs are 99% idle.
The top output shows, among other things, the CPU usage percentage. The output below shows that the CPUs are mostly in use (idle=3.0%) and that Java processes are using most of that CPU.
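As a rough illustration of reading these numbers from a script, the sketch below turns vmstat samples into an active-CPU percentage (100 minus idle). It assumes the Linux procps vmstat layout, in which a header row names an id (idle) column. The function name cpu_active_from_vmstat is hypothetical and not part of any tool.

```shell
#!/bin/sh
# Print the active CPU percentage (100 - idle) for each vmstat sample on stdin.
# Assumption: Linux procps vmstat, whose header row starts with "r" and
# contains a column named "id". The column is located by name, not position.
cpu_active_from_vmstat() {
  awk '
    # Header row: remember which column is the idle column.
    $1 == "r" { for (i = 1; i <= NF; i++) if ($i == "id") col = i; next }
    # Data rows start with a number; rows seen before the header are ignored.
    col && $1 ~ /^[0-9]+$/ { print 100 - $col }
  '
}

# Example usage (not executed here):
#   vmstat 1 5 | cpu_active_from_vmstat   # five samples, one second apart
#   top -b -n 1 | head -n 15              # one batch-mode snapshot of top
```

The first vmstat sample reports averages since boot, so later samples are the better indicator of current load. Values that stay close to 100 suggest a CPU-bound machine.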
Another way to determine whether a machine or process is CPU-bound is to use vsd to display the active CPU and process CPU time values contained in a given Geode statistics archive.
The chart below shows LinuxSystemStats cpuActive values. This machine is CPU-bound.
The chart below shows VMStats processCpuTime values.
The gfsh show metrics command can be used to show the active CPU (cpuUsage) of a member. An example is:
Determining that there is a CPU issue is one thing; finding its source is another. One approach is to dump the JVM threads using the operating system kill -3 command as shown below. These dumps show how many threads there are and what each thread is doing. Application issues can often be found by examining these thread dumps.
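A minimal sketch of taking and summarizing such a dump follows. It assumes a HotSpot-style dump in which each thread has a java.lang.Thread.State: line. The names thread_states, PID, and dump.txt are hypothetical. Note that kill -3 makes the JVM print the dump to its own standard output, so the dump appears wherever that output is redirected. The JDK jstack tool is shown as an alternative that prints to the caller's terminal.

```shell
#!/bin/sh
# Count threads per state in a saved thread dump file (first argument).
# Assumption: each thread entry has a line such as
#   java.lang.Thread.State: WAITING (parking)
thread_states() {
  awk '
    /java\.lang\.Thread\.State:/ { n[$2]++ }
    END { for (s in n) print s, n[s] }
  ' "$1" | sort
}

# Example usage (not executed here; PID is the process id of the server JVM):
#   kill -3 "$PID"             # dump is written to the JVM's own stdout
#   jstack "$PID" > dump.txt   # alternative: capture the dump in a file
#   thread_states dump.txt     # prints one "STATE count" line per state
```

A large RUNNABLE count relative to the number of CPUs is a hint, not proof, that threads are competing for CPU. The stack traces in the dump show what those threads are actually doing.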
If the thread dumps show mostly active Geode threads, one action would be to reduce the number of processing threads by one or more of the following methods:
- By setting or reducing the maximum number of threads (max-threads) processing client requests
- By reducing the number of Geode selector threads
- By setting the number of sockets between processes to 1 (conserve-sockets=true)
- By reducing the maximum number of Geode threads processing distribution messages by setting one or more of the following Java system properties (these properties should only be set on the recommendation of Geode Support):
  - DistributionManager.MAX_THREADS – the maximum size of the standard thread pool (default=100)
  - DistributionManager.MAX_PR_THREADS – the maximum size of the partitioned thread pool (default=CPUs*4 or 16, whichever is greater)
  - DistributionManager.MAX_FE_THREADS – the maximum size of the function execution thread pool (default=CPUs*4 or 16, whichever is greater)
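To make the "CPUs*4 or 16, whichever is greater" default concrete, the sketch below computes it for a given CPU count. The function name default_pool_size is hypothetical, and the formula is taken from the description above rather than from the Geode source. The commented lines show the general -D syntax for passing a Java system property. The member name server1 and the value 50 are placeholders only.

```shell
#!/bin/sh
# Default pool size as described above: max(CPUs * 4, 16).
default_pool_size() {
  size=$(( $1 * 4 ))
  if [ "$size" -lt 16 ]; then
    size=16
  fi
  echo "$size"
}

# Example usage (not executed here):
#   default_pool_size "$(nproc)"    # default for this machine's CPU count
#
# Passing a system property to a JVM uses the -D flag. When starting a member
# from gfsh, the --J option forwards it (placeholder name and value):
#   start server --name=server1 --J=-DDistributionManager.MAX_THREADS=50
```

With this formula a 2-CPU machine gets 16 and an 8-CPU machine gets 32, so an override below those values only reduces concurrency if it is lower than the computed default.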
If these actions fail, reduce the load by adding more machines so that processing is distributed among additional CPUs. Do not add more Java VMs to the existing machines; doing so will most likely exacerbate the situation.