Troubleshooting CPU

Description

When an application becomes CPU-bound, it no longer processes interrupts efficiently.

Determination

One way to determine whether a machine or process is CPU-bound is to use an operating system command such as vmstat or top while the application is running.

Operating System Command

vmstat

The vmstat output below shows that the CPUs are 99% idle.

...

The top output shows, among other things, the CPU usage percentage. The output below shows that the CPUs are mostly in use (idle=3.0%) and that Java processes are using most of that CPU.

Code Block
	top
	 
	 12:49:24  up 113 days, 23:36, 35 users,  load average: 10.40, 5.20, 2.30
	 615 processes: 587 sleeping, 27 running, 1 zombie, 0 stopped
	 CPU states:  cpu    user    nice  system    irq  softirq  iowait    idle
	            total   61.7%    0.0%   31.4%   0.5%     2.5%    0.4%    3.0%
	 
	   PID USER  PRI  NI  SIZE  RSS SHARE STAT %CPU %MEM   TIME CPU COMMAND
	 22523 user1  15   0 1102M 1.1G 18068 R     3.6 14.1   0:24   1 java
	 22778 user1  15   0 1102M 1.1G 18068 R     2.1 14.1   0:02   1 java
	 22682 user1  15   0 1102M 1.1G 18068 R     1.4 14.1   0:07   1 java
	 22698 user1  15   0 1102M 1.1G 18068 R     1.4 14.1   0:10   0 java
	 19286 user1  15   0 1100M 1.1G 18080 R     0.5 14.1   0:25   0 java
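The idle figure in the CPU states row can also be read programmatically. The following is a minimal sketch, not part of Geode or top: the function name cpu_busy_percent is hypothetical, and it assumes the older top layout shown above, where the "total" row lists percentages and idle is the last column. Newer versions of top format this line differently.

```python
def cpu_busy_percent(total_line: str) -> float:
    """Return 100 minus the idle percentage from a 'total' CPU states row.

    Assumption: idle is the last percentage field on the line, as in the
    header 'user nice system irq softirq iowait idle' shown in the output.
    """
    # Keep only the fields that look like percentages (for example '3.0%').
    fields = [f for f in total_line.split() if f.endswith("%")]
    if not fields:
        raise ValueError("no percentage fields found in line")
    idle = float(fields[-1].rstrip("%"))
    return round(100.0 - idle, 1)


# The 'total' row copied from the top output above.
line = "total   61.7%    0.0%   31.4%   0.5%     2.5%    0.4%    3.0%"
busy = cpu_busy_percent(line)
print(busy)  # 97.0
```

A busy value this close to 100 is consistent with the machine being CPU-bound.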
  

vsd

Another way to determine whether a machine or process is CPU-bound is to use vsd to display active CPU and process CPU time values contained in a given Geode statistics archive.

LinuxSystemStats

The chart below shows LinuxSystemStats cpuActive values. This machine is CPU-bound.

[Chart: LinuxSystemStats cpuActive]

VMStats

The chart below shows VMStats processCpuTime values.

[Chart: VMStats processCpuTime]
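A cumulative CPU time counter is easier to interpret once it is converted to a utilization figure between two samples. The sketch below illustrates that arithmetic only. It assumes processCpuTime is a cumulative counter in nanoseconds (check the statistic's description in your archive), and the function name, parameter names, and sample values are hypothetical.

```python
def cpu_utilization(cpu_ns_start: int, cpu_ns_end: int,
                    wall_ns_start: int, wall_ns_end: int,
                    cpus: int = 1) -> float:
    """Fraction of available CPU time the process used between two samples.

    Assumption: the CPU counter and the wall-clock timestamps are both in
    nanoseconds, and the CPU counter only ever increases.
    """
    wall = wall_ns_end - wall_ns_start
    if wall <= 0 or cpus <= 0:
        raise ValueError("need a positive interval and CPU count")
    return (cpu_ns_end - cpu_ns_start) / (wall * cpus)


# Hypothetical samples: 6 s of CPU time used over 2 s of wall time on 4 CPUs.
usage = cpu_utilization(0, 6_000_000_000, 0, 2_000_000_000, cpus=4)
print(usage)  # 0.75
```

A value near 1.0 means the process is consuming nearly all of the CPU capacity available to it.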

gfsh

The gfsh show metrics command can be used to show the active CPU (cpuUsage) of a member. An example is:

...

	show metrics --member=server1 --categories=member

	Member Metrics

	Category | Metric          | Value
	-------- | --------------- | -----------------
	member   | upTime          | 14
	         | cpuUsage        | 84.13760375976562
	         | currentHeapSize | 2200
	         | maximumHeapSize | 81250
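If the show metrics output is captured to a file or a string, the cpuUsage value can be extracted and compared against a limit. This is a minimal sketch, assuming the output is laid out one row per line with "|" separating the columns. The function name parse_metrics and the 80.0 threshold are hypothetical and not part of gfsh.

```python
def parse_metrics(text: str) -> dict:
    """Map metric name to numeric value from a pipe-delimited metrics table."""
    metrics = {}
    for line in text.splitlines():
        parts = [p.strip() for p in line.split("|")]
        if len(parts) != 3:
            continue  # not a table row
        _category, name, value = parts
        try:
            metrics[name] = float(value)
        except ValueError:
            continue  # header and separator rows are not numeric
    return metrics


# The table rows from the example above.
SAMPLE = """\
Category | Metric          | Value
-------- | --------------- | -----------------
member   | upTime          | 14
         | cpuUsage        | 84.13760375976562
         | currentHeapSize | 2200
         | maximumHeapSize | 81250
"""

CPU_THRESHOLD = 80.0  # hypothetical limit, chosen only for illustration
metrics = parse_metrics(SAMPLE)
if metrics["cpuUsage"] > CPU_THRESHOLD:
    print("cpuUsage is high:", metrics["cpuUsage"])
```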

Action

Determining that there is a CPU issue is one thing; finding its source is another. One approach is to dump the JVM threads using the operating system kill -3 command as shown below. These dumps show how many threads there are and what each thread is doing. Application issues can often be found by examining these thread dumps.

...
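Once a thread dump has been saved to a file, a quick tally of thread states gives a first impression of what the JVM is doing. The following is a minimal sketch, assuming the dump uses the HotSpot text format in which each thread has a "java.lang.Thread.State:" line. The function name count_thread_states is hypothetical, and the sample dump is a hand-written, abbreviated illustration with made-up thread names, not output from a real system.

```python
import re
from collections import Counter

STATE_RE = re.compile(r"java\.lang\.Thread\.State: (\w+)")


def count_thread_states(dump_text: str) -> Counter:
    """Count how many threads in a thread dump are in each state."""
    return Counter(STATE_RE.findall(dump_text))


# Hand-written, abbreviated sample for illustration only.
SAMPLE_DUMP = '''
"worker-1" prio=5 runnable
   java.lang.Thread.State: RUNNABLE
"worker-2" prio=5 runnable
   java.lang.Thread.State: RUNNABLE
"idle-1" prio=5 waiting on condition
   java.lang.Thread.State: WAITING (parking)
'''

states = count_thread_states(SAMPLE_DUMP)
for state, count in states.most_common():
    print(state, count)
```

A large number of RUNNABLE threads relative to the number of CPUs is one sign worth investigating, and the stack frames under each of those threads show where the time is going.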