Certain maintenance activities can cause higher CPU usage than usual: tar compaction, datastore garbage collection, online backup, tree activation, deployment of an application update that causes caches to be flushed, and so on.
A Java-level deadlock can also cause such a situation. In that case, take a few thread dumps, raise a support ticket, and restart the AEM instance.
A video GEM session is available at http://dev.day.com/content/ddc/en/gems/cq-aem-5-6-troubleshooting.html
Use the simple CPU profiling tool included in CRX 2.0.x. To start it, open http://localhost:4502/system/console/profiler and let it run for at least a few minutes during the period of slowness or high CPU usage. The output helps you determine which JVM threads are consuming the most CPU cycles, and their associated packages and classes.
To help analyze the problem, create a few full thread dumps, which can then be analyzed.

Creating Full Thread Dumps
To get the process id of your Java process, use
jps -l
If this doesn't work (path not set, JDK not installed, or older Java version), use
ps -el | grep java
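As a sketch, the PID column can be pulled out of the ps output with a small shell function. The awk column position assumes the default "ps -el" format, where PID is the fourth column, and the function name is illustrative:

```shell
# Sketch: extract the PID column from "ps -el" output for Java processes.
# In the default "ps -el" format, PID is the fourth column.
# The function name java_pids is illustrative.
java_pids() {
  awk '/[j]ava/ { print $4 }'
}

# Typical use (assumes a running JVM):
#   ps -el | java_pids
```

The [j]ava pattern keeps the grep-like filter from matching its own command line.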
Full Thread Dumps
To analyze a performance problem or a blocked process, create about ten full thread dumps with roughly a one-second delay between them. If the problem could be related to clustering, create at least ten full thread dumps on each cluster node. If possible, create the thread dumps at roughly the same time (it doesn't need to be exact).
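The collection described above can be sketched as a small loop. The PID, output file name, and dump count are illustrative placeholders, and jstack is assumed to be on the PATH:

```shell
# Sketch: take ten thread dumps, one second apart, appending them to one file.
# PID and OUT are illustrative placeholders; set PID to the real process id.
PID=${PID:-12345}
OUT=${OUT:-threadDumpNode1.txt}
count=0
for i in 1 2 3 4 5 6 7 8 9 10; do
  if command -v jstack >/dev/null 2>&1; then
    # Append this dump to the collection file; ignore errors for a dead PID.
    jstack "$PID" >> "$OUT" 2>/dev/null || true
  fi
  count=$((count + 1))
  sleep 1
done
echo "took $count dumps"
```

Run one such loop per cluster node, writing to a differently named file on each, so the dumps can be correlated later.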
A full thread dump starts with information like the following:
Full thread dump Java HotSpot(TM) 64-Bit Server VM (24.65-b04):
"Thread-76273" daemon prio=3 tid=0x111061 nid=0x111061 running [0x111061]
... stack and locked object MUST be present
If your thread dump does not look like the example above, proper investigation will not be possible.
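A quick sanity check can confirm that a captured file contains the expected header and stack frames. The sample dump below is illustrative; real dumps come from jstack or kill -QUIT:

```shell
# Sketch: verify a thread dump file has a "Full thread dump" header and
# stack frames (lines starting with "at"). The sample content, thread name,
# and lock address are illustrative.
cat > sample-dump.txt <<'EOF'
Full thread dump Java HotSpot(TM) 64-Bit Server VM (24.65-b04):

"Thread-76273" daemon prio=3 tid=0x111061 nid=0x111061 running [0x111061]
   java.lang.Thread.State: RUNNABLE
        at java.net.SocketInputStream.socketRead0(Native Method)
        - locked <0x00000000e0a12345> (a java.io.BufferedInputStream)
EOF
headers=$(grep -c '^Full thread dump' sample-dump.txt)
frames=$(grep -c '^[[:space:]]*at ' sample-dump.txt)
echo "$headers header(s), $frames stack frame(s)"
```

If the header count or frame count is zero, the capture method did not produce a usable dump.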
You can use the thread dump tool provided on Package Share, as described on that page. It allows you to take multiple thread dumps, and it dumps in the format shown above.
Alternatively, if installed, use jstack. This command prints the thread dump to standard output:
jstack <pid>
This command appends a full thread dump to a file:
jstack <pid> >> threadDumpNode1.txt
On some systems you may have to use: sudo -u aem jstack -J-d64 -l <pid>
If this doesn't work, use kill -QUIT. This command prints the thread dumps to the log file:
kill -QUIT <pid>
If that last command produces no thread dumps in the standard output, try adding the following to the Java parameters:
-XX:+UnlockDiagnosticVMOptions -XX:+LogVMOutput -XX:LogFile=jvm.log
Note: If the steps above for obtaining thread dumps do not work in your environment, see this article.
To analyze the problem, it is important to know whether CRX/CQ is running in an endless loop or merely sleeping. To find out, type
top
This command lists the processes, sorted by CPU usage. If the top process is a Java process, and its PID matches CRX/CQ, then the process is running at full speed.
If you are unsure how to interpret the results, run the following statement and then include the file top.txt in your problem report:
top -l5 -s5 > top.txt
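As a sketch, the saved output can then be post-processed to pick out the top CPU consumer. The sample lines are illustrative, and the column positions (%CPU in column 9, command name in column 12) are an assumption based on the default Linux top batch format:

```shell
# Sketch: find the process with the highest CPU usage in saved top output.
# Sample lines are illustrative; real data comes from "top ... > top.txt".
cat > top.txt <<'EOF'
 4321 aem       20   0 4200m 1.2g  12m S 98.7  8.3 123:45 java
  123 root      20   0  100m  10m   2m S  1.2  0.1   0:10 sshd
EOF
# %CPU is column 9, command name is column 12 in this assumed format.
topline=$(sort -k9 -nr top.txt | head -n 1 | awk '{ print $1, $12 }')
echo "$topline"
```

If the printed PID matches the CRX/CQ process, the instance is busy rather than sleeping.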
In many cases the problem is that the number of open sessions is too large. At some point, this slows down processing. To find out if this is the case, run
jps -l (to get the process id of the Java process)
jmap -histo <pid> | grep CRXSession (to get the number of open sessions)
If this is, in fact, the problem (the number is higher than a few hundred sessions), then it needs to be analyzed. Possibly a session pool is used (depending on the version of CRX/CQ, there could be a hotfix for the given problem), or an internal (possibly application-level) cache references sessions. To analyze where those sessions are opened, see the 'Analyze Unclosed Sessions' page.
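As a sketch, the session count can be totaled from saved jmap output. The histogram lines and class names below are illustrative; real data comes from jmap -histo <pid>:

```shell
# Sketch: sum the instance counts (column 2) of session-related classes
# in "jmap -histo" output. Sample lines and class names are illustrative.
cat > histo.txt <<'EOF'
  12:         350       28000  com.day.crx.core.CRXSessionImpl
  40:          90        7200  com.day.crx.CRXSession
EOF
sessions=$(awk '/CRXSession/ { sum += $2 } END { print sum }' histo.txt)
echo "open sessions: $sessions"
```

A total well beyond a few hundred suggests leaked or pooled sessions that warrant the analysis described above.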
The CRX process should never be killed, not even when stopping takes too long. If you need to kill a process that is not responding, create a full thread dump first and log a bug.
If you do kill the CRX process, the next time you start it up the Tar PM can create backup_.tar files.
Use the Thread Dump Collection and Analysis tool to take thread dumps from a running CQ instance for troubleshooting the following:
- lock contention
- other thread-related issues