Ensuring High Service Availability on AEM Instances
Last updated on May 16, 2021 04:39:09 AM GMT
Best Practices for Avoiding Production Outages
- AEM Sites
- AEM Assets: Tuning Guide
Author/Publish instance is very slow or shows high CPU usage
- Take at least 10 thread dumps at intervals of 2-3 seconds using the jstack script.
- See the Take Thread dumps article for more details.
- Use thread dump analysis tools to look for threads with stack traces of more than 100 lines and for threads that consume CPU.
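The steps above can be sketched in Python. This is a minimal illustration, not the jstack script the article refers to: it assumes the JDK's `jstack` command is on the PATH and that you already know the AEM process ID. The function names `take_thread_dumps` and `deep_stacks`, the output file naming, and the blank-line parsing of the dump are all assumptions made for this example.

```python
import subprocess
import time
from pathlib import Path


def take_thread_dumps(pid, count=10, interval=3.0, out_dir="thread_dumps",
                      run=subprocess.run, sleep=time.sleep):
    """Write `count` jstack dumps of process `pid`, `interval` seconds apart."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = []
    for i in range(count):
        # Assumes the JDK's jstack binary is on the PATH.
        result = run(["jstack", str(pid)],
                     capture_output=True, text=True, check=True)
        path = out / f"jstack.{pid}.{i:02d}.txt"  # hypothetical naming scheme
        path.write_text(result.stdout)
        paths.append(path)
        if i < count - 1:
            sleep(interval)
    return paths


def deep_stacks(dump_text, min_frames=100):
    """Return (thread_name, frame_count) for threads with more than min_frames frames."""
    found = []
    # Assumes thread entries are separated by blank lines and begin with a quoted name.
    for block in dump_text.split("\n\n"):
        lines = block.strip().splitlines()
        if not lines or not lines[0].startswith('"'):
            continue
        name = lines[0].split('"')[1]
        frames = sum(1 for line in lines[1:] if line.strip().startswith("at "))
        if frames > min_frames:
            found.append((name, frames))
    return found
```

For example, `take_thread_dumps(12345)` would write ten dumps three seconds apart, and `deep_stacks(path.read_text())` would list the unusually deep threads in one of them. Identifying CPU-consuming threads needs per-thread CPU data from the operating system, which this sketch does not collect.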
High Memory usage on AEM instances
- Check the memory usage at 
- Generate heap dumps using the article at  and share them with AEM Support for further analysis
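One common way to produce a heap dump is the JDK's `jmap` tool. The sketch below only builds the command line and is not taken from the article referenced above; the helper name `heap_dump_command` and the output file naming are assumptions for illustration.

```python
from datetime import datetime


def heap_dump_command(pid, out_dir=".", live_only=False, now=None):
    """Build a jmap command that writes a binary heap dump for process `pid`."""
    stamp = (now or datetime.now()).strftime("%Y%m%d-%H%M%S")
    target = f"{out_dir}/heap-{pid}-{stamp}.hprof"  # hypothetical file name
    # "live" restricts the dump to reachable objects (jmap triggers a full GC first).
    options = "live,format=b" if live_only else "format=b"
    return ["jmap", f"-dump:{options},file={target}", str(pid)]
```

The resulting list could be passed to `subprocess.run(..., check=True)` as the user that owns the AEM process. Writing a heap dump pauses the JVM and can need disk space on the order of the heap size, so check both before running it on a production instance.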
High CPU usage after dispatcher cache clear
- You can control cache invalidation with the "/invalidate" and "/statfileslevel" settings.
- If you deny all in /invalidate and do not set /statfileslevel -> only activated pages are deleted.
- If you allow all in /invalidate and set /statfileslevel -> only pages in the same folder where the stat file was updated are invalidated.
- If you allow all in /invalidate and do not set /statfileslevel -> all pages are invalidated, wherever they are located under the docroot.
- After code deployments, try to recache the pages. Immediate recaching ensures that Dispatcher retrieves and caches the page only once, instead of once for each of the simultaneous client requests.
- Refer to the Optimizing Dispatcher Cache article for more in-depth insights.
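The recaching advice above can be approximated by requesting each important page once, sequentially, before client traffic reaches it. The following is a minimal sketch under that assumption; it is not a Dispatcher feature, and the names `warm_cache` and `fetch_status` as well as the example host are hypothetical.

```python
import urllib.request


def fetch_status(url, timeout=30):
    """Request `url` once and return the HTTP status code."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.status


def warm_cache(base_url, paths, fetch=fetch_status):
    """Request each path once, in order, so the page can be cached before traffic arrives."""
    results = {}
    for path in dict.fromkeys(paths):  # drop duplicate paths, keep order
        url = base_url.rstrip("/") + "/" + path.lstrip("/")
        try:
            results[path] = fetch(url)
        except OSError as error:  # URLError and HTTPError are OSError subclasses
            results[path] = repr(error)
    return results
```

A call such as `warm_cache("https://www.example.com", ["/content/site/en.html"])` returns a mapping from each path to its status code or error, which makes failed pages easy to spot after a deployment.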
Observed SegmentNotFound Exceptions in the logs
- Follow the steps in the Resolving Segmentnotfound article.
- If no good revision is found, try to find the corrupted nodes using the script mentioned in Part B of the above article.
- If corruption is found under any folder other than /apps, contact the AEM Support team for further guidance.
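Before following those steps, it can help to see which segments the logs report as missing and how often. The sketch below assumes the log lines contain a message of the form `SegmentNotFoundException: Segment <uuid> not found`; that format, and the function name `missing_segments`, are assumptions and may need adjusting to your AEM version.

```python
import re
from collections import Counter

# Assumed message format; adjust the pattern if your logs differ.
SEGMENT_RE = re.compile(
    r"SegmentNotFoundException: Segment ([0-9a-f-]{36}) not found")


def missing_segments(log_lines):
    """Count how often each missing segment ID appears in the given log lines."""
    counts = Counter()
    for line in log_lines:
        match = SEGMENT_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

Passing an open log file to `missing_segments` and calling `.most_common()` on the result shows whether the exceptions point at one segment or many.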
RCA for AEM outage which resolved after restart
Share the following data with the AEM Support team for root cause analysis (RCA):
- Log files during the outage
- Thread dumps taken during the outage
- If available, Heap Dumps during the outage
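Collecting these files can be scripted. The sketch below zips every file under the given directories whose last-modified time falls inside the outage window. The function name `bundle_outage_files`, the directory layout, and the use of modification time as the selection rule are assumptions for illustration.

```python
import zipfile
from pathlib import Path


def bundle_outage_files(source_dirs, start_ts, end_ts, archive_path):
    """Zip files last modified between start_ts and end_ts (epoch seconds)."""
    added = []
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for source in map(Path, source_dirs):
            for path in sorted(source.rglob("*")):
                if path.is_file() and start_ts <= path.stat().st_mtime <= end_ts:
                    # Store each file under its source directory name.
                    relative = path.relative_to(source).as_posix()
                    archive.write(path, arcname=f"{source.name}/{relative}")
                    added.append(path)
    return added
```

A log file that is still being written after the outage has a later modification time, so widen the window or add such files by hand. Write the archive outside the source directories so it is never picked up by the scan.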
Session leak in AEM
Check whether JCR sessions are leaking in your AEM instance, and analyze any leaks you find.