Ensuring High Service Availability on AEM Instances
Last updated on May 20, 2021 11:58:22 PM GMT
Best Practices for Avoiding Production Outages
Author/Publish instance is very slow or has high CPU usage
- Take at least 10 thread dumps at an interval of 2-3 seconds using the jstack script
- See the Take Thread dumps article for more details
- Use thread dump analysis tools to find CPU-consuming threads and threads whose stack trace is longer than 100 lines
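The collection and the first analysis pass above can be sketched in Python. This is a minimal sketch, not the jstack script the article refers to: it assumes the JDK's `jstack` command is on the PATH and that the dump uses the usual layout (a quoted thread name on the header line, frames starting with `at`, a blank line between threads). The function names, the output directory, and the file naming are hypothetical.

```python
import subprocess
import time
from pathlib import Path


def take_thread_dumps(pid, count=10, interval=3.0, out_dir="thread_dumps",
                      run=subprocess.run, sleep=time.sleep):
    """Write `count` jstack dumps of process `pid`, `interval` seconds apart."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = []
    for i in range(count):
        # `jstack -l <pid>` prints every thread's stack plus lock information.
        result = run(["jstack", "-l", str(pid)],
                     capture_output=True, text=True, check=True)
        path = out / f"jstack_{pid}_{i:02d}.txt"  # hypothetical naming scheme
        path.write_text(result.stdout)
        paths.append(path)
        if i < count - 1:
            sleep(interval)
    return paths


def deep_threads(dump_text, min_frames=100):
    """Return {thread_name: frame_count} for threads with more than `min_frames` frames."""
    counts = {}
    name = None
    for line in dump_text.splitlines():
        if line.startswith('"'):
            # Thread header line: the thread name is the first quoted string.
            name = line.split('"')[1]
            counts[name] = 0
        elif not line.strip():
            # A blank line ends the current thread's block.
            name = None
        elif name is not None and line.strip().startswith("at "):
            counts[name] += 1
    return {n: c for n, c in counts.items() if c > min_frames}
```

Run it as the same OS user that owns the AEM process, for example `take_thread_dumps(12345)` where 12345 is a placeholder PID. Finding the threads that consume CPU still needs per-thread CPU data from the operating system, which this sketch does not collect.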
High Memory usage on AEM instances
- Check the memory usage at 
- Generate heap dumps using article at  and share it with AEM Support for further analysis
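One common way to produce a heap dump is the JDK's `jmap` tool. The sketch below only builds and runs that command. It is an assumption that `jmap` is available on the AEM host and that this matches what the referenced article describes; the function names and the file naming are hypothetical.

```python
import subprocess
import time


def heap_dump_command(pid, out_file, live_only=True):
    """Build a jmap command line that writes a binary (hprof) heap dump."""
    # With "live", jmap dumps only reachable objects, which forces a full GC first.
    options = "live,format=b" if live_only else "format=b"
    return ["jmap", f"-dump:{options},file={out_file}", str(pid)]


def take_heap_dump(pid, out_dir=".", run=subprocess.run):
    """Run jmap against `pid` and return the path of the dump file."""
    stamp = time.strftime("%Y%m%d_%H%M%S")
    out_file = f"{out_dir}/heap_{pid}_{stamp}.hprof"  # hypothetical naming scheme
    run(heap_dump_command(pid, out_file), check=True)
    return out_file
```

Writing a heap dump pauses the JVM and needs free disk space roughly equal to the used heap, so take it deliberately on a production instance.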
High CPU usage after dispatcher cache clear
- You can control cache invalidation by using the "/invalidate" and "/statfileslevel" properties
- If you deny all for invalidation and do not define /statfileslevel, only the activated pages are deleted
- If you allow all for invalidation and define /statfileslevel, only pages in the same folder where the stat file was updated are invalidated
- If you allow all for invalidation and do not define /statfileslevel, all pages are invalidated, wherever they are located under the docroot
- After code deployments, try to recache the pages. Immediate recaching ensures that Dispatcher retrieves and caches the page only once, instead of once for each of the simultaneous client requests.
- Refer to the Optimizing Dispatcher Cache article for more in-depth insights.
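The recaching step can be sketched as a small script that requests each important page once, in sequence, right after a deployment, so that Dispatcher fetches and stores it before client traffic arrives. This is a minimal sketch: the host name and page paths in the usage note are placeholders, and the function names are hypothetical.

```python
import urllib.request


def fetch_status(url, timeout=30):
    """Request `url` once and return the HTTP status code."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.status


def warm_cache(base_url, paths, fetch=fetch_status):
    """Request each path exactly once, in order, and report the outcome per path."""
    results = {}
    for path in paths:
        url = base_url.rstrip("/") + "/" + path.lstrip("/")
        try:
            results[path] = fetch(url)
        except OSError as error:
            # urllib's URLError and HTTPError, and socket timeouts, derive from OSError.
            results[path] = str(error)
    return results
```

For example, `warm_cache("https://www.example.com", ["/content/site/en.html"])` returns a status code or an error message for each path. The requests are deliberately sequential, which avoids putting the same kind of concurrent load on the publish instance that recaching is meant to prevent.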
Observed SegmentNotFound Exceptions in the logs
- Follow the steps in the Resolving Segmentnotfound article
- If no good revision is found, try to find the corrupted nodes using the script mentioned in Part B of the above article.
- If corruption is found under any folder other than /apps, contact the AEM Support team for further guidance.
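Before following the steps above, it can help to see which segments the logs report as missing and how often. The sketch below scans log lines for that information. It assumes the log message has the form `SegmentNotFoundException: Segment <uuid> not found`; check this against your own error.log and adjust the pattern if needed. The function name is hypothetical.

```python
import re
from collections import Counter

# Assumed message shape: "...SegmentNotFoundException: Segment <uuid> not found"
SEGMENT_ID = re.compile(
    r"SegmentNotFoundException: Segment "
    r"([0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12})"
)


def missing_segments(log_lines):
    """Count how often each segment ID appears in SegmentNotFoundException messages."""
    counts = Counter()
    for line in log_lines:
        match = SEGMENT_ID.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

A typical call is `missing_segments(open("error.log", encoding="utf-8", errors="replace"))`, where the log path is a placeholder. The resulting segment IDs and counts are also useful to include when you contact AEM Support.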
Root cause analysis (RCA) for an AEM outage that was resolved after a restart
Session leak in AEM
Check and analyze whether JCR sessions are leaking in your AEM instance