Revision Clean Up task is running too slow on AEM + Oak/Mongo Cluster
AEM's Oak repository is configured to use MongoDB, and the nightly maintenance task, Revision Clean Up, takes over five hours to complete.

Cause

Common causes include:

  1. Large number of writes to the repository in the last 24 hours.
  2. MongoDB performance issues.

Resolution

When slow Revision Garbage Collection (Revision Clean Up) is caused by excessive writes to the Oak repository, you can debug it as follows:

  1. Go to http://aem-host:port/system/console/slinglog and log in as admin.

  2. Add a logger.

    • Log File: revisiongc.log
    • Log Level: Debug
    • Loggers: org.apache.jackrabbit.oak.plugins.document.VersionGarbageCollector
  3. Wait until the next slow Revision Clean Up event happens.

  4. Run this bash script against the log file to get counts for the updates that were cleaned up:

    cut -d ':' -f2,3 revisiongc.log | grep '/' | cut -f1-7 -d '/' | sed 's|\(/etc/workflow/.*/2015-09-[0-9]*\)[_0-9]*\(/.*\)_[0-9]*|\1\2|' | sed 's|\(/var/replication/data/[a-z0-9\-]*\).*|\1|' | sort | uniq -c | sort -nr | less

    Example output (shows how many items were deleted under various paths):

    1574323 /oak:index/slingeventEventId/
     140203 /oak:index/slingResourceType/
     130687 /oak:index/nodetype/
     130557 /oak:index/event.job.topic/
      37277 /oak:index/reference/
      35870 /var/replication/data/4e8f3d96-c010-4d2c-bf7b-431b902880d2
    ...

  5. This data shows where the most transient updates happen in the repository, including any indexes that are updated too often and could perhaps be removed. With this information, you can work on reducing the number of writes to Oak.

  6. If Revision GC is slow and you find no excessive write activity against Oak indexes or other locations in the system, investigate MongoDB performance. Revision GC always runs against the primary member of the MongoDB replica set, so examine the query times and indexes used in that member's mongod.log.
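
As a starting point for the mongod.log investigation in step 6, the following is a minimal sketch rather than an official tool. It assumes the plain-text log format used by MongoDB versions before 4.4, where each slow operation line ends with its duration (for example "1523ms"), and it assumes Oak stores its documents in a collection named nodes. The function name slow_ops and the default threshold of 1000 ms are illustrative choices, not part of AEM or MongoDB.

```shell
#!/bin/sh
# slow_ops LOGFILE [MIN_MS]
# Hypothetical helper: lists operations on the "nodes" collection that took
# at least MIN_MS milliseconds (default 1000), slowest first.
# Assumes the pre-4.4 plain-text mongod.log format, where the last field of
# a slow-operation line is the duration, e.g. "1523ms".
slow_ops() {
    log_file=$1
    min_ms=${2:-1000}
    # Keep only lines that mention a "<database>.nodes" namespace.
    grep -E '\.nodes ' "$log_file" |
        awk -v min="$min_ms" '
            # Only lines whose last field is a duration are considered.
            $NF ~ /^[0-9]+ms$/ {
                ms = $NF
                sub(/ms$/, "", ms)
                # Prefix each line with its numeric duration for sorting.
                if (ms + 0 >= min) print ms, $0
            }' |
        sort -nr
}
```

For example, `slow_ops mongod.log 1000 | head -20` would show the twenty slowest matching operations, assuming mongod.log has been copied to the current directory. Operations that repeatedly appear near the top are good candidates for checking which index, if any, the query used. On MongoDB 4.4 and later the log is structured JSON, so this sketch would need to be adapted.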