There can be many causes:
- Large number of writes to the repository in the last 24 hours.
- MongoDB performance issues.
When Revision Garbage Collection is slow because of excessive writes to the Oak repository, you can debug it as follows:
- Run this bash pipeline against the log file to count the updates that were cleaned up, grouped by path:
  cut -d ':' -f2,3 revisiongc.log | grep '/' | cut -f1-7 -d '/' | sed 's|\(/etc/workflow/.*/2015-09-[0-9]*\)[_0-9]*\(/.*\)_[0-9]*|\1\2|' | sed 's|\(/var/replication/data/[a-z0-9\-]*\).*|\1|' | sort | uniq -c | sort -nr | less
  The first sed expression only collapses workflow paths dated 2015-09; adjust that date pattern to the period covered by your log.
Example output (shows how many items were deleted under various paths):
1574323 /oak:index/slingeventEventId/
140203 /oak:index/slingResourceType/
130687 /oak:index/nodetype/
130557 /oak:index/event.job.topic/
37277 /oak:index/reference/
35870 /var/replication/data/4e8f3d96-c010-4d2c-bf7b-431b902880d2
...
- If Revision GC is slow and you find no excessive write activity against Oak indexes or other locations in the system, investigate MongoDB performance. Revision GC always runs against the primary member of the MongoDB replica set, so check the query times and the indexes used, as recorded in mongod.log on that node.
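As a starting point for the mongod.log investigation, the following is a minimal sketch rather than a supported tool. It assumes the older plain-text mongod.log format, in which each logged operation ends with its duration as "<N>ms"; newer MongoDB versions write JSON log lines, which this filter does not handle. The function name slow_ops and the 1000 ms threshold are illustrative choices, not part of AEM or MongoDB.

```shell
# slow_ops MIN_MS LOG_FILE
# Print the lines of LOG_FILE whose last field is a duration of at least
# MIN_MS milliseconds, slowest first.
# Assumption: plain-text mongod.log lines that end with "<N>ms".
slow_ops() {
  min_ms="$1"
  log_file="$2"
  tab="$(printf '\t')"
  awk -v min="$min_ms" '
    # Keep only lines whose last whitespace-separated field looks like "123ms".
    $NF ~ /^[0-9]+ms$/ {
      ms = $NF
      sub(/ms$/, "", ms)
      # Prefix each kept line with its numeric duration so it can be sorted.
      if (ms + 0 >= min + 0) print ms "\t" $0
    }
  ' "$log_file" | sort -t "$tab" -k1,1nr | cut -f2-
}

# Example usage (run on the primary):
#   slow_ops 1000 mongod.log | less
# To narrow the output to operations that did a full collection scan:
#   slow_ops 1000 mongod.log | grep 'planSummary: COLLSCAN'
```

By default mongod only logs operations slower than 100 ms, so faster queries will not appear in the log at all. Lines that report a COLLSCAN plan summary are usually the first ones worth examining, since they indicate a query that was not served by an index.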