Analyze unusual repository growth


Disk usage is abnormally and rapidly increasing on an AEM server.


Many things can cause unusual increases in disk utilization.  Some potential causes:

  1. Proper maintenance hasn't been run on the system.  See this article for details on various system maintenance activities.
  2. AEM or the application is creating a very large number of nodes or updates to node properties.  This could be due to a misconfiguration or an application code bug.  Because the tar storage in Oak is append-only, repeatedly saving nodes further contributes to repository growth.
  3. Very large file(s) have been uploaded to AEM Assets or package manager.
  4. Debug or Trace logging was left enabled.

Analysis / Resolution

A. If AEM is running and there is ample disk space

1. Configure Oak write trace logging

If AEM is still running then we can enable a debug logger to tell us which repository paths are being written to.  To enable this logger, install the attached log configuration package below or follow these steps:
  1. Go to http://aemhost:port/system/console/slinglog
  2. Click Add new logger
  3. Configure a logger: Log File: logs/repgrowth.log, Log Level: trace, Loggers: org.apache.jackrabbit.oak.jcr.operations.writes


  • The log includes information regarding all writes and session details.  If you use this logger then make sure you have sufficient disk space.
  • Uninstall the log configuration package or remove the log configuration after a short period of having this enabled to avoid further disk space consumption.
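Once the logger has collected some data, a per-path count of the logged writes can show where the activity is concentrated. The following is a minimal sketch and not an official tool: it assumes that each log line contains the written repository path in square brackets, for example [/content/some/node], which may not match the exact format on your version. The function name count_written_paths is hypothetical.

```shell
#!/bin/sh
# count_written_paths [COUNT]
# Reads log lines on stdin and prints the COUNT most frequently written paths.
# ASSUMPTION: each written path appears in the line as "[/some/path]".
count_written_paths() {
    count=${1:-20}
    grep -o '\[/[^]]*\]' |            # keep only bracketed tokens that start with "/"
        sed 's/^\[//; s/\]$//' |      # strip the surrounding brackets
        sort | uniq -c | sort -rn |   # count identical paths, most frequent first
        head -n "$count"
}
```

For example: count_written_paths 20 < crx-quickstart/logs/repgrowth.log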

2. Run the disk usage report

You can also use the Disk Usage report at http://host:port/etc/reports/diskusage.html.  This report displays the disk space used by each repository path.  You can drill down into the report to view subtrees as well.

3. Capture thread dumps and perform profiling

After using the repgrowth.log to get some idea of what data is being written, we can get information about what code is writing that data by capturing thread dumps and running CPU profiling.

See these articles:

B. If AEM is stopped and/or disk space is almost out

If you had to stop AEM to avoid disk space growth then use the commands below to do some initial analysis.

On Linux, use the du command to list the directories under crx-quickstart (up to two levels deep) with their summarized sizes:

du -h --max-depth=2 crx-quickstart
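To make the largest directories stand out, the same du output can be sorted by size. This is a minimal sketch, assuming GNU du and sort as found on most Linux distributions; the function name largest_dirs and the default of 20 rows are illustrative choices.

```shell
#!/bin/sh
# largest_dirs [DIR] [COUNT]
# Prints the COUNT largest directories (up to two levels deep) under DIR,
# biggest first. Sizes are reported in kilobytes so sort can compare them.
largest_dirs() {
    dir=${1:-crx-quickstart}
    count=${2:-20}
    du -k --max-depth=2 "$dir" | sort -rn | head -n "$count"
}
```

For example: largest_dirs crx-quickstart 10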

Use the find and du commands to list files modified within the last 24 hours and get their sizes:

find crx-quickstart -type f -mtime -1 -exec du -hs {} \;
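When the growth is happening quickly, a window measured in minutes and a listing sorted by size can be more useful. This is a minimal sketch, assuming GNU find (for -mmin and -printf); the function name recent_files and the 60-minute default are illustrative.

```shell
#!/bin/sh
# recent_files [DIR] [MINUTES]
# Lists files under DIR modified within the last MINUTES minutes,
# as "size-in-bytes<TAB>path", largest first.
recent_files() {
    dir=${1:-crx-quickstart}
    minutes=${2:-60}
    find "$dir" -type f -mmin "-$minutes" -printf '%s\t%p\n' | sort -rn
}
```

For example: recent_files crx-quickstart 30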

To find large files in the datastore, you can combine find, du and file commands to find files over 100MB in the datastore directory and auto-identify their file type:

find crx-quickstart/repository/datastore -type f -size +100M -exec sh -c 'du -hs "$1"; file "$1"' sh {} \;

If you find that the growth is occurring in the segmentstore directory, then the command below can help give some clues as to what data is being written:

strings data_xxxxxx.tar | grep -E '^.?/' | sed 's/.$//;s/^.\//\//'
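The list of paths produced by this command can be long. Grouping the paths by their leading segments and counting them gives a quicker overview of which subtrees dominate. This is a minimal sketch; the function name top_paths and its defaults are illustrative, and it only assumes one path per line on standard input.

```shell
#!/bin/sh
# top_paths [DEPTH] [COUNT]
# Reads repository paths (one per line) on stdin, truncates each to its first
# DEPTH segments, and prints the COUNT most frequent prefixes with their counts.
top_paths() {
    depth=${1:-3}
    count=${2:-20}
    # Field 1 is the empty string before the leading "/", hence depth + 1.
    cut -d/ -f1-"$((depth + 1))" | sort | uniq -c | sort -rn | head -n "$count"
}
```

To use it, pipe the output of the strings command above into top_paths, for example with a depth of 3 and a count of 20.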

Applies to

AEM6.x / Oak
