Overview of Storage in AEM 6

Among the most important changes in AEM 6 are the innovations at the repository level.

Currently, there are two node storage implementations available in AEM 6: Tar storage and MongoDB storage.

Tar Storage

Running a freshly installed AEM instance with Tar Storage

By default, AEM 6 uses Tar storage to store nodes and binaries, with the default configuration options. To configure its storage settings manually, follow the procedure below:

  1. Download the AEM 6 quickstart jar and place it in a new folder.

  2. Unpack AEM by running:

    java -jar cq-quickstart-6.jar -unpack

  3. Create a folder named crx-quickstart\install in the installation directory.

  4. Create a file called org.apache.jackrabbit.oak.plugins.segment.SegmentNodeStoreService.cfg in the newly created folder.

  5. Edit the file and set the configuration options. The following options are available for Segment Node Store, which is the basis of AEM's Tar storage implementation:

    • repository.home: Path to the repository home, under which various repository-related data is stored. By default, segment files are stored under the crx-quickstart/segmentstore directory.
    • tarmk.size: Maximum size of a segment in MB. The default is 256MB.


  6. Start AEM.
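
Steps 3 to 5 above can be sketched as a short shell script for Unix-like systems. This is a minimal sketch, not an official procedure: AEM_DIR is a hypothetical variable pointing at the folder that contains the unpacked quickstart, the repository.home value is an assumed example path, and tarmk.size simply repeats the documented default.

```shell
#!/bin/sh
# Sketch: create the install folder and a Segment Node Store configuration file.
# AEM_DIR is a hypothetical variable; it defaults to the current directory.
AEM_DIR="${AEM_DIR:-.}"
INSTALL_DIR="$AEM_DIR/crx-quickstart/install"
SEGMENT_CFG="$INSTALL_DIR/org.apache.jackrabbit.oak.plugins.segment.SegmentNodeStoreService.cfg"

# Step 3: create the install folder.
mkdir -p "$INSTALL_DIR"

# Steps 4 and 5: create the file and set the options.
# repository.home is an assumed example value; tarmk.size is the documented default.
cat > "$SEGMENT_CFG" <<'EOF'
repository.home=crx-quickstart/repository
tarmk.size=256
EOF

# Show the result so it can be reviewed before AEM is started.
cat "$SEGMENT_CFG"
```

Adjust both values to your environment before starting AEM.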

Mongo Storage

Running a freshly installed AEM instance with Mongo Storage

AEM 6 can be configured to run with MongoDB storage by following the procedure below:

  1. Download the AEM 6 quickstart jar and place it into a new folder.

  2. Unpack AEM by running the following command:


    java -jar cq-quickstart-6.jar -unpack

  3. Make sure that MongoDB is installed and an instance of mongod is running. For more info, see Installing MongoDB.

  4. Create a folder named crx-quickstart\install in the installation directory.

  5. Configure the node store by creating a configuration file with the name of the configuration you want to use in the crx-quickstart\install directory.

    The Document Node Store (which is the basis for AEM's MongoDB storage implementation) uses a file called org.apache.jackrabbit.oak.plugins.document.DocumentNodeStoreService.cfg


  6. Edit the file and set your configuration options. The following options are available:


    • mongouri: The MongoURI required to connect to Mongo Database. The default is mongodb://localhost:27017
    • db: Name of the Mongo database. By default new AEM 6 installations use aem-author as the database name.
    • cache: The cache size in MB. This is distributed among various caches used in DocumentNodeStore. The default is 256.
    • changesSize: Size in MB of capped collection used in Mongo for caching the diff output. The default is 256.
    • customBlobStore: Boolean value indicating that a custom data store will be used. The default is false.
  7. Create a configuration file with the PID of the data store you wish to use and edit the file in order to set the configuration options. For more info, please see Configuring Node Stores and Data Stores.

  8. Start the AEM 6 jar with a MongoDB storage backend by running:

    java -jar cq-quickstart-6.jar -r crx3,crx3mongo

    Where -r is the backend runmode. In this example, it will start with MongoDB support.
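
Steps 4 to 6 of the procedure above can be sketched as a shell script for Unix-like systems. This is a minimal sketch under stated assumptions: AEM_DIR is a hypothetical variable pointing at the folder that contains the unpacked quickstart, and every option is written with the default value listed above, so the file changes nothing until you edit it.

```shell
#!/bin/sh
# Sketch: create the install folder and a Document Node Store configuration file.
# AEM_DIR is a hypothetical variable; it defaults to the current directory.
AEM_DIR="${AEM_DIR:-.}"
INSTALL_DIR="$AEM_DIR/crx-quickstart/install"
DOCUMENT_CFG="$INSTALL_DIR/org.apache.jackrabbit.oak.plugins.document.DocumentNodeStoreService.cfg"

mkdir -p "$INSTALL_DIR"

# All values below are the documented defaults; replace them as needed.
cat > "$DOCUMENT_CFG" <<'EOF'
mongouri=mongodb://localhost:27017
db=aem-author
cache=256
changesSize=256
customBlobStore=false
EOF

# Print a reminder of the start command instead of running it.
echo "Next: java -jar cq-quickstart-6.jar -r crx3,crx3mongo"
```

The script only prints the start command, so the configuration can be reviewed and a mongod instance confirmed to be running before AEM is started.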

Maintaining the Repository

Because data is never overwritten in a tar file, disk usage increases even when only existing data is updated. To compensate for the growing size of the repository, AEM employs a garbage collection mechanism called Revision Cleanup. The mechanism reclaims disk space by removing obsolete data from the repository, and has three phases: estimation, compaction, and cleanup. In the past, revision cleanup was often referred to as compaction.

There are two ways of performing revision cleanup:

  1. Offline Revision Cleanup
  2. Online Revision Cleanup

Offline revision cleanup is the recommended and supported way of performing revision cleanup. 

Choosing the Type of Revision Cleanup

For AEM 6.2 Publish instances

Offline revision cleanup is the recommended way of cleaning up revisions. It requires shutting down the instances, so plan to run offline revision cleanup outside business hours.

If downtimes are not possible, customers can contact Adobe Support to evaluate additional options:

  1. If there is more than one publish instance, one can be taken down for offline revision cleanup while avoiding replication from author. After a successful revision cleanup, the instance can be brought back into production, and clones of the cleaned instance can replace the remaining production instances.
  2. If the above is still not possible, online revision cleanup can be used under the terms and conditions of the program. This type of cleanup has restricted support in AEM 6.2.

For AEM 6.2 Author instances

Offline revision cleanup is the recommended way of cleanup for author instances as well. However, in rare cases where downtime is not possible, because maintenance windows were not foreseen and would have the same business impact as system outages, customers should contact Adobe Support to evaluate additional options. The additional options for performing cleanup on author instances are the same as the ones described above for publish instances.


For more information about the revision cleanup process, see the Frequently Asked Questions.

Performing Offline Revision Cleanup

Note:

Different versions of the Oak-run tool need to be used depending on the Oak version you use with your AEM installation. Please check the version requirements list below before using the tool:

  • For Oak versions 1.0.0 through 1.0.11 or 1.1.0 through 1.1.6, use Oak-run version 1.0.11
  • For Oak versions newer than the above, use the version of Oak-run that matches the Oak core of your AEM installation.
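
The version rule above can be sketched as a small shell function. This is an illustrative sketch only: oak_run_version_for is a hypothetical helper name, and it assumes the Oak version is a plain three-part number such as 1.0.5, with no suffix.

```shell
#!/bin/sh
# Sketch: map an Oak core version to the Oak-run version to use.
# oak_run_version_for is a hypothetical helper; it expects a plain x.y.z version.
oak_run_version_for() {
  oak_version="$1"
  major=$(echo "$oak_version" | cut -d. -f1)
  minor=$(echo "$oak_version" | cut -d. -f2)
  patch=$(echo "$oak_version" | cut -d. -f3)

  if [ "$major" -eq 1 ] && [ "$minor" -eq 0 ] && [ "$patch" -le 11 ]; then
    echo "1.0.11"            # Oak 1.0.0 through 1.0.11
  elif [ "$major" -eq 1 ] && [ "$minor" -eq 1 ] && [ "$patch" -le 6 ]; then
    echo "1.0.11"            # Oak 1.1.0 through 1.1.6
  else
    echo "$oak_version"      # newer versions: match the Oak core version
  fi
}

oak_run_version_for 1.0.5
oak_run_version_for 1.4.2
```

The two calls at the end print 1.0.11 and 1.4.2 respectively.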

Adobe provides a tool called Oak-run for performing revision cleanup. It can be downloaded at the following location:


The tool is a runnable jar that can be manually run to compact the repository. The process is called offline revision cleanup because the repository needs to be shut down in order to properly run the tool. Make sure to plan the cleanup in accordance with your maintenance window.

For tips on how to increase the performance of the cleanup process, see Increasing the Performance of Offline Revision Cleanup.



You can also clear old checkpoints before the maintenance takes place (steps 2 and 3 in the procedure below). This is recommended only for instances that have more than 100 checkpoints. 

The procedure to run the tool is:

  1. Always make sure you have a recent backup of the AEM instance.

    Shut down AEM.

  2. (Optional) Use the tool to find old checkpoints:

    java -jar oak-run.jar checkpoints install-folder/crx-quickstart/repository/segmentstore
  3. (Optional) Then, delete the unreferenced checkpoints:

    java -jar oak-run.jar checkpoints install-folder/crx-quickstart/repository/segmentstore rm-unreferenced
  4. Run the compaction and wait for it to complete:

    java -jar oak-run.jar compact install-folder/crx-quickstart/repository/segmentstore
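
The tool invocations in steps 2 to 4 can be chained in a wrapper script so that they always run in the same order. The sketch below is an assumption-laden illustration rather than a supported tool: OAK_RUN_JAR, SEGMENTSTORE and DRY_RUN are hypothetical variables, and by default the script only prints the commands. It does not take the backup or shut down AEM; step 1 still has to be done by hand.

```shell
#!/bin/sh
# Sketch: run (or just print) the offline revision cleanup commands in order.
# OAK_RUN_JAR, SEGMENTSTORE and DRY_RUN are hypothetical variables.
OAK_RUN_JAR="${OAK_RUN_JAR:-oak-run.jar}"
SEGMENTSTORE="${SEGMENTSTORE:-install-folder/crx-quickstart/repository/segmentstore}"
DRY_RUN="${DRY_RUN:-1}"

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Each command runs only if the previous one succeeded.
offline_cleanup() {
  run java -jar "$OAK_RUN_JAR" checkpoints "$SEGMENTSTORE" &&
  run java -jar "$OAK_RUN_JAR" checkpoints "$SEGMENTSTORE" rm-unreferenced &&
  run java -jar "$OAK_RUN_JAR" compact "$SEGMENTSTORE"
}

offline_cleanup
```

Set DRY_RUN=0 only after AEM has been shut down and a recent backup exists.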

Increasing the Performance of Offline Revision Cleanup

Since version 1.0.22, the oak-run tool has included several features that aim to increase the performance of the revision cleanup process and keep the maintenance window as short as possible.

The list includes several command line parameters, as described below:

  • -Dtar.memoryMapped. Use this to enable memory mapped operations for tar files, which greatly increases performance. You can set this to true or false. It is highly recommended that you enable this feature in order to speed up compaction.
  • -Dupdate.limit. Defines the threshold for the flush of a temporary transaction to disk. The default value is 5000000.
  • -Dcompress-interval. Number of compaction map entries to keep until compressing the current map. The default is 1000000. You should increase this value to an even higher number for faster throughput, if enough heap memory is available.
  • -Dcompaction-progress-log. The number of compacted nodes that will be logged. The default value is 1500000, which means that the first 1500000 compacted nodes will be logged during the operation. Use this in conjunction with the next parameter documented below.
  • -Dlogback.configurationFile. Use a configuration file for logging. You can use the below configuration file to enable the logging of the nodes that are being compacted:
  • -Dtar.PersistCompactionMap. Set this parameter to true to use disk space instead of heap memory for compaction map persistence. Requires oak-run version 1.4 or higher. For further details, see also question 6 in the FAQ section.


Note:

Memory mapped file operations do not work correctly on some versions of Windows. Make sure that you use the tool without the -Dtar.memoryMapped parameter on Windows platforms, otherwise the revision cleanup will fail.

An example of the parameters in use:

java -Dtar.memoryMapped=true -Dupdate.limit=5000000 -Dcompress-interval=10000000 -Dcompaction-progress-log=1500000 -Dlogback.configurationFile=logback.xml -Xmx8g -jar oak-run-*.jar compact <repository>


Use as much heap memory as possible for faster I/O operations. It is recommended you use at least eight gigabytes for most common deployments.
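
Because memory mapping has to be left out on Windows, a script that is shared across platforms can choose the options at run time. The following is a minimal sketch, not part of the tool: build_compact_opts is a hypothetical helper, and it assumes that a Windows shell reports a uname -s value starting with CYGWIN, MINGW or MSYS.

```shell
#!/bin/sh
# Sketch: assemble JVM options for oak-run, omitting memory mapping on Windows.
# build_compact_opts is a hypothetical helper; pass it the output of `uname -s`.
build_compact_opts() {
  os_name="$1"
  opts="-Dupdate.limit=5000000 -Dcompress-interval=10000000 -Xmx8g"
  case "$os_name" in
    CYGWIN*|MINGW*|MSYS*)
      ;;                                          # Windows shells: no memory mapping
    *)
      opts="-Dtar.memoryMapped=true $opts"        # other platforms: enable it
      ;;
  esac
  echo "$opts"
}

# Print the resulting command line for the current platform.
echo "java $(build_compact_opts "$(uname -s)") -jar oak-run.jar compact <repository>"
```

The heap size and interval values repeat the ones from the example above; tune them to the memory available on the host.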

Performing Online Revision Cleanup

Note:

Online Revision Cleanup is present in AEM 6.2 under restricted support. For more information on the conditions and terms of using the feature, please contact Adobe Customer Care.

Additional Methods of Triggering Revision Cleanup

Invoking Revision Garbage Collection via the JMX Console

  1. Open the JMX Console by going to http://localhost:4502/system/console/jmx

  2. Click the RevisionGarbageCollection MBean.

  3. In the next window, click startRevisionGC() and then Invoke to start the Revision Garbage Collection job.


Due to the mechanics of the garbage collection, the first run will actually add 256 MB of disk space. Subsequent runs will work as expected and start shrinking the repository size.

Performance Tuning and Maintenance Recommendations

Follow the recommendations below to keep repository maintenance as efficient as possible:

  1. Make sure you run Offline Revision Cleanup whenever possible during scheduled maintenance hours.
  2. If you are using an external data store, make sure you run Data Store Garbage Collection after revision cleanup has been completed.
  3. Follow the recommendations in this knowledgebase article for tips on improving the performance of your AEM instance.

Revision Cleanup Frequently Asked Questions

      1. When to use Offline Revision Cleanup as opposed to Online Revision Cleanup?

      2. How frequently should Offline Revision Cleanup be performed?

  • It depends on the repository growth rate. As a general rule of thumb, for average content repositories, it is recommended that you perform revision cleanup every 2 weeks for an author instance, and once per quarter for a publish instance.

      3. What are the factors that determine the duration of the Offline Revision Cleanup?

  • The repository size and the number of revisions that need to be cleaned up determine the duration of the cleanup.

      4. What's the worst that can happen if you do not perform revision cleanup?

  • The AEM instance will run out of disk space, which will cause outages in production. It is highly recommended that you follow the monitoring best practices as mentioned in Maintenance and Monitoring.

      5. What is the difference between a revision and a page version?

  • Oak revision: Oak organizes all the content in a large tree hierarchy that consists of nodes and properties. Each snapshot or revision of this content tree is immutable, and changes to the tree are expressed as a sequence of new revisions. Typically, each content modification triggers a new revision. See also http://jackrabbit.apache.org/dev/ngp.html.
  • Page Version: Versioning creates a "snapshot" of a page at a specific point in time. Typically, a new version is created when a page is activated. For more information, see Working with Page Versions.

      6. How to speed up the Offline Revision Cleanup task if it does not complete within 8 hours?