Tar Persistence Manager Optimization

Question / Problem

When and how should the Tar Persistence Manager (Tar PM) be optimized?

Answer / Resolution

Optimizing Tar Files

Because data is never overwritten in a tar file, disk usage increases even when existing data is only updated. When optimizing, the Tar Persistence Manager copies data that is still in use from old tar files into new tar files, and then deletes the old tar files, which contain only old or redundant data.

If optimization is stopped before it has finished, the next run continues where the previous one left off (it does not start from the beginning).

If there is only one file, optimization will have no effect (no new file is created).

The disk space required to run the optimization is at most the size of one data tar file, which is 256 MB by default (for CRX 2.0; this setting can be changed using the parameter maxFileSize). This applies to both the shared directory and the local directory, so the total amount of temporary disk space used is at most 512 MB by default.
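The arithmetic above can be written out as a small Python sketch. The function name and its arguments are illustrative only and are not part of CRX; the calculation works in MB regardless of how maxFileSize is expressed in the configuration.

```python
def temp_space_mb(max_file_size_mb=256, directories=2):
    """Upper bound (in MB) on temporary disk space for one optimization run.

    max_file_size_mb: size limit of one data tar file (256 MB by default).
    directories: number of directories affected (shared and local).
    """
    return max_file_size_mb * directories


if __name__ == "__main__":
    # With the defaults described above: 256 MB * 2 directories.
    print(temp_space_mb())
```

If maxFileSize has been changed, passing the new limit (in MB) gives the corresponding upper bound.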

Automatic Scheduled Optimization

CRX automatically runs Tar PM optimization between 2 a.m. and 5 a.m. If the automatic optimization has not finished by 5 a.m., it stops automatically and resumes the next night from where it stopped (it does not start from the beginning).

To change the time when automatic optimization runs, use the Tar PM configuration option "autoOptimizeAt". Setting this value to "02:00" will trigger an optimization every day at two in the morning. To change the default time, edit repository/your_workspace/workspace.xml. In the following example, optimization starts at 1 a.m. every day and runs until 4 a.m. at the latest:

<PersistenceManager class="com.day.crx.persistence.tar.TarPersistenceManager">
    <param name="autoOptimizeAt" value="01:00-04:00" />
</PersistenceManager>
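If the setting has to be changed on several workspaces, the edit could be scripted. The following is a minimal Python sketch, not a CRX tool: the helper name set_auto_optimize_at is hypothetical, and it assumes that workspace.xml is well-formed XML containing a PersistenceManager element like the fragment shown above. The standard-library parser used here does not preserve comments or a DOCTYPE declaration, so keep a backup of the original file and review the result before using it.

```python
import xml.etree.ElementTree as ET

# Class name taken from the configuration fragment above.
PM_CLASS = "com.day.crx.persistence.tar.TarPersistenceManager"


def set_auto_optimize_at(workspace_xml, value):
    """Return workspace_xml (a string) with autoOptimizeAt set to value."""
    root = ET.fromstring(workspace_xml)
    # iter() also visits the root, so a bare fragment works as well.
    for pm in root.iter("PersistenceManager"):
        if pm.get("class") != PM_CLASS:
            continue
        for param in pm.findall("param"):
            if param.get("name") == "autoOptimizeAt":
                param.set("value", value)
                break
        else:
            # The parameter is not present yet: add it.
            ET.SubElement(pm, "param", {"name": "autoOptimizeAt", "value": value})
    return ET.tostring(root, encoding="unicode")
```

For example, set_auto_optimize_at(text, "01:00-04:00") would produce the setting shown in the fragment above, where text holds the contents of workspace.xml.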

Disabling Automatic Scheduled Optimization

To disable automatic optimization, set autoOptimizeAt to "" (an empty string). This works for CRX 2.1 and newer. For CRX versions up to 2.0, set it to "-0" instead (which actually means 'stop optimization at midnight').

Manually Optimizing Tar Files using the CRX Explorer

To optimize tar files using the CRX Console:

  • In the CRX Console, log in as administrator.
  • Click Repository Configuration.
  • Select Tar Persistence Manager Optimization and click Start Optimization.
  • To stop optimization while it is running, click Stop Optimization.

Note: In a clustered environment, this only works on the cluster node that is currently running as the master. Starting optimization on a slave cluster node has no effect.

Manually Optimizing Tar Files at Runtime

You can start optimizing the tar files manually at runtime by placing a file named optimize.tar in the folder that contains the tar files. This file can be empty.

When optimization starts, this file is automatically renamed to optimizeNow.tar. To stop the optimization, delete optimizeNow.tar. The file is deleted automatically when the optimization run ends.
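The file-based procedure above could be scripted, for example from a maintenance job. The following Python sketch only creates, checks for, and deletes the two files named in the text; the function names are hypothetical, and the directory passed in must be the folder that actually contains the tar files.

```python
from pathlib import Path

# File names taken from the description above.
TRIGGER_NAME = "optimize.tar"
RUNNING_NAME = "optimizeNow.tar"


def request_optimization(tar_dir):
    """Create an empty optimize.tar in tar_dir to request an optimization run."""
    trigger = Path(tar_dir) / TRIGGER_NAME
    trigger.touch()  # the file may be empty
    return trigger


def is_optimizing(tar_dir):
    """Return True while optimizeNow.tar exists in tar_dir."""
    return (Path(tar_dir) / RUNNING_NAME).exists()


def stop_optimization(tar_dir):
    """Delete optimizeNow.tar to stop a running optimization.

    Returns True if the file was found and removed, False otherwise.
    """
    try:
        (Path(tar_dir) / RUNNING_NAME).unlink()
        return True
    except FileNotFoundError:
        return False
```

A call such as request_optimization("path/to/tar/files") would request a run; the path shown is a placeholder. Checking is_optimizing afterwards only reports whether optimizeNow.tar is present, not how far the run has progressed.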

Affected Versions

CRX 1.4.1 and 1.4.2, CRX 2.x