CRX2Oak is a tool designed to migrate data between different repositories.
It can be used to migrate data from older CQ versions based on Apache Jackrabbit 2 to Oak, and it can also be used to copy data between Oak repositories.
You can download the newest version of CRX2Oak from the public Adobe repository.
The list of changes and fixes for the newest version can be found in the CRX2Oak Release Notes.
For more information on Apache Oak and key concepts of AEM persistence, see Introduction to the AEM Platform.
The tool can be used for:
- Migrating from older CQ 5 versions to AEM 6
- Copying data between multiple Oak repositories
- Converting data between different Oak MicroKernel implementations
Migration of repositories that use external Blob Stores (commonly known as Data Stores) is supported in several combinations. One possible migration path is from a CRX2 repository that uses an external FileDataStore to an Oak repository that uses an S3DataStore.
The following diagram illustrates all the possible migration combinations supported by CRX2Oak:
CRX2Oak is called automatically during upgrades to AEM 6.1, but it can also be run manually when the migration requires more customization.
With the default settings, only the Node Store will be migrated and the new repository will reuse the old binary storage.
Custom Java logic can also be implemented using CommitHooks. Custom RepositoryInitializer classes can be implemented to initialize the repository with custom values.
CRX2Oak also supports memory mapped operations via the --mmap parameter. Memory mapping greatly improves performance and should be used whenever possible.
By default, the tool migrates the whole repository under the "/" path. However, you have complete control over which content should be migrated.
If any part of the content is not required on the new instance, you can use the --exclude-paths parameter to exclude it and shorten the upgrade procedure.
If data needs to be copied between two repositories and a content path differs between the two instances, you can specify that path in the --merge-paths parameter. CRX2Oak will then copy only the new nodes to the destination repository and keep the existing ones in place.
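As a rough illustration of path selection, the sketch below assembles a command line that excludes two subtrees and merges a third. The content paths and the destination directory are hypothetical placeholders, and placing the options before the source and destination arguments is an assumption. The script only prints the command so that it can be reviewed before it is run.

```shell
#!/bin/sh
# Sketch only: all paths below are hypothetical placeholders.
SRC="crx-quickstart/repository"                 # source repository
DST="/opt/aem-new/crx-quickstart/repository"    # assumed destination directory

# Comma-separated lists, as expected by the path parameters.
EXCLUDE="/content/old-site,/var/audit"
MERGE="/content/shared"

# Assumption: options are given before the source and destination arguments.
CMD="java -jar crx2oak.jar --exclude-paths=$EXCLUDE --merge-paths=$MERGE $SRC $DST"

# Dry run: print the command for review instead of executing it.
echo "$CMD"
```

Printing the command first makes it easier to check the path lists before starting a long-running migration.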
By default, AEM creates a version of each node or page that gets modified and stores it in the repository. The versions can then be used to restore the page to an earlier state.
However, these versions are never purged even if the original page is deleted. When dealing with repositories that have been in operation for a long time, the migration might need to process a lot of redundant data caused by orphaned versions.
The --copy-versions parameter is useful in these situations. It can be used to skip the version nodes when migrating or copying a repository.
You can also choose whether to copy orphaned versions by adding --copy-orphaned-versions=true.
Both parameters also accept a date in YYYY-MM-DD format, in case you want to copy only versions that are no older than a specific date.
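The version parameters can be combined on one command line. The following sketch passes a date to --copy-versions and disables the copying of orphaned versions. The date, the destination directory, and the position of the options before the source and destination arguments are assumptions made for illustration. The script checks the date format and then prints the command rather than running it.

```shell
#!/bin/sh
# Sketch only: the date and the destination path are hypothetical.
SRC="crx-quickstart/repository"
DST="/opt/aem-new/crx-quickstart/repository"
CUTOFF="2015-01-01"   # must be in YYYY-MM-DD format

# Reject values that do not look like YYYY-MM-DD before building the command.
case "$CUTOFF" in
  [0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]) ;;
  *) echo "CUTOFF must use the YYYY-MM-DD format" >&2; exit 1 ;;
esac

# Pass the date to --copy-versions and skip orphaned versions entirely.
CMD="java -jar crx2oak.jar --copy-versions=$CUTOFF --copy-orphaned-versions=false $SRC $DST"

# Dry run: print the command for review instead of executing it.
echo "$CMD"
```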
An open source version of CRX2Oak is available in the form of oak-upgrade. It supports all the features except for CRX2 support. See the Apache Documentation for more information.
- --cache: Cache size in MB (default is 256)
- --mmap: Enable memory mapped file access for Segment Store
- --src-password: Password for the source RDB database
- --src-user: User for the source RDB
- --user: User for the target RDB
- --password: Password for the target RDB
- --early-shutdown: Shuts down the source JCR2 repository after nodes are copied and before the commit hooks are applied
- --fail-on-error: Forces a failure of the migration if the nodes cannot be read from the source repository.
- --skip-on-error: If an error occurs during the migration, no exception will be thrown. Instead, a warning will be logged. The faulting node will be skipped and the primary type of the parent will be set to nt:unstructured to avoid constraint violations in the target repository.
- --ldap: Migrates LDAP users from a CQ 5.x instance to an Oak-based one. For this to work, the Identity Provider in the Oak configuration needs to be named ldap. For more information, see the LDAP documentation.
- --ldap-config: Use this in conjunction with the --ldap parameter for CQ 5.x repositories that used multiple LDAP servers for authentication. You can use it to point to the CQ 5.x ldap_login.conf or jaas.conf configuration files. The format is --ldap-config=path/to/ldap_login.conf.
- --copy-orphaned-versions: Controls whether orphaned versions are copied. Parameters: true, false, yyyy-mm-dd. Defaults to true.
- --copy-versions: Copies the version storage. Parameters: true, false, yyyy-mm-dd. Defaults to true.
- --include-paths: Comma-separated list of paths to include during copy
- --merge-paths: Comma-separated list of paths to merge during copy
- --exclude-paths: Comma-separated list of paths to exclude during copy.
- --src-datastore: The datastore directory to be used as a source FileDataStore
- --src-fileblobstore: The datastore directory to be used as a source FileBlobStore
- --src-s3datastore: The datastore directory to be used for the source S3DataStore
- --src-s3config: The configuration file for the source S3DataStore.
- --datastore: The datastore directory to be used as a target FileDataStore
- --fileblobstore: The datastore directory to be used as a target FileBlobStore
- --s3datastore: The datastore directory to be used for the target S3DataStore
- --s3config: The configuration file for the target S3DataStore.
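The source and target Data Store parameters can be combined to cover the migration path mentioned earlier, from a repository with an external FileDataStore to an Oak repository with an S3DataStore. The sketch below shows one possible combination. All directories and the S3 configuration file name are hypothetical, the option placement before the source and destination arguments is an assumption, and the command is printed rather than executed.

```shell
#!/bin/sh
# Sketch only: every directory and file name below is a hypothetical placeholder.
SRC="crx-quickstart/repository"                 # source repository
DST="/opt/aem-new/crx-quickstart/repository"    # assumed destination directory
SRC_DATASTORE="/data/cq5/datastore"             # existing FileDataStore directory
S3_DATASTORE="/data/aem6/datastore"             # directory for the target S3DataStore
S3_CONFIG="/data/aem6/s3.properties"            # configuration file for the target S3DataStore

# Source FileDataStore on one side, target S3DataStore and its configuration on the other.
CMD="java -jar crx2oak.jar --src-datastore=$SRC_DATASTORE --s3datastore=$S3_DATASTORE --s3config=$S3_CONFIG $SRC $DST"

# Dry run: print the command for review instead of executing it.
echo "$CMD"
```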
You can also enable debug information for the migration process in order to troubleshoot any issues that might appear during the process. You can do this by:
When migrating to a MongoDB replica set, make sure you set the WriteConcern parameter to 2 on all connections to the Mongo databases.
You can do this by adding the w=2 parameter at the end of the connection string, like this:
java -Xmx4092m -XX:MaxPermSize=1024m -jar crx2oak.jar crx-quickstart/repository/ "mongodb://localhost:27017/aem-author?replicaset=replica1&w=2"
For more information, see the MongoDB Connection String documentation on Write Concerns.