How to migrate the DataStore

Question

For various reasons (e.g. infrastructure consolidation), it can make sense to migrate existing binary data from one DataStore [1] to another. How can such a migration be performed?

Answer, Resolution

Currently, no tool is available that performs such a DataStore migration. However, it can be done with the help of a temporary CRX instance mounted via WebDAV [2]. The following example shows how to migrate binaries from a FileDataStore to a DbDataStore.

Please note: test and validate this procedure on a test instance before applying it to a production instance!

Prepare temporary CRX instance

The first step is to set up a second, plain temporary CRX instance with a database-backed DataStore (DbDataStore) configured:

  • copy the crx-xxx-quickstart.jar and license.properties to a temporary location, e.g. /tmp/crx_dbdatastore
  • run the unpack command: java -jar crx-xxx-quickstart.jar -unpack
  • once finished, go into the unpacked directory crx-quickstart/server/webapps
  • unzip the CRX-webapp into directory crx-explorer_crx: unzip -d crx-explorer_crx crx-explorer_crx.war
  • delete crx-explorer_crx.war
  • edit repository.xml

The DataStore configuration needs to be adapted to use the DbDataStore implementation, e.g.:

<DataStore class="org.apache.jackrabbit.core.data.db.DbDataStore">
    <param name="url" value="jdbc:h2:~/test" />
    <param name="user" value="sa" />
    <param name="password" value="sa" />
    <param name="databaseType" value="h2" />
    <param name="driver" value="org.h2.Driver"/>
    <param name="maxConnections" value="3"/>
    <param name="copyWhenReading" value="true"/>
    <param name="tablePrefix" value=""/>
    <param name="schemaObjectPrefix" value=""/>
    <param name="minRecordLength" value="4096"/>
</DataStore>

Once this is done:

  • copy the library containing the database-specific driver to crx-explorer_crx/WEB-INF/lib
  • rename directory crx-explorer_crx to crx-explorer_crx.war
  • start this CRX instance

During installation, CRX will create a number of binary files in the repository which should be stored in the configured database. Please verify that this is working correctly.

Mount temporary CRX instance with WebDAV

The next step is to mount the temporary CRX instance via WebDAV [2] on the server hosting the current CRX instance. In this example, the WebDAV mount-point /tmp/crxwebdav is used.

Copy files from FS to temporary instance via WebDAV

By default, the FileDataStore persists binary files in <crx_home>/crx-quickstart/repository/shared/repository/datastore, using the hash of each file's content as its file name. Copying these files to the temporary CRX instance with the DbDataStore configured creates the same binaries with the same hashes in the database, so they can be reused in a later step.
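
The hash-based naming can be spot-checked before copying. The following is a minimal sketch, assuming that the hash is the SHA-1 digest in hexadecimal (as in Jackrabbit's FileDataStore; this article does not name the algorithm) and that GNU sha1sum is available. The function name check_datastore_hashes is hypothetical.

```shell
#!/bin/sh
# Sketch only: print every file under a datastore directory whose name
# differs from the SHA-1 digest of its content.
# Assumptions: SHA-1 hex file names, GNU coreutils sha1sum on the PATH.
check_datastore_hashes() {
    find "$1" -type f | while IFS= read -r f; do
        name=$(basename "$f")
        # Read from stdin so sha1sum prints "<digest>  -" without the path.
        sum=$(sha1sum < "$f" | cut -d ' ' -f 1)
        [ "$name" = "$sum" ] || echo "MISMATCH $f"
    done
}

# Example call (path taken from the text above, relative to <crx_home>):
# check_datastore_hashes crx-quickstart/repository/shared/repository/datastore
```

No output means every file name matches its content digest. Any temporary files in the directory would also be reported as mismatches.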
Copy all files and folders recursively to the WebDAV mount-point:

cd <crx_home>
cp -Rv crx-quickstart/repository/shared/repository/datastore /tmp/crxwebdav

Depending on the amount of binary content, this can take some time. Once the recursive copy has finished, unmount the WebDAV share and stop the temporary CRX instance.
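
Before unmounting, it may be worth checking that every file arrived. The following is a minimal sketch, assuming that the copy kept the directory layout so that the files end up under /tmp/crxwebdav/datastore. The function name missing_in_copy is hypothetical.

```shell
#!/bin/sh
# Sketch only: print the relative path of every file that exists under
# the source directory but not under the destination directory.
missing_in_copy() {
    src=$1
    dst=$2
    # List source files relative to $src, then test each one in $dst.
    (cd "$src" && find . -type f) | while IFS= read -r rel; do
        [ -f "$dst/$rel" ] || echo "$rel"
    done
}

# Example call, run from <crx_home> (destination path is an assumption):
# missing_in_copy crx-quickstart/repository/shared/repository/datastore /tmp/crxwebdav/datastore
```

An empty result means no source file is missing on the WebDAV side. The check compares names only, not content.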

Reconfigure DataStore of CRX

The next step is to modify the DataStore configuration (repository.xml) of the current CRX instance so that it uses exactly the same DbDataStore configuration as the temporary instance. Once this is done, start the CRX instance and verify that binaries are available as before.
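
One way to confirm that both instances use the same configuration is to extract the DataStore element from each repository.xml and compare the results. The following is a minimal sketch, assuming the opening and closing DataStore tags each sit on their own line, as in the example configuration. The function name datastore_config and the paths in the example call are hypothetical.

```shell
#!/bin/sh
# Sketch only: print the <DataStore ...> ... </DataStore> block of a
# repository.xml. Assumes the opening and closing tags are on separate lines.
datastore_config() {
    sed -n '/<DataStore/,/<\/DataStore>/p' "$1"
}

# Example comparison (both paths are placeholders):
# datastore_config /path/to/current/repository.xml > current.txt
# datastore_config /path/to/temporary/repository.xml > temporary.txt
# diff current.txt temporary.txt && echo "DataStore configuration is identical"
```

The comparison is purely textual, so differences in whitespace or attribute order are reported as differences.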

Cleanup

Once the above migration procedure has been verified, the following can be deleted:

  • FileDataStore: crx-quickstart/repository/shared/repository/datastore
  • temporary CRX instance

Applies

CRX 1.3.x, CRX 1.4.x, CRX 2.x

[1] DataStore
[2] How to access the repository via WebDAV