How to check and repair search index inconsistencies

Question / Problem

Occasionally the Lucene index of a repository gets into an inconsistent state, especially if the repository is shut down abnormally or the corresponding Java process is killed unexpectedly. Depending on the severity of the inconsistency, the repository might no longer start up correctly.

Answer / Resolution

The LuceneHandler class of CRX extends the SearchIndex [1] class of Jackrabbit, which provides the following configuration parameters to check and repair index inconsistencies:

  • enableConsistencyCheck (default value: false): If true, a consistency check can be performed on startup; whether it runs is controlled by forceConsistencyCheck. If false, no consistency check is performed on startup.
  • forceConsistencyCheck (default value: false): If true, a consistency check runs on every startup. If false, a consistency check is only performed when the search index detects a prior forced shutdown.
  • autoRepair (default value: true): If true, errors detected by a consistency check are repaired automatically. If false, errors are only written to the log.
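
Read together, the three parameters describe a simple startup decision. The following Python sketch restates the parameter descriptions above as a small function. It is an illustrative reading of those descriptions, not Jackrabbit code; the function name and the prior_forced_shutdown flag are hypothetical and used only for this example.

```python
def startup_check_plan(enable_consistency_check=False,
                       force_consistency_check=False,
                       auto_repair=True,
                       prior_forced_shutdown=False):
    """Return (check_runs, repairs_enabled) for one repository startup.

    The defaults mirror the documented default values. This is an
    illustrative reading of the parameter descriptions, not the actual
    SearchIndex implementation.
    """
    # No check at all unless enableConsistencyCheck is true.
    if not enable_consistency_check:
        return (False, False)
    # With the check enabled, it runs on every startup when forced,
    # otherwise only after a detected forced shutdown.
    check_runs = force_consistency_check or prior_forced_shutdown
    # Detected errors are repaired only when autoRepair is true;
    # otherwise they are only written to the log.
    repairs_enabled = check_runs and auto_repair
    return (check_runs, repairs_enabled)
```

With all defaults the function reports that no check runs, which matches the default configuration described above.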

To trigger an index consistency check, edit the workspace.xml configuration file of the workspace in question, e.g. <crx_home>/crx-quickstart/repository/workspaces/crx.default/workspace.xml:

workspace.xml

<SearchIndex class="com.day.crx.query.lucene.LuceneHandler">
    ...
    <param name="enableConsistencyCheck" value="true" />
    <param name="forceConsistencyCheck" value="true" />
    <param name="autoRepair" value="true" />
</SearchIndex>

Once this configuration is in place, a consistency check of the index runs on startup. Depending on the volume of the repository and the size of the index, it can take some time to finish. The progress of the check can be monitored in the error.log of CRX:

*INFO * SearchIndex: Running consistency check...
*INFO * ConsistencyCheck: progress: 10%
*INFO * ConsistencyCheck: progress: 20%
...
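
To follow the check without reading the log by hand, the progress lines can be extracted with a small script. This is a minimal Python sketch that assumes the progress lines look like the excerpt above; the function name is hypothetical. The pattern searches anywhere in the line, so extra leading fields such as timestamps would not prevent a match.

```python
import re

# Matches the "ConsistencyCheck: progress: NN%" fragment shown in the excerpt.
PROGRESS_RE = re.compile(r"ConsistencyCheck: progress: (\d+)%")

def latest_progress(log_lines):
    """Return the last reported progress percentage, or None if none was seen."""
    latest = None
    for line in log_lines:
        match = PROGRESS_RE.search(line)
        if match:
            latest = int(match.group(1))
    return latest
```

The function accepts any iterable of lines, so an open error.log file object can be passed directly.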

Remember to remove the above configuration afterwards; otherwise the consistency check is triggered on every restart of the repository, which slows down startup. Alternatively, setting enableConsistencyCheck to false disables the check altogether.
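
One way to confirm that the parameters are no longer set is to read them back from workspace.xml. The sketch below uses Python's standard xml.etree.ElementTree module and assumes the file content is well-formed XML containing a SearchIndex element as shown above; the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

CHECK_PARAMS = ("enableConsistencyCheck", "forceConsistencyCheck", "autoRepair")

def consistency_params(workspace_xml_text):
    """Return the consistency-related params set on any SearchIndex element."""
    root = ET.fromstring(workspace_xml_text)
    found = {}
    # iter() also matches the root element itself, so a bare SearchIndex
    # fragment and a full workspace.xml are both handled.
    for index in root.iter("SearchIndex"):
        for param in index.findall("param"):
            name = param.get("name")
            if name in CHECK_PARAMS:
                found[name] = param.get("value")
    return found
```

An empty result means none of the three parameters is set explicitly, so the default values apply.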

Completely rebuild the index

To re-create the index completely, stop the repository, delete the index directories, and start the repository again. Rebuilding the index may take a few hours. The index directories are stored separately for each cluster node, in the following locations:

  • crx-quickstart/repository/repository/index
  • crx-quickstart/repository/workspaces/crx.default/index
  • crx-quickstart/repository/workspaces/crx.system/index
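
The delete step can be scripted. The following Python sketch assumes the repository is already stopped and that the three relative paths listed above sit under an installation directory supplied by the caller; the function name and the dry_run switch are hypothetical. It defaults to a dry run, so nothing is deleted unless explicitly requested.

```python
import shutil
from pathlib import Path

# Relative index locations as listed above.
INDEX_DIRS = (
    "crx-quickstart/repository/repository/index",
    "crx-quickstart/repository/workspaces/crx.default/index",
    "crx-quickstart/repository/workspaces/crx.system/index",
)

def remove_index_dirs(install_dir, dry_run=True):
    """Delete the index directories under install_dir; return the paths affected.

    Run only while the repository is stopped. With dry_run=True the
    directories are reported but left untouched.
    """
    affected = []
    for rel in INDEX_DIRS:
        path = Path(install_dir) / rel
        if path.is_dir():
            affected.append(path)
            if not dry_run:
                shutil.rmtree(path)
    return affected
```

Calling the function first with the default dry run shows which directories would be removed; a second call with dry_run=False performs the deletion, after which the repository can be started to rebuild the index.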

Applies

CRX >= 1.4.x
