Environment

AEM 6.1, 6.2, 6.3, 6.4

How to use a custom Tika configuration to disable full-text search based on a file's MIME type in AEM

Adobe recommends disabling full-text search for binary files through the Tika configuration of the Lucene indexes. This recommendation is part of the Asset Performance Tuning Helpx article.

Some common MIME types to consider: video/mp4, application/pdf, application/zip.

Method 1

1. Install the package provided. 

2. In CRXDE Lite, browse to the following locations:

/oak:index/lucene/tika/config.xml
/oak:index/damAssetLucene/tika/config.xml

3. Add the MIME type for which full-text search should be disabled:

<mime>application/zip</mime>

4. Click Save All.

5. In CRXDE Lite, set the Boolean property refresh=true on the following nodes, then save:
/oak:index/lucene
/oak:index/damAssetLucene

6. Wait for the changes to take effect, then test by searching for assets of the MIME type you added.
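
As an illustration of step 3, the following Python sketch adds a MIME type to a local copy of config.xml. It assumes the file follows the usual Tika layout, in which the types to skip are listed as <mime> elements under a parser entry of class org.apache.tika.parser.EmptyParser. The sample XML and the function name add_mime are hypothetical and exist only for demonstration, so check the config.xml installed by the package before relying on this layout.

```python
import xml.etree.ElementTree as ET

EMPTY_PARSER = "org.apache.tika.parser.EmptyParser"

# Assumed, simplified layout of a Tika config.xml; the real file may differ.
SAMPLE_CONFIG = """<properties>
  <detectors>
    <detector class="org.apache.tika.detect.TypeDetector"/>
  </detectors>
  <parsers>
    <parser class="org.apache.tika.parser.DefaultParser"/>
    <parser class="org.apache.tika.parser.EmptyParser">
      <mime>application/x-archive</mime>
    </parser>
  </parsers>
</properties>"""


def add_mime(config_xml: str, mime_type: str) -> str:
    """Return config_xml with mime_type listed under the EmptyParser entry."""
    root = ET.fromstring(config_xml)

    # Create the <parsers> container if the file does not have one yet.
    parsers = root.find("parsers")
    if parsers is None:
        parsers = ET.SubElement(root, "parsers")

    # Locate the EmptyParser entry, creating it when it is missing.
    empty = None
    for parser in parsers.findall("parser"):
        if parser.get("class") == EMPTY_PARSER:
            empty = parser
            break
    if empty is None:
        empty = ET.SubElement(parsers, "parser", {"class": EMPTY_PARSER})

    # Skip the insert when the type is already listed.
    existing = {m.text.strip() for m in empty.findall("mime") if m.text}
    if mime_type not in existing:
        ET.SubElement(empty, "mime").text = mime_type

    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(add_mime(SAMPLE_CONFIG, "application/zip"))
```

ElementTree drops XML comments and the XML declaration when it parses the file, so for a production config.xml it may be simpler to paste the <mime> line by hand in CRXDE Lite as described in step 3.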

Method 2

1. In the AEM Web Console (/system/console/bundles), search for 'oak-lucene' and note the bundle number.

2. Shut down the AEM instance.

3. Browse to the /crx-quickstart/launchpad/felix/bundlexxx directory, where xxx is the bundle number noted in step 1.

4. Extract the contents of the bundle.jar file to a temporary location.

Example:  /bundle102/version0.2/bundle/org/apache/jackrabbit/oak/plugins/index/lucene

5. Edit the tika-config.xml file.

Add the MIME type for which full-text search should be disabled:

<mime>application/zip</mime>

6. Save the changes back into bundle.jar.

7. Restart the AEM instance and test by searching for assets of the MIME type you added.
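
Steps 4 to 6 of Method 2 can also be scripted. The Python sketch below is one possible approach, not an Adobe-provided tool. It copies bundle.jar to a backup, then rewrites the jar with one entry passed through an edit function. The function name rewrite_jar_entry and the .bak suffix are hypothetical; the entry path is derived from the example directory in step 4. Run it only while AEM is shut down.

```python
import shutil
import zipfile
from pathlib import Path
from typing import Callable

# Entry path inside bundle.jar, derived from the example directory in step 4.
TIKA_ENTRY = "org/apache/jackrabbit/oak/plugins/index/lucene/tika-config.xml"


def rewrite_jar_entry(
    jar_path: str,
    edit: Callable[[str], str],
    entry: str = TIKA_ENTRY,
) -> Path:
    """Rewrite jar_path with `entry` passed through `edit`; return the backup path."""
    jar = Path(jar_path)
    backup = jar.with_name(jar.name + ".bak")  # hypothetical backup naming
    shutil.copy2(jar, backup)

    found = False
    # zipfile cannot change an entry in place, so every entry is copied from
    # the backup into a fresh archive, in the original order.
    with zipfile.ZipFile(backup) as src, zipfile.ZipFile(jar, "w") as dst:
        for item in src.infolist():
            data = src.read(item.filename)
            if item.filename == entry:
                data = edit(data.decode("utf-8")).encode("utf-8")
                found = True
            dst.writestr(item, data)

    if not found:
        # Restore the original jar so a wrong path leaves nothing changed.
        shutil.copy2(backup, jar)
        raise KeyError(f"{entry} not found in {jar}")
    return backup
```

A call might look like rewrite_jar_entry("bundle.jar", my_edit), where my_edit is a function you supply that takes the XML text of tika-config.xml and returns it with the extra <mime> line added. The original jar remains available as bundle.jar.bak in case the bundle fails to start after the restart.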

Download

