Adobe recommends disabling full-text search for binary files through the Tika index configuration. This recommendation is part of the Asset Performance Tuning Helpx article.
Some common MIME types to consider: video/mp4, application/pdf, application/zip.
Method 1
1. Install the package provided.
2. In CRXDE Lite, browse to the locations below:
3. Add the MIME type of the files to exclude from full-text search:
4. Click Save All.
5. In CRXDE Lite, set the Boolean property refresh=true on these nodes and save:
6. Wait for the changes to take effect, then test by searching for assets of the MIME type you added.
Method 2
1. In the AEM Web Console, search for 'oak-lucene' and note the bundle number.
2. Shut down the AEM instance.
3. Browse to the /crx-quickstart/launchpad/felix/bundlexxx directory, where xxx is the bundle number noted in step 1.
4. cd to the subdirectory with versionX.Y in its name (e.g. felix/bundle102/version0.2):
cd version*
5. Extract the tika-config.xml file from the bundle.jar file:
jar -xvf bundle.jar org/apache/jackrabbit/oak/plugins/index/lucene/tika-config.xml
6. Edit the tika-config.xml file:
vi org/apache/jackrabbit/oak/plugins/index/lucene/tika-config.xml
For example, add the MIME type that needs to be disabled:
7. Update bundle.jar with the modified file:
jar -uvf bundle.jar org/apache/jackrabbit/oak/plugins/index/lucene/tika-config.xml
8. Restart the AEM instance and test by searching for assets of the MIME type you added.
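
Step 6 of Method 2 edits tika-config.xml by hand. The following is a minimal sketch of how that edit could be scripted. It assumes that the extracted file contains a parser element for org.apache.tika.parser.EmptyParser whose <mime> children list the types to skip; check this against your own copy of the file before relying on it. The function name add_excluded_mime is a hypothetical helper, not part of AEM, Oak or Tika.

```shell
#!/bin/sh
# Sketch only: add a <mime> entry under the EmptyParser element of a
# tika-config.xml file.
# ASSUMPTION: the file contains the line
#   <parser class="org.apache.tika.parser.EmptyParser">
# add_excluded_mime is a hypothetical helper name used for illustration.
add_excluded_mime() {
    config_file=$1
    mime_type=$2

    # Fail early if the assumed EmptyParser element is not present.
    if ! grep -qF 'org.apache.tika.parser.EmptyParser' "$config_file"; then
        echo "EmptyParser element not found in $config_file" >&2
        return 1
    fi

    # Do nothing if the MIME type is already listed in the file.
    if grep -qF "<mime>${mime_type}</mime>" "$config_file"; then
        return 0
    fi

    tmp_file=$(mktemp) || return 1
    # Copy every line; directly after the first EmptyParser opening tag,
    # emit the new <mime> entry.
    awk -v mime="$mime_type" '
        { print }
        /<parser class="org\.apache\.tika\.parser\.EmptyParser">/ && !added {
            print "      <mime>" mime "</mime>"
            added = 1
        }
    ' "$config_file" > "$tmp_file" || { rm -f "$tmp_file"; return 1; }

    # Write back in place so the original file permissions are kept.
    cat "$tmp_file" > "$config_file" && rm -f "$tmp_file"
}
```

Run from the versionX.Y directory after step 5, a call such as add_excluded_mime org/apache/jackrabbit/oak/plugins/index/lucene/tika-config.xml application/pdf would stand in for the manual edit, after which step 7 updates bundle.jar as before. Keep a backup of bundle.jar before modifying it.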