Issue: "Error occurred while obtaining InputStream for blobId"

You have a FileDataStore configured in your Adobe Experience Manager 6.x / Oak 1.x system, and in the error.log you see "Error occurred while obtaining InputStream for blobId."

10.09.2015 10:38:04.220 *ERROR* [pool-9-thread-3] org.apache.sling.commons.scheduler.impl.QuartzScheduler Exception during job execution of org.apache.jackrabbit.oak.plugins.index.AsyncIndexUpdate@7fe7e8fa : Error occurred while obtaining InputStream for blobId [add1dd8fc5093b27b1fae1b753cb48b24ef3231f#101983]
java.lang.RuntimeException: Error occurred while obtaining InputStream for blobId [add1dd8fc5093b27b1fae1b753cb48b24ef3231f#101983]

Solutions

This error means that files are probably missing from your Adobe Experience Manager datastore directory.  Datastore files can be lost through a failure of Oak Blob Garbage Collection, a disk space outage, or disk or network share instability.  They can also be lost when a user erroneously deletes files from the server.

To recover the missing files, follow the steps below.

1. Install the latest Oak hot fix

If you are using Adobe Experience Manager 6.0 or 6.1 with Oak 1.0.19, Oak 1.2.4, or an earlier version, download and install the latest Oak hot fix after you have repaired your datastore by following the steps below.

You can find the latest hot fixes on Adobe Package Share.

2. Run a DataStore Consistency Check

The only way to get a complete list of all missing files in the datastore is to run a DataStore Consistency Check.  If your AEM instance is able to start up, then follow these steps on the running instance:

  1. Go to CRX Explorer and log in as admin.
    - http://<host>:<port>/crx (CQ5.4/CRX2.2 and earlier)
    - http://<host>:<port>/crx/explorer/index.jsp (CQ5.5/CRX2.3 and later versions)
  2. Click Repository Configuration.
  3. Click Check Repository.
  4. Select Data Store Consistency.
  5. Click Run.
  6. Keep your browser open for the duration of the process. The process outputs messages to the screen only when an error is found, and it does not write any messages to the log file.  Each error message reports the path of the node that references the datastore record and the record id of the missing file.
  7. When the process completes, copy the output to a text file consistency_check_output.txt and go on to the next step.

3. Create a list of paths of the missing files

The next step in recovering the missing files is to compile a complete list of the paths of the files that are missing.

1. Search the consistency check output for all occurrences of the error "Record not found."

a. In Linux or Unix: Use this command to output the list of missing files to a file missing_ds_files.txt:

grep "blobId" consistency_check_output.txt | grep -Eo "[0-9a-f]{40}" | awk '{ print substr($1, 1,2) "/" substr($1, 3,2) "/" substr($1, 5,2) "/" $1 }' | sort -u > missing_ds_files.txt

Note: If your Adobe Experience Manager instance does not start up because of the "blobId" errors, search the log files under crx-quickstart/logs for all occurrences of the error "Error occurred while obtaining InputStream for blobId" instead:

grep "Error occurred while obtaining InputStream for blobId" error.log* | grep -Eo "[0-9a-f]{40}" | awk '{ print substr($1, 1,2) "/" substr($1, 3,2) "/" substr($1, 5,2) "/" $1 }' | sort -u > missing_ds_files.txt

If the command worked properly, then the contents of missing_ds_files.txt would look similar to the following:

12/92/04/129204a6dd0ce2cd5ca19c721b6f52ee2b3630e2
9f/d8/38/9fd8386d20cf55e7e0024e18d0c7d4e8400454ee
7a/13/15/7a1315788f45dafd6630454f04183601682a9f80
28/37/d2/2837d24aed3ff223cd40e90222226c4ef2e2a0c6

b. In Windows:  Use a text editor such as Textpad or Notepad++ to find all occurrences of "Record not found."  Then extract the filenames by using a macro, by writing a script, or by manually copying the record ids and constructing the filenames in a new text file.

DataStore file paths are constructed from the record name in this format:
{first two chars of record id}/{second two chars of record id}/{third two chars of record id}/{record id}

For example, in this error:

java.lang.RuntimeException: Error occurred while obtaining InputStream for blobId [add1dd8fc5093b27b1fae1b753cb48b24ef3231f#101983]

the record id is add1dd8fc5093b27b1fae1b753cb48b24ef3231f and the file path is ad/d1/dd/add1dd8fc5093b27b1fae1b753cb48b24ef3231f
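The path rule above can be written as a small shell helper. This is a minimal sketch, not part of any Adobe tooling: the function names blob_path and malformed_paths are hypothetical, and the code assumes record ids are 40 lowercase hex characters, as in the examples in this article. It also drops anything after a "#" in the blobId, as the example above does.

```shell
#!/bin/sh
# Sketch: map a record id (optionally written as "id#suffix", as in the
# log message) to its relative datastore path.
blob_path() {
    id=${1%%#*}                      # drop a trailing "#..." if present
    # Accept only a 40-character lowercase hex id, as in the examples.
    if ! printf '%s\n' "$id" | grep -Eq '^[0-9a-f]{40}$'; then
        echo "not a 40-character hex record id: $1" >&2
        return 1
    fi
    # awk's substr() is 1-based, so the first pair starts at position 1.
    printf '%s\n' "$id" |
        awk '{ print substr($0, 1, 2) "/" substr($0, 3, 2) "/" substr($0, 5, 2) "/" $0 }'
}

# Sketch: print every line of a list file that does NOT have the form
# aa/bb/cc/<record id>, where aa, bb and cc are the first three character
# pairs of the record id. Prints nothing when the list is well-formed.
malformed_paths() {
    awk '{
        n = split($0, p, "/")
        ok = (n == 4 && p[4] ~ /^[0-9a-f]+$/ && length(p[4]) == 40 &&
              p[1] == substr(p[4], 1, 2) &&
              p[2] == substr(p[4], 3, 2) &&
              p[3] == substr(p[4], 5, 2))
        if (!ok) print
    }' "$1"
}
```

For example, blob_path 'add1dd8fc5093b27b1fae1b753cb48b24ef3231f#101983' prints ad/d1/dd/add1dd8fc5093b27b1fae1b753cb48b24ef3231f, and malformed_paths missing_ds_files.txt is a quick way to confirm that a generated list follows the expected layout before it is used for recovery.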

4. Recover the missing files

Now use the list from the previous step to locate the same files in other Adobe Experience Manager instances in your environment.  Because each datastore file is stored under a path derived from its unique record id, a file found at the same path on another instance can be copied over.
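Before copying anything, it can help to see which of the listed files a given instance actually holds. The following is a minimal sketch under stated assumptions: the function name split_by_presence and the output file names are hypothetical, and the list file is assumed to contain one relative datastore path per line, as in missing_ds_files.txt.

```shell
#!/bin/sh
# Sketch: split a list of relative datastore paths into those that exist
# under a given datastore directory and those that do not.
# Usage: split_by_presence <datastore_dir> <list_file> <found_out> <absent_out>
split_by_presence() {
    ds_dir=$1 list=$2 found_out=$3 absent_out=$4
    : > "$found_out"                       # start with empty output files
    : > "$absent_out"
    while IFS= read -r rel; do
        [ -n "$rel" ] || continue          # skip blank lines
        if [ -f "$ds_dir/$rel" ]; then
            printf '%s\n' "$rel" >> "$found_out"
        else
            printf '%s\n' "$rel" >> "$absent_out"
        fi
    done < "$list"
}
```

Run on a working instance, for example as split_by_presence /path/to/crx-quickstart/repository/repository/datastore missing_ds_files.txt found_here.txt not_here.txt, this leaves the recoverable paths in found_here.txt and the paths that still have to be looked for elsewhere (another instance or a backup) in not_here.txt.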

If you cannot find some of the files in other instances, then search your backups and restore them from there if possible.

In Linux, you could log in to each working Adobe Experience Manager instance and use a command such as rsync to copy over any of the missing files that exist there.  For example, copy missing_ds_files.txt to the datastore directory on a working server (one that still has the files) and run the following from that directory:

rsync -avR --files-from=missing_ds_files.txt . user@hostname-of-server-missing-files:/path/to/crx-quickstart/repository/repository/datastore/

This command copies any files listed in missing_ds_files.txt that exist on the working server to the datastore of the server that is missing them.
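After copying files over or restoring them from backup, the recovered files can be sanity-checked. The sketch below rests on an assumption that is not stated in this article: that a record id is the SHA-1 hex digest of the file's content (the 40-character hex ids are consistent with this). The function names sha1_of and verify_records are hypothetical; treat a reported mismatch as a prompt for closer inspection rather than as proof of corruption.

```shell
#!/bin/sh
# Sketch: print the SHA-1 hex digest of a file, using sha1sum if it is
# available and falling back to shasum otherwise.
sha1_of() {
    if command -v sha1sum >/dev/null 2>&1; then
        sha1sum < "$1"
    else
        shasum -a 1 < "$1"
    fi | awk '{ print $1 }'
}

# Sketch: for each relative path in a list, compare the SHA-1 of the file
# under the datastore directory with the record id in its file name.
# ASSUMPTION: record ids are the SHA-1 hex digest of the file content.
# Usage: verify_records <datastore_dir> <list_file>
# Prints OK, MISMATCH or ABSENT followed by the path;
# returns 1 if any listed file is not OK.
verify_records() {
    ds_dir=$1 list=$2 status=0
    while IFS= read -r rel; do
        [ -n "$rel" ] || continue          # skip blank lines
        file=$ds_dir/$rel
        if [ ! -f "$file" ]; then
            echo "ABSENT $rel"
            status=1
            continue
        fi
        if [ "$(sha1_of "$file")" = "${rel##*/}" ]; then
            echo "OK $rel"
        else
            echo "MISMATCH $rel"
            status=1
        fi
    done < "$list"
    return $status
}
```

Run on the repaired server, for example as verify_records /path/to/crx-quickstart/repository/repository/datastore missing_ds_files.txt, the ABSENT lines show which files still have to be recovered or cleaned up.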

5. Clean up unrecoverable file references

If you were not able to recover some of the files from backup or from other Adobe Experience Manager instances, clean up or fix the bad datastore references.  Rerun the DataStore Consistency Check as described in step 2 to get a current list of missing files.

Review each of the listed node paths that reference missing datastore files.  Review any missing DAM assets or files uploaded to pages with your users, and have them reupload the ones they need.  Any they don't need can safely be deleted via the Adobe Experience Manager user interface.  Anything missing from under /var/audit or /var/eventing can also safely be deleted.  For any files you are not sure about, contact the AEM support team for assistance.

6. Reindex all async Oak indexes

Due to Oak's async indexing model, there could be missing files related to Oak asynchronous indexes.  Those missing files are not reported in the consistency check because they are hidden from repository traversals.  Unfortunately, such files do not match across different AEM instances, so the only way to fix such inconsistencies is to reindex all asynchronous Oak indexes.

To reindex the async indexes:

1. Download the oak-run version (for example, 1.0.x) that matches the Oak version installed in your environment: https://repo1.maven.org/maven2/org/apache/jackrabbit/oak-run/

2. Upload the oak-run jar file to the Adobe Experience Manager server.

3. Stop all Adobe Experience Manager instances.

4. Run the command below corresponding to your Adobe Experience Manager instance's Oak storage:

TarMK command:

java -Xmx4096m -jar /apps/staging/oak-run-*-*.jar checkpoints crx-quickstart/repository/segmentstore rm-all

MongoMK command:

java -Xmx4096m -jar oak-run-*-*.jar checkpoints mongodb://localhost/aem-author rm-all

5. Start Adobe Experience Manager again and monitor your log files for INFO log messages from org.apache.jackrabbit.oak.plugins.index.  If you want to see more details on the indexing, go to the /system/console/slinglog UI and enable debug level logging for org.apache.jackrabbit.oak.plugins.index.

You should see log messages like these:

23.06.2015 14:26:23.070 *INFO* [FelixStartLevel] org.apache.jackrabbit.oak.plugins.index.IndexUpdate Reindexing will be performed for following indexes: [/oak:index/cqAcUUID, /oak:index/nodetype, /oak:index/deviceIdentificationMode, /oak:index/campaignpath, /oak:index/active, /oak:index/jcrFrozenMixinTypes]

23.06.2015 14:26:23.517 *INFO* [FelixStartLevel] org.apache.jackrabbit.oak.plugins.index.IndexUpdate Reindexing Traversed #10000 /jcr:system/jcr:versionStorage/c8/5f
...
23.06.2015 14:28:51.999 *INFO* [pool-8-thread-1] org.apache.jackrabbit.oak.plugins.index.IndexUpdate Indexing report
    - /oak:index/counter*(708)
    - /oak:index/authorizables*(159)
    - /oak:index/cqPageLucene*(1913)
    - /oak:index/ntBaseLucene*(444)
    - /oak:index/cqTagLucene*(512)
    - /oak:index/workflowDataLucene*(116)
...

23.06.2015 14:28:52.009 *INFO* [pool-8-thread-1] org.apache.jackrabbit.oak.plugins.index.AsyncIndexUpdate Reindexing (async) completed for indexes: [/oak:index/counter*(708), /oak:index/authorizables*(159), /oak:index/cqPageLucene*(1913), /oak:index/ntBaseLucene*(444), /oak:index/cqTagLucene*(512), /oak:index/workflowDataLucene*(116)] in 30.36 s

6. If indexing fails for some reason, it keeps restarting, and you see this log message repeated:

23.06.2015 14:26:23.070 *INFO* [FelixStartLevel] org.apache.jackrabbit.oak.plugins.index.IndexUpdate Reindexing will be performed for following indexes: [/oak:index/cqAcUUID, /oak:index/nodetype, /oak:index/deviceIdentificationMode, /oak:index/campaignpath, /oak:index/active, /oak:index/jcrFrozenMixinTypes]
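The log messages shown above can be counted to get a quick picture of reindexing progress. This is a minimal sketch: the function name reindex_status is hypothetical, and the rule it applies (several start messages with no completion message suggests looping) is a heuristic based on the messages quoted in this article, not an official diagnostic.

```shell
#!/bin/sh
# Sketch: count reindexing start and completion messages in AEM log files.
# Usage: reindex_status <log_file>...
# Returns 1 when reindexing was announced more than once but no
# completion message was found, which may indicate looping.
reindex_status() {
    # grep -c exits non-zero when the count is 0, hence "|| true".
    started=$(cat "$@" | grep -cF 'Reindexing will be performed for following indexes' || true)
    completed=$(cat "$@" | grep -cF 'Reindexing (async) completed for indexes' || true)
    echo "reindex started: $started"
    echo "reindex completed: $completed"
    if [ "$started" -gt 1 ] && [ "$completed" -eq 0 ]; then
        echo "reindexing was announced repeatedly but never completed; it may be looping"
        return 1
    fi
}
```

For example, reindex_status crx-quickstart/logs/error.log prints both counts; a growing start count with a completion count of zero matches the failure pattern described in this step.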

Applies to

AEM6.x / Oak 1.x

This work is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 3.0 Unported License.  Twitter™ and Facebook posts are not covered under the terms of Creative Commons.
