Issue
In a multi-node AEM cluster using S3DataStore, if you upload large images or videos (100 MB+) through the Assets UI on a secondary topology node, errors similar to the one below are logged:
17.04.2017 23:53:27.106 *ERROR* [JobHandler: /etc/workflow/instances/server0/2017-04-17/update_asset_5:/content/dam/test.jpg/jcr:content/renditions/original] com.day.cq.dam.core.impl.cache.CQBufferedImageCache Error while loading image /content/dam/rrd/dsg/migration-test/to-rrd/aem-tester/Perf1492487597162_Performance_100MB_JPG.jpg/jcr:content/renditions/original
java.lang.RuntimeException: Error occurred while obtaining InputStream for blobId [c260220808fc2ee8550a97daea0444123b02f90b#103376192]
    at org.apache.jackrabbit.oak.plugins.blob.BlobStoreBlob.getNewStream(BlobStoreBlob.java:49)
    at org.apache.jackrabbit.oak.plugins.value.BinaryImpl.getStream(BinaryImpl.java:53)
    at org.apache.sling.jcr.resource.internal.helper.LazyInputStream.getStream(LazyInputStream.java:106)
    at org.apache.sling.jcr.resource.internal.helper.LazyInputStream.read(LazyInputStream.java:65)
    at org.apache.commons.io.IOUtils.copyLarge(IOUtils.java:1792)
    at org.apache.commons.io.IOUtils.copyLarge(IOUtils.java:1769)
    at org.apache.commons.io.IOUtils.copy(IOUtils.java:1744)
    at com.day.cq.dam.commons.handler.StandardImageHandler.getImage(StandardImageHandler.java:245)
    at com.day.cq.dam.core.impl.handler.JpegHandler.getImage(JpegHandler.java:79)
    at com.day.cq.dam.core.impl.cache.CQBufferedImageCache.getImage(CQBufferedImageCache.java:116)
    at com.day.cq.dam.core.impl.gfx.CommonsGfxRenderer.createImageLayer(CommonsGfxRenderer.java:488)
    at com.day.cq.dam.core.impl.gfx.CommonsGfxRenderer.createLayer(CommonsGfxRenderer.java:163)
    at com.day.cq.dam.core.impl.gfx.CommonsGfxRenderer.render(CommonsGfxRenderer.java:116)
    at com.day.cq.dam.core.impl.RenditionMakerImpl$PlanBasedTemplate.apply(RenditionMakerImpl.java:132)
    at com.day.cq.dam.core.impl.RenditionMakerImpl.generateRenditions(RenditionMakerImpl.java:216)
    at com.day.cq.dam.core.process.CreateWebEnabledImageProcess.createWebEnabledImage(CreateWebEnabledImageProcess.java:165)
    at com.day.cq.dam.core.process.ThumbnailProcess.execute(ThumbnailProcess.java:82)
    at com.day.cq.workflow.compatibility.CQWorkflowProcessRunner.execute(CQWorkflowProcessRunner.java:93)
    ...
Caused by: java.io.IOException: org.apache.jackrabbit.core.data.DataStoreException: Could not length of dataIdentifier c260220808fc2ee8550a97daea0444123b02f90b
    at org.apache.jackrabbit.oak.plugins.blob.datastore.DataStoreBlobStore.getStream(DataStoreBlobStore.java:539)
    at org.apache.jackrabbit.oak.plugins.blob.datastore.DataStoreBlobStore.getInputStream(DataStoreBlobStore.java:322)
    at org.apache.jackrabbit.oak.plugins.blob.BlobStoreBlob.getNewStream(BlobStoreBlob.java:47)
    ... 26 common frames omitted
Caused by: org.apache.jackrabbit.core.data.DataStoreException: Could not length of dataIdentifier c260220808fc2ee8550a97daea0444123b02f90b
    at org.apache.jackrabbit.oak.blob.cloud.aws.s3.S3Backend.getLength(S3Backend.java:485)
    at org.apache.jackrabbit.core.data.CachingDataStore.getLength(CachingDataStore.java:671)
    at org.apache.jackrabbit.core.data.CachingDataStore.getRecord(CachingDataStore.java:469)
    at org.apache.jackrabbit.oak.plugins.blob.datastore.DataStoreBlobStore.getDataRecord(DataStoreBlobStore.java:548)
    at org.apache.jackrabbit.oak.plugins.blob.datastore.DataStoreBlobStore.getStream(DataStoreBlobStore.java:533)
    ... 28 common frames omitted
Caused by: com.amazonaws.services.s3.model.AmazonS3Exception: Not Found (Service: Amazon S3; Status Code: 404; Error Code: 404 Not Found; Request ID: 09ED6CDC6129E1DA)
    at com.amazonaws.http.AmazonHttpClient.handleErrorResponse(AmazonHttpClient.java:1389)
    at com.amazonaws.http.AmazonHttpClient.executeOneRequest(AmazonHttpClient.java:902)
    at com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:607)
    at com.amazonaws.http.AmazonHttpClient.doExecute(AmazonHttpClient.java:376)
    at com.amazonaws.http.AmazonHttpClient.executeWithTimer(AmazonHttpClient.java:338)
    at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:287)
    at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:3826)
    at com.amazonaws.services.s3.AmazonS3Client.getObjectMetadata(AmazonS3Client.java:1015)
    at com.amazonaws.services.s3.AmazonS3Client.getObjectMetadata(AmazonS3Client.java:991)
    at org.apache.jackrabbit.oak.blob.cloud.aws.s3.S3Backend.getLength(S3Backend.java:478)
    ... 32 common frames omitted
Environment
AEM Assets 6.1, 6.2 on MongoMK using S3DataStore
Cause
This issue is caused by the asynchronous upload of the binary to S3 taking longer than job processing in the cluster. The leader node processes the DAM Update Asset workflow jobs, but the job runs before the file upload to S3 has completed, so the blob is not yet available when the workflow tries to read it. See OAK-4903 for further details.
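To confirm the race on a running system, you can ask S3 directly whether the blob from the error message exists yet. The sketch below uses the AWS CLI's head-object call, which is the same metadata lookup the failing getObjectMetadata call performs. The bucket name is a placeholder, and the key layout (the blob ID with a dash inserted after the first four characters) is an assumption based on the Jackrabbit S3 backend's key-naming scheme; verify both against your environment.

# Check whether the blob from the error above has landed in S3 yet.
# "my-datastore-bucket" is a placeholder; the key format (dash after the
# first 4 characters of the blob ID) is an assumed S3Backend convention.
aws s3api head-object \
  --bucket my-datastore-bucket \
  --key c260-220808fc2ee8550a97daea0444123b02f90b
# A 404 response while the asset upload is still in flight confirms that
# the workflow job ran before the binary reached S3.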
Resolution
This issue is fixed in AEM 6.3.
A. Disable S3 Async Uploads
As a workaround in AEM 6.1 and 6.2, do the following:
- Log in to the AEM server and open the crx-quickstart/install/org.apache.jackrabbit.oak.plugins.blob.datastore.S3DataStore.config file in an editor.
- Set this property (a sketch of the edited file follows this list):
asyncUploadLimit="0"
- Restart AEM.
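For reference, a minimal sketch of what the edited file might look like. The connection properties and their values below are placeholders for your existing settings; only asyncUploadLimit changes.

# crx-quickstart/install/org.apache.jackrabbit.oak.plugins.blob.datastore.S3DataStore.config
# Existing connection properties (placeholder values) stay as they are:
accessKey="<your-access-key>"
secretKey="<your-secret-key>"
s3Bucket="<your-bucket>"
s3Region="<your-region>"
# Disable async uploads so the binary is in S3 before the job can run:
asyncUploadLimit="0"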
Setting asyncUploadLimit to 0 has a performance impact on asset file uploads. After making this change, perform load testing to make sure the environment still meets your performance needs.
B. Enable Job Distribution
Another approach, which solves the issue without a performance hit, is to enable Sling job distribution. Sling jobs are then processed directly on the secondary cluster nodes in addition to the primary. To do this, follow these steps (a config-file alternative is sketched after the list):
- Go to http://aem-host:port/system/console/configMgr
- Find the Apache Sling Job Manager configuration and click edit (the pencil icon to the right).
- Uncheck Disable Distribution.
- Click Save.
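If you manage OSGi configuration as files rather than through the web console, the same change can be deployed as a config file, sketched below. The PID and property name are assumptions based on the Sling Event bundle's JobManagerConfiguration; verify them against the configuration shown in your instance's web console before relying on them.

# crx-quickstart/install/org.apache.sling.event.impl.jobs.jobmanager.impl.config
# Assumed PID and property for the "Apache Sling Job Manager" config;
# unchecking "Disable Distribution" corresponds to this boolean being false.
job.consumermanager.disableDistribution=B"false"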