PDF upload causes AEM java process to crash

Issue

When you upload certain PDF files to AEM, the JVM consumes an excessive amount of OS memory. The java process then either hangs or crashes. On Linux, the OS kills the java process.

Thread dumps show threads similar to the following one, which cause high CPU utilization:

"JobHandler: /etc/workflow/instances/server0/2017-03-08_4/update_asset_6:/content/dam/test.pdf/jcr:content/renditions/original" #3270 daemon prio=1 os_prio=-2 tid=0x0000000020802000 nid=0xc7d0 runnable [0x00000000396be000]
java.lang.Thread.State: RUNNABLE
at sun.java2d.cmm.lcms.LCMS.createNativeTransform(Native Method)
at sun.java2d.cmm.lcms.LCMS.createTransform(LCMS.java:156)
at sun.java2d.cmm.lcms.LCMSTransform.doTransform(LCMSTransform.java:155)
locked <0x00000000f55ab218> (a sun.java2d.cmm.lcms.LCMSTransform)
at sun.java2d.cmm.lcms.LCMSTransform.colorConvert(LCMSTransform.java:629)
at java.awt.color.ICC_ColorSpace.toRGB(ICC_ColorSpace.java:182)
at com.adobe.internal.pdftoolkit.color.ColorManager.convertICCToDeviceRGB(ColorManager.java:167)
at com.adobe.internal.pdftoolkit.pdf.graphics.impl.ColorSpaceCacheImpl.toRGB(ColorSpaceCacheImpl.java:175)
at com.adobe.internal.pdftoolkit.services.pdfParser.ParserUtils.preProcessAxialShading(ParserUtils.java:264)
at com.adobe.internal.pdftoolkit.services.pdfParser.ContentStreamParser.applyShading(ContentStreamParser.java:1713)
at com.adobe.internal.pdftoolkit.services.pdfParser.ContentStreamParser.sh(ContentStreamParser.java:1672)
at com.adobe.internal.pdftoolkit.pdf.content.processor.ShadingPatternOperator.process(ContentOperators.java:790)
at com.adobe.internal.pdftoolkit.pdf.content.processor.ContentStreamProcessor.process(ContentStreamProcessor.java:103)
at com.adobe.internal.pdftoolkit.services.pdfParser.PDFContentItemsList$PDFContentItemsListIterator.processObjects(PDFContentItemsList.java:176)
at com.adobe.internal.pdftoolkit.services.pdfParser.PDFContentItemsList$PDFContentItemsListIterator.hasNext(PDFContentItemsList.java:127)
at com.adobe.internal.pdftoolkit.services.rasterizer.impl.RasterDocument.createPage(RasterDocument.java:129)
at com.adobe.internal.pdftoolkit.services.rasterizer.impl.PDFToRasterConverter.toBufferedImage(PDFToRasterConverter.java:127)
at com.adobe.internal.pdftoolkit.services.rasterizer.PageRasterizer.next(PageRasterizer.java:98)
at com.day.cq.dam.handler.standard.pdf.PdfHandler.getImage(PdfHandler.java:592)
at com.day.cq.dam.handler.standard.pdf.PdfHandler.getImage(PdfHandler.java:555)
at com.day.cq.dam.core.process.CreatePdfPreviewProcess.execute(CreatePdfPreviewProcess.java:109)
at com.day.cq.workflow.compatibility.CQWorkflowProcessRunner.execute(CQWorkflowProcessRunner.java:93)
at com.adobe.granite.workflow.core.job.HandlerBase.executeProcess(HandlerBase.java:189)
at com.adobe.granite.workflow.core.job.JobHandler.process(JobHandler.java:244)
at org.apache.sling.event.impl.jobs.JobConsumerManager$JobConsumerWrapper.process(JobConsumerManager.java:500)
at org.apache.sling.event.impl.jobs.queues.JobQueueImpl.startJob(JobQueueImpl.java:291)
locked <0x00000000c91531e0> (a org.apache.sling.event.impl.jobs.queues.JobExecutionContextImpl)
at org.apache.sling.event.impl.jobs.queues.JobQueueImpl.access$100(JobQueueImpl.java:58)
at org.apache.sling.event.impl.jobs.queues.JobQueueImpl$1.run(JobQueueImpl.java:227)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
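To check a saved thread dump for this pattern, a small script can help. The following is a minimal sketch, not an Adobe tool. It assumes a plain-text dump in the format shown above, where each thread starts with a line containing its quoted name. The function name find_suspect_threads and the choice of the two frames to match are illustrative assumptions.

```python
import sys

# Frames copied from the stack trace above. A thread that contains both is
# generating a PDF preview and is inside the native colour transform.
SUSPECT_FRAMES = (
    "sun.java2d.cmm.lcms.LCMS.createNativeTransform",
    "com.day.cq.dam.handler.standard.pdf.PdfHandler.getImage",
)


def find_suspect_threads(dump_text):
    """Return the names of threads whose stack contains every suspect frame."""
    suspects = []
    name, body = None, []

    def flush():
        # Record the thread collected so far if all suspect frames appear in it.
        if name is not None:
            stack = "\n".join(body)
            if all(frame in stack for frame in SUSPECT_FRAMES):
                suspects.append(name)

    for line in dump_text.splitlines():
        stripped = line.strip()
        if stripped.startswith('"'):
            # A new thread header: close the previous thread and start a new one.
            flush()
            name = stripped[1:].split('"', 1)[0]
            body = []
        elif name is not None:
            body.append(stripped)
    flush()
    return suspects


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage (hypothetical file name): python find_threads.py threaddump.txt
    with open(sys.argv[1], encoding="utf-8", errors="replace") as handle:
        for thread_name in find_suspect_threads(handle.read()):
            print(thread_name)
```

In the dump above, the thread name includes the workflow instance and the asset path (/content/dam/test.pdf), so the printed names show which uploaded PDFs are involved.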

Environment

AEM 6.2 on Oracle Java 1.8

Resolution

A. Clear out the update asset workflow sling jobs

  1. Go to http://host:port/crx/de/index.jsp and log in as the admin user

  2. Browse to this path, where {slingid} is the Sling ID of the AEM instance:

    /var/eventing/jobs/assigned/{slingid}/com.adobe.granite.workflow.job.etc.workflow.models.dam.update_asset.jcr_content.model

  3. Right-click the node com.adobe.granite.workflow.job.etc.workflow.models.dam.update_asset.jcr_content.model and delete it

  4. Save

  5. Restart AEM

This deletes all active Sling jobs for the PDFs that are causing AEM to crash.
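The node deletion in steps 1 to 4 could also be scripted. The sketch below is an assumption, not part of the documented procedure: it builds an HTTP request for the Apache Sling POST servlet's delete operation (:operation=delete) against the same node path. The host, port, credentials, and Sling ID are placeholders, the function names are hypothetical, and whether the instance accepts such a request depends on its security configuration. AEM still needs to be restarted afterwards.

```python
import base64
import urllib.parse
import urllib.request

# Node name copied from step 3 above.
JOB_NODE = (
    "com.adobe.granite.workflow.job.etc.workflow.models"
    ".dam.update_asset.jcr_content.model"
)


def job_node_path(sling_id):
    """Repository path of the assigned update_asset jobs for one instance."""
    return "/var/eventing/jobs/assigned/{}/{}".format(sling_id, JOB_NODE)


def build_delete_request(base_url, sling_id, user, password):
    """Build (but do not send) a Sling POST request that deletes the job node."""
    url = base_url.rstrip("/") + urllib.parse.quote(job_node_path(sling_id))
    body = urllib.parse.urlencode({":operation": "delete"}).encode("ascii")
    request = urllib.request.Request(url, data=body, method="POST")
    credentials = "{}:{}".format(user, password).encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    request.add_header("Authorization", "Basic " + token)
    return request


# Example with placeholder values; sending the request is left to the caller:
#   request = build_delete_request("http://host:port", "{slingid}", "admin", "...")
#   with urllib.request.urlopen(request) as response:
#       print(response.status)
```

Building the request separately from sending it makes it possible to inspect the target URL before anything is deleted.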

B. Install the PDF Rasterizer command-line tool

  1. Contact AEM Customer Care to obtain the cq-dam-pdfrasterizer*.zip file for your OS

  2. Upload the zip file to your AEM server

  3. Unzip the zip file to a new folder on the server

  4. Copy a PDF file to the server, into the folder where you extracted the zip file

  5. Run this command in that folder to make sure that the rasterizer works (replace pdffilename.pdf with the filename of the pdf document)

    /opt/aem/author62/rasterizer/PDFRasterizer -d -p 1 -s 1280 -t PNG -i pdffilename.pdf

    The output looks similar to this:

    Total time in image conversion (in ms): 163
    Total time (in ms): 896. Time to initialize: 12. Time for conversion: 884

  6. If the rasterizer works on the command line, follow all the steps in this article to configure it to work in AEM

  7. In addition to the steps in that article, modify the "Rasterize PDF/AI Image Preview Rendition" step of the "DAM Update Asset" workflow: remove "application/pdf" from its list of "Mime Types"

  8. Click "Save" to save the workflow model

  9. Upload PDFs again. The renditions should now be generated quickly and without high memory utilization
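The command-line check in step 5 can be wrapped in a small script that runs the rasterizer and extracts the timing figures. This is a minimal sketch under stated assumptions: the flags are copied from the command in step 5, the timing line is assumed to have exactly the format shown in the sample output, and it is not known from this article whether the tool writes to stdout or stderr, so both are captured together. The function names are hypothetical.

```python
import re
import subprocess

# Matches the summary line shown in the sample output of step 5.
TIMING = re.compile(
    r"Total time \(in ms\): (\d+)\. "
    r"Time to initialize: (\d+)\. "
    r"Time for conversion: (\d+)"
)


def parse_timing(output):
    """Return the timing figures as a dict, or None if the line is missing."""
    match = TIMING.search(output)
    if match is None:
        return None
    total, initialize, conversion = (int(group) for group in match.groups())
    return {
        "total_ms": total,
        "initialize_ms": initialize,
        "conversion_ms": conversion,
    }


def check_rasterizer(binary, pdf_path, timeout_s=120):
    """Run the rasterizer once and return (exit code, parsed timing or None)."""
    command = [binary, "-d", "-p", "1", "-s", "1280", "-t", "PNG", "-i", pdf_path]
    result = subprocess.run(
        command,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # capture both streams together
        universal_newlines=True,
        timeout=timeout_s,
    )
    return result.returncode, parse_timing(result.stdout)


# Example, using the path from step 5:
#   code, timing = check_rasterizer(
#       "/opt/aem/author62/rasterizer/PDFRasterizer", "pdffilename.pdf")
```

A missing timing line (None) or a timeout suggests that the rasterizer is not working for that file and should be investigated before it is configured in AEM.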
