Deadlock between OakDiscoveryService.unbindTopologyEventListener and OakViewChecker.discoveryLiteCheck

Issue

AEM 6.2 hangs completely after a hotfix for deploying application bundles is installed. When thread dumps are captured and opened in a thread dump analyzer (such as http://fastthread.io or IBM Thread Analyzer), the analyzer reports a deadlock.
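
Where code can run inside the affected JVM, the check an analyzer performs on a thread dump can be approximated with the standard java.lang.management API. The following is a minimal sketch and not part of AEM or Sling; the class name DeadlockReportSketch and the method report() are hypothetical names chosen for illustration.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

public class DeadlockReportSketch {

    /** Returns one line per deadlocked thread, or a fixed message if none is found. */
    static String report() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        boolean monitors = mx.isObjectMonitorUsageSupported();
        boolean synchronizers = mx.isSynchronizerUsageSupported();

        // findDeadlockedThreads also covers java.util.concurrent locks such as
        // ReentrantLock; fall back to monitor-only detection if unsupported.
        long[] ids = synchronizers
                ? mx.findDeadlockedThreads()
                : mx.findMonitorDeadlockedThreads();
        if (ids == null) {
            return "no deadlock detected";
        }

        StringBuilder sb = new StringBuilder();
        for (ThreadInfo info : mx.getThreadInfo(ids, monitors, synchronizers)) {
            if (info == null) {
                continue; // thread terminated between the two calls
            }
            sb.append('"').append(info.getThreadName()).append("\" ")
              .append(info.getThreadState())
              .append(" waiting on ").append(info.getLockName())
              .append(" owned by \"").append(info.getLockOwnerName()).append("\"\n");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(report());
    }
}
```

On a healthy JVM the method returns the fixed "no deadlock detected" message; on a deadlocked one it lists each thread with the lock it is waiting on and the owner of that lock, which is the same information the dumps in this article show.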

The reported deadlock is between two threads like the ones shown below:

Thread [1] calls org.apache.sling.discovery.oak.OakDiscoveryService.unbindTopologyEventListener

Thread [2] calls org.apache.sling.discovery.oak.pinger.OakViewChecker.discoveryLiteCheck

In some cases the deadlock is circular and involves a third thread.

[1]

"LeaseFailureHandler-Thread" daemon prio=5 tid=0x7f34 nid=0xffffffff in Object.wait()
   java.lang.Thread.State: WAITING (on object monitor)
        at sun.misc.Unsafe.park(Native Method)
        - waiting to lock <0x575a7644> (a java.util.concurrent.locks.ReentrantLock$NonfairSync) owned by "discovery.connectors.common.runner.e8dd34be-8886-4e5d-891f-5509b4dea0f0.discoveryLiteCheck" tid=0x4,330
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:834)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:867)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1197)
        at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:214)
        at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:290)
        at org.apache.sling.discovery.oak.OakDiscoveryService.unbindTopologyEventListener(OakDiscoveryService.java:368)
...
        at org.apache.felix.framework.ServiceRegistrationImpl.unregister(ServiceRegistrationImpl.java:144)
        at org.apache.sling.resourceresolver.impl.ResourceResolverFactoryActivator.unregisterFactory(ResourceResolverFactoryActivator.java:611)
        at org.apache.sling.resourceresolver.impl.ResourceResolverFactoryActivator.unregisterFactory(ResourceResolverFactoryActivator.java:602)
        at org.apache.sling.resourceresolver.impl.ResourceResolverFactoryActivator.checkFactoryPreconditions(ResourceResolverFactoryActivator.java:674)
        at org.apache.sling.resourceresolver.impl.ResourceResolverFactoryActivator.access$100(ResourceResolverFactoryActivator.java:79)
        at org.apache.sling.resourceresolver.impl.ResourceResolverFactoryActivator$1.providerRemoved(ResourceResolverFactoryActivator.java:500)
        at org.apache.sling.resourceresolver.impl.providers.ResourceProviderTracker.unregister(ResourceProviderTracker.java:224)
        - locked <0x35cd7613> (a java.util.HashMap)
        at org.apache.sling.resourceresolver.impl.providers.ResourceProviderTracker.access$100(ResourceProviderTracker.java:58)
        at org.apache.sling.resourceresolver.impl.providers.ResourceProviderTracker$1.removedService(ResourceProviderTracker.java:109)
...
        at org.apache.felix.framework.ServiceRegistrationImpl.unregister(ServiceRegistrationImpl.java:144)
        at org.apache.sling.jcr.base.AbstractSlingRepositoryManager.unregisterService(AbstractSlingRepositoryManager.java:262)
        at org.apache.sling.jcr.base.AbstractSlingRepositoryManager.stop(AbstractSlingRepositoryManager.java:389)
...
        at org.apache.felix.framework.BundleImpl.stop(BundleImpl.java:1038)
        at org.apache.felix.framework.BundleImpl.stop(BundleImpl.java:1024)
        at org.apache.jackrabbit.oak.plugins.document.DocumentNodeStoreService$1.handleLeaseFailure(DocumentNodeStoreService.java:413)
        at org.apache.jackrabbit.oak.plugins.document.ClusterNodeInfo$1.run(ClusterNodeInfo.java:696)
        at java.lang.Thread.run(Thread.java:745)


   Locked ownable synchronizers:
        - locked <0x4402f4fd> (a java.util.concurrent.locks.ReentrantLock$FairSync)
        - locked <0x1e2230ed> (a java.util.concurrent.locks.ReentrantLock$FairSync)
        - locked <0x56ba270f> (a java.util.concurrent.locks.ReentrantLock$FairSync)

[2]

"discovery.connectors.common.runner.e8dd34be-8886-4e5d-891f-5509b4dea0f0.discoveryLiteCheck" daemon prio=5 tid=0x10ea nid=0xffffffff waiting for monitor entry
   java.lang.Thread.State: BLOCKED
        at org.apache.sling.resourceresolver.impl.providers.ResourceProviderTracker.getResourceProviderStorage(ResourceProviderTracker.java:364)
        - waiting to lock <0x35cd7613> (a java.util.HashMap) owned by "LeaseFailureHandler-Thread" tid=0x32,564
        at org.apache.sling.resourceresolver.impl.ResourceResolverImpl.createControl(ResourceResolverImpl.java:154)
        at org.apache.sling.resourceresolver.impl.ResourceResolverImpl.<init>(ResourceResolverImpl.java:116)
        at org.apache.sling.resourceresolver.impl.ResourceResolverImpl.<init>(ResourceResolverImpl.java:110)
        at org.apache.sling.resourceresolver.impl.CommonResourceResolverFactoryImpl.getResourceResolverInternal(CommonResourceResolverFactoryImpl.java:257)
        at org.apache.sling.resourceresolver.impl.CommonResourceResolverFactoryImpl.getAdministrativeResourceResolver(CommonResourceResolverFactoryImpl.java:140)
        at org.apache.sling.resourceresolver.impl.ResourceResolverFactoryImpl.getAdministrativeResourceResolver(ResourceResolverFactoryImpl.java:107)
        at org.apache.sling.discovery.oak.cluster.OakClusterViewService.getResourceResolver(OakClusterViewService.java:103)
        at org.apache.sling.discovery.oak.cluster.OakClusterViewService.getLocalClusterView(OakClusterViewService.java:110)
        at org.apache.sling.discovery.base.commons.BaseDiscoveryService.getTopology(BaseDiscoveryService.java:77)
        at org.apache.sling.discovery.oak.OakDiscoveryService.checkForTopologyChange(OakDiscoveryService.java:657)
        at org.apache.sling.discovery.oak.pinger.OakViewChecker.discoveryLiteCheck(OakViewChecker.java:232)
        - locked <0x740a9729> (a java.lang.Object)
        at org.apache.sling.discovery.oak.pinger.OakViewChecker.access$000(OakViewChecker.java:64)
        at org.apache.sling.discovery.oak.pinger.OakViewChecker$1.run(OakViewChecker.java:208)
        at org.apache.sling.discovery.base.commons.PeriodicBackgroundJob.safelyRun(PeriodicBackgroundJob.java:86)
        at org.apache.sling.discovery.base.commons.PeriodicBackgroundJob.run(PeriodicBackgroundJob.java:77)
        at java.lang.Thread.run(Thread.java:745)

   Locked ownable synchronizers:
        - locked <0x575a7644> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)
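
The two dumps above show a lock-order inversion: thread [1] holds the HashMap monitor <0x35cd7613> and waits for the ReentrantLock <0x575a7644>, while thread [2] holds that ReentrantLock and waits for the same monitor. The following self-contained sketch reproduces the pattern outside AEM. It is an illustration only, not Sling code: the class name, the variables viewLock and providers, and the thread names are hypothetical stand-ins. Unlike the real code, which calls ReentrantLock.lock(), the sketch uses lockInterruptibly() so that it can break the deadlock and exit cleanly.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.locks.ReentrantLock;

public class LockOrderInversionSketch {

    /** Deadlocks two threads, records their states {first, second}, then cleans up. */
    static Thread.State[] reproduce() {
        final ReentrantLock viewLock = new ReentrantLock();     // stand-in for <0x575a7644>
        final Map<String, Object> providers = new HashMap<>();  // stand-in for <0x35cd7613>
        final CountDownLatch bothHoldFirstLock = new CountDownLatch(2);

        // Like thread [1]: takes the monitor first, then wants the ReentrantLock.
        Thread first = new Thread(() -> {
            synchronized (providers) {
                bothHoldFirstLock.countDown();
                try {
                    bothHoldFirstLock.await();
                    viewLock.lockInterruptibly(); // parks here: state WAITING
                    viewLock.unlock();
                } catch (InterruptedException e) {
                    // Interrupted by the cleanup below; leaving the block frees the monitor.
                }
            }
        }, "sketch-lease-failure");

        // Like thread [2]: takes the ReentrantLock first, then wants the monitor.
        Thread second = new Thread(() -> {
            viewLock.lock();
            try {
                bothHoldFirstLock.countDown();
                try {
                    bothHoldFirstLock.await();
                } catch (InterruptedException e) {
                    return;
                }
                synchronized (providers) {      // blocks here: state BLOCKED
                    providers.size();
                }
            } finally {
                viewLock.unlock();
            }
        }, "sketch-discovery-check");

        first.setDaemon(true);
        second.setDaemon(true);
        first.start();
        second.start();

        Thread.State[] states = new Thread.State[2];
        try {
            // Poll for up to about five seconds until both threads are stuck.
            for (int i = 0; i < 100; i++) {
                states[0] = first.getState();
                states[1] = second.getState();
                if (states[0] == Thread.State.WAITING && states[1] == Thread.State.BLOCKED) {
                    break;
                }
                Thread.sleep(50);
            }
            first.interrupt(); // break the cycle so both threads can finish
            first.join();
            second.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return states;
    }

    public static void main(String[] args) {
        Thread.State[] s = reproduce();
        System.out.println("first=" + s[0] + ", second=" + s[1]);
    }
}
```

The recorded states should correspond to the ones in the dumps: the thread parked on the ReentrantLock is WAITING, and the thread waiting to enter the synchronized block is BLOCKED. Acquiring both locks in the same order in every code path removes this kind of cycle.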

Cause

This is a known Apache Sling bug: SLING-5622. It affects only AEM 6.2 installations that do not yet have SP1. Once SP1 has been applied successfully, the problem no longer occurs.

Resolution

This issue occurs only before SP1 is installed. To resolve it, contact AEM Customer Care and request hotfix 11473. The fix is already included in Service Pack 1.
