There can be many causes for user synchronization to fail. The most common are:
- Misconfiguration
- An error on saving when the user is on the publish instance
- System failure to save the user package due to an error or permission issue (Author or Publish instance)
- Sling jobs getting stuck due to missing user package
** This only applies to AEM 6.2.
Update the socialpubsync-vlt Vault Package Builder Factory to address these items:
- Store the user synchronization packages on the file system on non-clustered instances to increase stability and performance.
- Include rep:policy nodes, and prevent .tokens and rep:cache nodes from being synchronized with the user.
- Avoid the error "cannot retrieve packages" [1]
In the Package Filters field, add these values:
- /home/users|-.*/rep:cache
- /home/users|-.*/.tokens
- /home/users|+.*/rep:policy
Note:
Adding the rep:cache exclusion here avoids the error below:
[... POST /libs/sling/distribution/services/importers/socialpubsync HTTP/1.1] org.apache.jackrabbit.vault.packaging.impl.ZipVaultPackage Error during install.
javax.jcr.nodetype.ConstraintViolationException: OakConstraint0034: Attempt to create or change the system maintained cache.
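To make the effect of the Package Filters entries easier to reason about, here is a minimal Python sketch of how entries of the form `root|-regex` or `root|+regex` could be evaluated. It is a simplified model and not the actual Sling Distribution or FileVault implementation: it assumes everything under the root is included by default, that the last matching rule wins, and that the regular expression must match the whole path. The helper names `parse_filter` and `is_packaged` are hypothetical.

```python
import re


def parse_filter(entry):
    """Split 'root|+regex' or 'root|-regex' into (root, include, pattern)."""
    root, rule = entry.split("|", 1)
    if rule[:1] not in ("+", "-"):
        raise ValueError(f"rule must start with + or -: {entry!r}")
    return root, rule[0] == "+", re.compile(rule[1:])


def is_packaged(path, entries):
    """Simplified model (assumption): default include under the root,
    last matching rule wins, regex must match the full path."""
    decision = None
    for entry in entries:
        root, include, pattern = parse_filter(entry)
        if path != root and not path.startswith(root + "/"):
            continue  # path is outside this filter's root
        if decision is None:
            decision = True  # under the root, so included unless a rule says otherwise
        if pattern.fullmatch(path):
            decision = include
    return bool(decision)


# The two exclusion entries from the list above.
FILTERS = [
    "/home/users|-.*/rep:cache",
    "/home/users|-.*/.tokens",
]

if __name__ == "__main__":
    for candidate in (
        "/home/users/a/alice/profile",
        "/home/users/a/alice/rep:cache",
        "/home/users/a/alice/.tokens",
    ):
        print(candidate, is_packaged(candidate, FILTERS))
```

In this model, the rep:cache and .tokens nodes are left out of the package while the rest of the user node is kept, which is the behavior the configuration aims for.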
If you are running AEM 6.2, install Cumulative Fix Pack 3 on all Author and Publish instances, or contact AEM Customer Care to request the hotfix for NPR-13034. Without one of these, the above configuration has no effect.
** This only applies to AEM 6.2.
The default User Sync configuration has a problem: it doesn't distribute sling:Folder nodes such as social/relationships/following.
Depending on your version of Sling Distribution and AEM Social Communities, user packages are created either under /etc/packages/sling (older versions: AEM 6.1 with no hotfixes) or under /var/sling/distribution/packages (newer versions: AEM 6.1 with AEM Social Communities FP4 or later).
If you already have the type field set to file packages in your Vault Package Builder Factory configuration, then you have to clear the packages from the temp folder:
Since you have cleared out the packages, delete all the stuck Sling jobs that reference them. If old jobs under /var/eventing/jobs/unassigned are not processing due to some error, then they could cause User Sync to fail. Delete those on each AEM node to unblock the synchronization queue:
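Before deleting anything, it can help to list which job nodes actually reference the sync agent. The Python sketch below walks a JSON dump of /var/eventing/jobs/unassigned (for example, one saved from the Sling GET servlet's JSON rendering) and returns the paths of nodes that have a string property containing a search term. The function name `find_jobs_referencing`, the sample tree, and its topic values are hypothetical; the real node layout and property values on your instance may differ. The sketch only lists candidates and deletes nothing.

```python
def _mentions(value, needle):
    """True if a property value (string or list of strings) contains needle."""
    if isinstance(value, str):
        return needle in value
    if isinstance(value, list):
        return any(isinstance(item, str) and needle in item for item in value)
    return False


def find_jobs_referencing(tree, needle, base="/var/eventing/jobs/unassigned"):
    """Return repository paths of nodes with a property mentioning needle.

    `tree` is a nested dict: child nodes are dicts, properties are scalars
    or lists (the shape of a Sling JSON dump).
    """
    hits = []
    if any(_mentions(value, needle) for value in tree.values()):
        hits.append(base)
    for name, child in tree.items():
        if isinstance(child, dict):
            hits.extend(find_jobs_referencing(child, needle, f"{base}/{name}"))
    return hits


# Hypothetical sample data, for illustration only.
SAMPLE = {
    "jcr:primaryType": "sling:Folder",
    "topic-a": {
        "job-1": {"event.job.topic": "example/queue/socialpubsync/default"},
        "job-2": {"event.job.topic": "example/other"},
    },
}

if __name__ == "__main__":
    for path in find_jobs_referencing(SAMPLE, "socialpubsync"):
        print(path)
```

Review the resulting list before removing any node, and repeat the check on each AEM node.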
If none of the above steps fixes the user synchronization issue, enable debug-level logging for these Java packages (Author and Publish instances):
[1] 22.08.2017 12:38:16.044 *ERROR* [sling-default-655-scheduledEventTriggerorg.apache.sling.distribution.agent.impl.TriggerAgentRequestHandler@3b05483d] org.apache.sling.distribution.agent.impl.SimpleDistributionAgent [agent][socialpubsync] cannot retrieve packages
org.apache.sling.distribution.common.DistributionException: java.lang.NullPointerException
    at org.apache.sling.distribution.packaging.impl.FileDistributionPackageBuilder.readPackageInternal(FileDistributionPackageBuilder.java:127)
    at org.apache.sling.distribution.packaging.impl.AbstractDistributionPackageBuilder.readPackage(AbstractDistributionPackageBuilder.java:111)
    at org.apache.sling.distribution.serialization.impl.vlt.VaultDistributionPackageBuilderFactory.readPackage(VaultDistributionPackageBuilderFactory.java:243)
    at org.apache.sling.distribution.transport.impl.SimpleHttpDistributionTransport.retrievePackage(SimpleHttpDistributionTransport.java:156)
    at org.apache.sling.distribution.packaging.impl.exporter.RemoteDistributionPackageExporter.exportPackages(RemoteDistributionPackageExporter.java:82)
    at org.apache.sling.distribution.agent.impl.SimpleDistributionAgent.exportPackages(SimpleDistributionAgent.java:214)
    at org.apache.sling.distribution.agent.impl.SimpleDistributionAgent.execute(SimpleDistributionAgent.java:182)
    at org.apache.sling.distribution.agent.impl.TriggerAgentRequestHandler.handle(TriggerAgentRequestHandler.java:71)
    at org.apache.sling.distribution.trigger.impl.ScheduledDistributionTrigger$ScheduledDistribution.run(ScheduledDistributionTrigger.java:134)
    at org.apache.sling.commons.scheduler.impl.QuartzJobExecutor.execute(QuartzJobExecutor.java:118)
    at org.quartz.core.JobRunShell.run(JobRunShell.java:202)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.NullPointerException: null
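To check whether an instance is hitting either of the errors quoted in this article, a log file can be scanned for their signatures. The Python sketch below looks for the strings "cannot retrieve packages" and "OakConstraint0034" and, where the line carries an `[agent][name]` marker like the one in the trace above, extracts the agent name. The function name `scan` and the short descriptions are illustrative assumptions, not part of AEM.

```python
import re

# Signatures taken from the error messages quoted in this article.
SIGNATURES = {
    "cannot retrieve packages": "agent could not read a user package",
    "OakConstraint0034": "rep:cache was included in a synchronized package",
}

# Matches the "[agent][socialpubsync]" style marker seen in the agent log line.
AGENT_RE = re.compile(r"\[agent\]\[([^\]]+)\]")


def scan(lines):
    """Return (line_number, signature, agent_or_None) for each matching line."""
    findings = []
    for number, line in enumerate(lines, start=1):
        for signature in SIGNATURES:
            if signature in line:
                agent = AGENT_RE.search(line)
                findings.append(
                    (number, signature, agent.group(1) if agent else None)
                )
    return findings


if __name__ == "__main__":
    import sys

    # Usage: python scan_log.py path/to/error.log
    with open(sys.argv[1], encoding="utf-8", errors="replace") as handle:
        for number, signature, agent in scan(handle):
            print(f"line {number}: {signature} ({SIGNATURES[signature]}), agent={agent}")
```

Run it against the error.log of each Author and Publish instance; the log location depends on your installation.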