Objective
This article describes the most common AEM Assets ingestion issue scenarios and how to analyze them.
- High Processing
- High Volume
- Large DAM Repositories
- Many Concurrent Authors
Scenario 1: High Processing
Situations such as bulk imports, for example ingesting 2,000 images at once, cause high CPU and memory consumption on the author instance.
Solution: Offload processing jobs to another AEM instance. You can offload entire workflows or just a few heavy steps by connecting a processing instance to the primary author instance via DAM proxy workers. The primary author instance thereby remains free to serve other users. DAM proxy workers supervise the remote tasks, gather the results, and feed them back into the local workflow execution.
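Purely as an illustration of the pattern that offloading builds on, the sketch below uses the generic Sling job API: the author instance queues a job for a topic, and a worker component on the processing instance consumes it. In the product this wiring is handled by the offloading framework and the DAM proxy workers themselves; the topic name dam/sample/heavy-processing, the assetPath property, and the class names are assumptions made up for this example.

```java
import java.util.Map;

import org.apache.sling.event.jobs.Job;
import org.apache.sling.event.jobs.JobManager;
import org.apache.sling.event.jobs.consumer.JobConsumer;
import org.osgi.service.component.annotations.Component;
import org.osgi.service.component.annotations.Reference;

// Runs on the author instance: queues the heavy work as a job instead of doing it inline.
@Component(service = HeavyProcessingDispatcher.class)
class HeavyProcessingDispatcher {

    @Reference
    private JobManager jobManager;

    void queueProcessing(String assetPath) {
        // The job is persisted and picked up by whichever topology instance
        // registers a consumer for this (made-up) topic.
        jobManager.addJob("dam/sample/heavy-processing", Map.of("assetPath", assetPath));
    }
}

// Runs on the processing instance: does the expensive work off the primary author.
@Component(service = JobConsumer.class,
           property = { JobConsumer.PROPERTY_TOPICS + "=dam/sample/heavy-processing" })
class HeavyProcessingWorker implements JobConsumer {

    @Override
    public JobResult process(Job job) {
        String assetPath = job.getProperty("assetPath", String.class);
        // ... generate renditions / extract metadata for assetPath here ...
        return JobResult.OK; // OK completes the job; FAILED lets the queue retry it
    }
}
```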
Scenario 2: High Volume
Consider an instance with a database of a few million products that receives around 12,000 modifications per day. The repository becomes the bottleneck in such scenarios: while writes are happening, reads are blocked for consistency purposes.
Solution: To prevent this situation, segregate the import process on a dedicated author instance with its own repository. At completion, replicate the full delta to the author environment, with chained replication to the publish environment if necessary. Use a reserved replication queue so that the bulk activations do not delay important editorial changes from being published.
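As a minimal sketch of the reserved-queue idea, the snippet below activates an imported path through a dedicated replication agent only, so the bulk delta queues separately from regular editorial activations. It assumes such an agent has already been created; the agent id bulk-import and the class name are made up for this example.

```java
import javax.jcr.Session;

import org.osgi.service.component.annotations.Component;
import org.osgi.service.component.annotations.Reference;

import com.day.cq.replication.AgentIdFilter;
import com.day.cq.replication.ReplicationActionType;
import com.day.cq.replication.ReplicationException;
import com.day.cq.replication.ReplicationOptions;
import com.day.cq.replication.Replicator;

@Component(service = DeltaReplicationHelper.class)
public class DeltaReplicationHelper {

    @Reference
    private Replicator replicator;

    /** Activates one imported path through the dedicated "bulk-import" agent only. */
    public void replicateDelta(Session session, String path) throws ReplicationException {
        ReplicationOptions options = new ReplicationOptions();
        options.setFilter(new AgentIdFilter("bulk-import")); // keep the editorial queue free
        replicator.replicate(session, ReplicationActionType.ACTIVATE, path, options);
    }
}
```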
Scenario 3: Large DAM Repositories
Huge repositories, for example over 7 million assets, 20 million nodes, and 15 TB of disk usage, degrade the instance performance.
Solution: Split the repository into a persistent store and a data store (the latter is optimized for handling large binaries). The persistent store requires very low-latency I/O, so local storage works best; for the data store, higher latency is acceptable.
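In AEM this split is normally configured through OSGi (a segment or document node store plus a file or cloud data store), not in code. Purely to make the separation concrete, the sketch below uses the underlying Oak APIs to wire a segment store on fast local disk to a file data store on higher-latency shared storage; the paths shown are assumptions for this example.

```java
import java.io.File;

import org.apache.jackrabbit.core.data.FileDataStore;
import org.apache.jackrabbit.oak.plugins.blob.datastore.DataStoreBlobStore;
import org.apache.jackrabbit.oak.segment.SegmentNodeStoreBuilders;
import org.apache.jackrabbit.oak.segment.file.FileStore;
import org.apache.jackrabbit.oak.segment.file.FileStoreBuilder;
import org.apache.jackrabbit.oak.spi.state.NodeStore;

public class SplitStoreExample {

    public static void main(String[] args) throws Exception {
        // Data store for large binaries: higher-latency storage (e.g. a NAS mount) is acceptable.
        FileDataStore dataStore = new FileDataStore();
        dataStore.setPath("/mnt/shared/datastore");
        dataStore.init("/mnt/shared/datastore");

        // Persistent (segment) store for nodes and properties: keep on low-latency local disk.
        FileStore segmentStore = FileStoreBuilder
                .fileStoreBuilder(new File("/opt/aem/crx-quickstart/repository/segmentstore"))
                .withBlobStore(new DataStoreBlobStore(dataStore))
                .build();

        NodeStore nodeStore = SegmentNodeStoreBuilders.builder(segmentStore).build();
        // ... hand nodeStore to Oak / a repository here ...
        segmentStore.close();
    }
}
```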
Scenario 4: Many Concurrent Authors
Concurrent authors are users who are actively working on the system; logged-in but inactive authors do not place additional load on it. Operations such as editing pages, uploading assets, triggering workflows, searching, downloading assets, and modifying metadata consume CPU and memory, so a large number of concurrent authors impacts both performance and processing.
Solution: Form a cluster of author instances with a dispatcher in front to distribute the CPU load evenly. With a large number of authors in active production, it is recommended to spin off each project into a separate author instance or environment in which the work in progress takes place. This technique is called content partitioning.