How to set up a cold standby instance in AEM
Environment
AEM 6.0, 6.1, 6.2, 6.3 and later versions
Setting up the Primary instance
- Set up a directory for the primary instance of the Cold Standby setup.
- In the primary instance directory, unpack AEM with the command: java -jar quickstart.jar -unpack
- Download install.zip and place it inside the crx-quickstart directory.
- Extract the contents of the attached install.zip archive to create an install folder.
- Start the primary instance with the command: java -jar quickstart.jar -r primary,crx3,crx3tar
- Wait for the instance to be up and running.
- Navigate to the Web Console at http://host:port/system/console/slinglog and create a debug logger for the package:
- org.apache.jackrabbit.oak.plugins.segment (AEM 6.0 to AEM 6.2) or
- org.apache.jackrabbit.oak.segment (AEM 6.3 and higher)
- Name the log file tarmk-coldstandby.log.
- Navigate to the Web Console at http://host:port/system/console/configMgr, search for the service that matches your version, and check that the mode and the other settings are correct for the primary instance:
- Apache Jackrabbit Oak TarMK Cold Standby (AEM 6.0 to 6.2)
- Apache Jackrabbit Oak Segment Tar Cold Standby Service (AEM 6.3 and higher)
- Navigate to http://host:port/system/console/status-slingsettings to confirm that the primary run mode is listed.
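The run mode check in the last step can also be scripted. The following is a minimal sketch, not an official tool: has_run_mode is a hypothetical helper, and it assumes the Sling Settings status output contains a line of the form "Run Modes = [...]", which may differ between versions.

```shell
#!/bin/sh
# has_run_mode MODE
# Reads Sling Settings status text on stdin and succeeds if MODE is listed
# on the "Run Modes" line. Hypothetical helper; the "Run Modes = [...]"
# layout is an assumption about the status page output.
has_run_mode() {
    mode="$1"
    # Keep the "Run Modes" line, drop brackets and spaces, put each entry
    # on its own line, then look for an exact match.
    grep 'Run Modes' | tr -d '[] ' | tr ',=' '\n\n' | grep -qx "$mode"
}
```

One possible use is to pipe the page content into the function, for example `curl -s -u user:password http://host:port/system/console/status-slingsettings | has_run_mode primary`, where user:password, host, and port are placeholders. The same check with `standby` applies to the standby instance.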
Setting up the Standby instance
1. Set up a directory for the standby instance of the Cold Standby setup.
2. Shut down the primary instance and back up the crx-quickstart directory.
3. Restart the primary instance.
4. Copy the crx-quickstart directory from the primary instance to the standby directory.
Note: The copy should already contain the install folder and the other configurations.
5. Find the sling.id file in the copied crx-quickstart directory and delete it.
6. Start the standby instance with java -jar quickstart.jar -r standby,crx3,crx3tar and wait for the instance to be up and running.
7. Navigate to the Web Console at http://host:port/system/console/configMgr, search for the service that matches your version, and check that the mode and the other settings are correct and consistent with the primary instance:
a. Apache Jackrabbit Oak TarMK Cold Standby (AEM 6.0 to 6.2)
b. Apache Jackrabbit Oak Segment Tar Cold Standby Service (AEM 6.3 and higher)
Note: If the configuration still shows the mode as primary, change it to standby, save the configuration, and restart the standby AEM instance. This one-time action is needed at setup time because the instance was copied from the primary.
8. Navigate to http://host:port/system/console/status-slingsettings to confirm that the standby run mode is listed.
9. Tail error.log and tarmk-coldstandby.log to follow the communication between the primary and standby instances.
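Steps 4 and 5 above (copying crx-quickstart and deleting sling.id) could be scripted roughly as follows. This is a sketch under assumptions: prepare_standby is a hypothetical helper, both directory arguments are placeholders, and the source directory is assumed not to be changing while it is copied (for example, the backup taken in step 2).

```shell
#!/bin/sh
# prepare_standby PRIMARY_DIR STANDBY_DIR
# Copies crx-quickstart from PRIMARY_DIR into STANDBY_DIR and deletes the
# copied sling.id so the standby does not reuse the primary's ID.
# Hypothetical helper; both arguments are placeholders.
prepare_standby() {
    primary_dir="$1"
    standby_dir="$2"
    [ -d "$primary_dir/crx-quickstart" ] || {
        echo "no crx-quickstart in $primary_dir" >&2; return 1; }
    # Refuse to copy into an existing crx-quickstart to avoid nesting.
    [ ! -e "$standby_dir/crx-quickstart" ] || {
        echo "$standby_dir/crx-quickstart already exists" >&2; return 1; }
    mkdir -p "$standby_dir" || return 1
    # -R copies recursively, -p preserves timestamps and permissions.
    cp -Rp "$primary_dir/crx-quickstart" "$standby_dir/" || return 1
    # Remove every copied sling.id file, wherever it sits in the tree.
    find "$standby_dir/crx-quickstart" -type f -name sling.id -exec rm -f {} +
}
```

The original sling.id in the primary directory is left untouched; only the copy is removed.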
Verifying the Standby instance
Test the standby setup:
- Navigate to http://host:port/assets.html/content/dam on the primary instance and upload an image.
- Wait a few minutes and check the same path on the standby instance.
- If the image has been synced, the Cold Standby setup works as expected.
If any issues arise, verify the setup with the steps below:
- Make sure the primary instance was started with the run mode "primary".
- Monitor tarmk-standby.log and verify that a message similar to one of the following appears:
*INFO* [FelixStartLevel] org.apache.jackrabbit.oak.plugins.segment.standby.store.StandbyStoreService started primary on port 8023 with allowed ip ranges [0.0.0.0-255.255.255.255].
or
*INFO* [FelixStartLevel] org.apache.jackrabbit.oak.plugins.segment.standby.store.StandbyStoreService started primary on port 8023 with allowed ip ranges [].
- On the primary instance, navigate to http://host:port/system/console/jmx/ and search for "Standby".
The following should appear:
Mode: primary
Status: running
Running: true
- Make sure the standby instance was started in the "standby" run mode.
- Monitor tarmk-standby.log and verify that a message similar to the following appears:
(The IP address will be the one you set up in install.standby\org.apache.jackrabbit.oak.plugins.segment.standby.store.StandbyStoreService.config)
*INFO* [FelixStartLevel] org.apache.jackrabbit.oak.plugins.segment.standby.store.StandbyStoreService started standby sync with 127.0.0.1:8023 at 5 sec.
*INFO* [CM Event Dispatcher (Fire ConfigurationEvent: pid=org.apache.jackrabbit.oak.plugins.segment.standby.store.StandbyStoreService)] org.apache.jackrabbit.oak.plugins.segment.standby.store.StandbyStoreService started standby sync with 127.0.0.1:8023 at 5 sec.
- On the standby instance, navigate to http://host:port/system/console/jmx/ and search for "Standby".
The following should appear:
FailedRequests: 0
SecondsSinceLastSuccess: 2
Mode client: dd46f264-78ec-44f6-b3f6-ad339e13d1fa
Status: running
Running: true
- Back on the primary instance, navigate to http://host:port/system/console/jmx/ and search for "Standby". Two Standby records should now appear: one for the primary, as checked earlier, and another for the standby with properties like the following:
TransferredBinariesBytes 0
TransferredSegmentBytes 7229892
TransferredBinaries 0
TransferredSegments 247
LastSeenTimestamp Wed Oct 05 13:38:01 EDT 2016
RemotePort 50446
RemoteAddress 127.0.0.1
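
The log checks above can be approximated with a small script. This is a sketch only: standby_log_role is a hypothetical helper, the log file path is a placeholder, and the matched phrases are taken from the sample messages in this article, so they may differ in other versions.

```shell
#!/bin/sh
# standby_log_role LOGFILE
# Prints "primary" or "standby" according to the most recent
# "started primary on port" / "started standby sync with" line in LOGFILE,
# and fails if neither message is present. Hypothetical helper.
standby_log_role() {
    logfile="$1"
    # Keep only the last matching startup message in the file.
    last=$(grep -E 'started (primary on port|standby sync with)' "$logfile" | tail -n 1)
    case "$last" in
        *'started primary on port'*)   echo primary ;;
        *'started standby sync with'*) echo standby ;;
        *) return 1 ;;
    esac
}
```

For example, `standby_log_role crx-quickstart/logs/tarmk-standby.log` would be expected to print the role the instance last started in, assuming the log file lives at that path.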
Making the Standby instance Primary
- Shut down the standby instance.
- Start the instance with the command: java -jar quickstart.jar -r primary,crx3,crx3tar
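
The start commands in this article differ only in the first run mode. A small sketch that makes this explicit, assuming a hypothetical helper named aem_start_command that only prints the command so it can be reviewed before it is run:

```shell
#!/bin/sh
# aem_start_command ROLE
# Prints the start command for ROLE ("primary" or "standby") using the run
# modes given in this article. Hypothetical helper; it does not start AEM.
aem_start_command() {
    case "$1" in
        primary|standby)
            echo "java -jar quickstart.jar -r $1,crx3,crx3tar" ;;
        *)
            echo "usage: aem_start_command primary|standby" >&2
            return 1 ;;
    esac
}
```

Promoting the standby then amounts to shutting it down and running the command printed by `aem_start_command primary` from the standby's directory.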
Information to provide when raising a ticket
When raising a ticket in the support portal, qualify the issue as precisely as possible, following the guidelines in the related KB article.
In addition, include the install folder in zip format and tarmk-standby.log from both the primary and standby instances.