This article explains how to fail back a Linux Protection Group from a Recovery Datacenter to its Production Datacenter. Make sure that the production site is available before initiating a Failback.
NOTE: Failback is a disruptive operation because the Recovery Server is powered OFF and the Production Server is powered ON during the operation.
- Verify that the protection group is in a healthy state.
- Verify that the Production server is available for failback.
This article assumes that the Protection group is in a failed over state and the production VM is available to initiate a failback.
For the purpose of this article, we are using Ubuntu 14 as a production server in CenturyLink's CA2 (Toronto) production datacenter. The recovery site is CenturyLink's WA1 (Washington) recovery datacenter.
Once the production site is ready for failback, right-click the protection group on the DR Site and click Failback.
Leave the Automated Power Operation box checked and click Next. If the production server is still ON, this automatically shuts it down to avoid a "split-brain" scenario in which both the production and recovery servers are ON at the same time.
Synchronization of data from the DR SRN to the Production SRN is initiated. Synchronizing all of the changes on the DR side with the production side may take some time, depending on how much data was written to the Recovery Server after Failover.
Wait until the Failback Resync is complete. Then, click Next.
Uncheck the Auto-Stub configuration box and click Next.
Check the boxes for Manual setup needed and Manual Shutdown needed. Then, click Next.
Leave the Skip Test Failover box unchecked. Choose a clean and 0 Bytes checkpoint. Then, click Next.
Wait until Test Failover and Power On have completed. The production VM should now be powered ON.
The Production server is now configured to iSCSI boot using the disks of the Production SRN instead of its own local disks. We strongly recommend taking a snapshot of the Production VM at this point from the CenturyLink Control Portal.
Log in to the production server.
Go to the "Safehaven_linux_onboarding_scripts" directory.
Run the makestub.sh script with the -d option to use the default parameters.
After the script finishes running successfully, reboot the server.
Run lsblk and verify that the server is now booting from the iSCSI target. In the image below, the boot disk is sdd (the iSCSI disk) instead of sda (the local disk).
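The on-server steps above can be sketched in shell as follows. This is only a sketch under stated assumptions, not part of the SafeHaven tooling. The parent path in `SCRIPT_DIR` is hypothetical, since the article names the directory but not where it lives. The helper names `run_makestub` and `root_disk` are hypothetical as well. The script may need root privileges, and `root_disk` assumes the installed `lsblk` supports `-P` (key="value" output). Nothing runs until a function is called.

```shell
#!/bin/sh

# ASSUMPTION: the parent path is hypothetical; adjust it to wherever the
# onboarding scripts were copied on the production server.
SCRIPT_DIR="${SCRIPT_DIR:-$HOME/Safehaven_linux_onboarding_scripts}"

# Run the stub script with default parameters (-d), as described above.
# A subshell keeps the caller's working directory unchanged.
# ASSUMPTION: the script is executable; it may also need root privileges.
run_makestub() {
    ( cd "$SCRIPT_DIR" && ./makestub.sh -d )
}

# Print the name of the whole disk that carries the root filesystem.
# Reads the output of `lsblk -P -o NAME,TYPE,MOUNTPOINT` on stdin.
# lsblk lists each disk before its partitions, so the most recently seen
# TYPE="disk" line is the parent of the entry mounted at "/".
root_disk() {
    awk '
        /TYPE="disk"/       { split($1, parts, "\""); disk = parts[2] }
        /MOUNTPOINT="\/"$/  { print disk; exit }
    '
}
```

A typical sequence would be to call `run_makestub`, reboot the server once the script succeeds, and then run `lsblk -P -o NAME,TYPE,MOUNTPOINT | root_disk`. With the disk layout described in this article, the result should be `sdd` rather than `sda`.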
Click Run Test Failover delete to delete the test failover clone. This automatically shuts down the production server after the clone is deleted.
After Power Off and Delete Failover Clone are complete, click Next.
Confirm that the Unsynchronized Data is 0 Bytes and the Connection Status is Active. Then, click Next.
Wait until the Connection Status changes to Active. Then, click Finish to exit the wizard.
THE NEXT STEP IS BOOTING FROM THE PRIMARY DATASTORE.