Disaster Recovery (4.3)

If disaster strikes, it might be necessary to restore the system from backup, using the following steps.

  1. Recreate any lost IaC-managed infrastructure resources and the EKS cluster
    Time estimate: 1.5 hours

  2. Restore EFS data from AWS Backup. This is described in https://docs.aws.amazon.com/aws-backup/latest/devguide/restoring-efs.html.
    Time estimate: 1 hour

  3. Because MemoryDB supports replication across availability zones by default, it is usually not necessary to recover the database from a snapshot. In case of a multi-availability-zone failure, or accidental deletion or corruption of the database, it could be necessary to restore a database snapshot. This is described in https://docs.aws.amazon.com/memorydb/latest/devguide/snapshots-restoring.html.
    Time estimate: 0-30 minutes

  4. Restore the RDS database using one of the following:

    1. Point-in-time recovery (PITR), if available. This is described in https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/USER_PIT.html.
      Important: The point in time that the database is restored to must be earlier than the oldest snapshot used in steps 2 and 3.

    2. Snapshot recovery. If the data needed to perform PITR is not available (for example, after a multi-AZ failure or a manual error), the RDS database needs to be recovered from a snapshot. This is described in https://docs.aws.amazon.com/aws-backup/latest/devguide/restoring-rds.html.
      Time estimate: 30 minutes for PITR, 1 hour for snapshot recovery

  5. Reinstall the platform application and configure it to use the RDS database recovered in step 4
    Time estimate: 30 minutes

  6. Re-deploy solution deployment resources
    The solution resources ("ECDs") need to be reinstalled. The recommendation is to store ECDs as Helm charts in version control and deploy them through an automated CI/CD pipeline. For a description of how to set up such a CI/CD pipeline, see Install the Example CI/CD Pipeline (4.3).
    Time estimate: 30 minutes
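
The restore-time constraint in step 4 can be sketched in Python as follows. This is a minimal illustration, not part of the product: the helper names `choose_restore_time` and `build_pitr_request` and the one-minute safety margin are assumptions made for the example. The request keys match the keyword arguments of the boto3 RDS client call `restore_db_instance_to_point_in_time`, assuming the database is an RDS DB instance rather than an Aurora cluster.

```python
from datetime import datetime, timedelta, timezone


def choose_restore_time(snapshot_times, latest_restorable,
                        margin=timedelta(minutes=1)):
    """Pick a PITR target earlier than the oldest snapshot from steps 2 and 3.

    snapshot_times: creation times of the EFS and MemoryDB snapshots used.
    latest_restorable: the latest time RDS reports it can restore to.
    margin: hypothetical safety margin subtracted from the oldest snapshot.
    """
    if not snapshot_times:
        raise ValueError("at least one snapshot time is required")
    candidate = min(snapshot_times) - margin
    # Never request a time later than what RDS can actually restore.
    return min(candidate, latest_restorable)


def build_pitr_request(source_id, target_id, restore_time):
    """Build keyword arguments for restore_db_instance_to_point_in_time."""
    return {
        "SourceDBInstanceIdentifier": source_id,
        "TargetDBInstanceIdentifier": target_id,
        "RestoreTime": restore_time,
    }


if __name__ == "__main__":
    # Example timestamps and identifiers are made up for illustration.
    efs_snapshot = datetime(2024, 5, 1, 3, 0, tzinfo=timezone.utc)
    memorydb_snapshot = datetime(2024, 5, 1, 2, 30, tzinfo=timezone.utc)
    latest = datetime(2024, 5, 1, 4, 0, tzinfo=timezone.utc)

    when = choose_restore_time([efs_snapshot, memorydb_snapshot], latest)
    request = build_pitr_request("platform-db", "platform-db-restored", when)
    print(request)
    # With AWS credentials configured, the request could be submitted with:
    #   boto3.client("rds").restore_db_instance_to_point_in_time(**request)
```

Keeping the time selection separate from the AWS call makes the constraint easy to check before anything is restored.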

Total estimated time for full recovery (RTO): 5 hours
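
The total can be checked by summing the upper-bound estimate of each step. The short Python sketch below assumes that the steps run one after another and that the slower option applies in steps 3 and 4 (a MemoryDB snapshot restore and an RDS snapshot recovery). The step labels are abbreviated for the example.

```python
# Upper-bound estimates in minutes, taken from the steps above.
STEP_ESTIMATES_MIN = [
    ("1. Recreate infrastructure and EKS cluster", 90),
    ("2. Restore EFS data", 60),
    ("3. Restore MemoryDB snapshot (worst case)", 30),
    ("4. Restore RDS from snapshot (worst case)", 60),
    ("5. Reinstall platform application", 30),
    ("6. Re-deploy solution resources", 30),
]


def total_rto_hours(estimates):
    """Sum per-step estimates (in minutes) and return the total in hours."""
    return sum(minutes for _, minutes in estimates) / 60


print(total_rto_hours(STEP_ESTIMATES_MIN))  # 5.0
```

If MemoryDB needs no restore (0 minutes) and RDS can use PITR (30 minutes), the same sum gives 4 hours.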
