Backup (4.3)

Resources to Back Up

Since an installation in AWS is done using Infrastructure as Code, it is not necessary to back up the application itself. The resources that need to be backed up are:

EFS

If you have used another NFS solution instead of EFS (for example, the FSx suite), it is fully comparable; however, the backup instructions must be looked up separately.

System database (hosted in RDS or, in the case of Derby, on EFS disk storage)
Disk-based Aggregation data (stored on EFS storage)
Database-based Aggregation data (stored in Redis storage, such as MemoryDB)
Duplicate UDR detection data (stored on EFS disk storage)

General Requirements

Regardless of the method chosen for the provisioning of backup and restore operations, there are requirements that must be met:

When an instance is restored from a backup image, the database backup must be older than the disk backup. This avoids database references to data that does not exist on disk.
Temporary files must be excluded from the backups. This is to avoid inconsistent or partial data.
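The ordering requirement above can be checked before a restore is started. The following is a minimal sketch in Python, assuming that the completion timestamps of both backups are available; the function name and the example timestamps are hypothetical and not part of the product.

```python
from datetime import datetime, timezone


def is_consistent_restore_pair(db_backup_time: datetime, disk_backup_time: datetime) -> bool:
    """Return True when the database backup is older than the disk backup.

    An older database backup cannot reference files that were created
    after the disk backup was taken.
    """
    return db_backup_time < disk_backup_time


# Hypothetical timestamps for illustration only.
db_backup = datetime(2024, 1, 10, 1, 55, tzinfo=timezone.utc)
disk_backup = datetime(2024, 1, 10, 2, 5, tzinfo=timezone.utc)

if not is_consistent_restore_pair(db_backup, disk_backup):
    raise SystemExit("Database backup is newer than the disk backup; choose another pair.")
```

If the check fails, select an earlier database recovery point or a later disk recovery point before restoring.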

Temporary Data

Temporary files can be identified by their file names or by the folders in which they are stored:

Temporary files – Prefixed with "TEMP."
DupUDR function temporary files – Stored in a separate folder called "tmp".
InterWorkflow temporary files – Stored in a separate folder called "DR_TMP_DIR".
Archiving function files – Stored in a separate folder called "pending".
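The naming rules above can be turned into an exclusion filter for a file-level backup tool. The sketch below is an illustration only: it assumes that the trailing dot is part of the "TEMP." prefix, that the folder names match exactly, and that paths use forward slashes. The function name and example paths are hypothetical.

```python
from pathlib import PurePosixPath

# Assumption: the prefix includes the trailing dot, as quoted in the list above.
TEMP_FILE_PREFIX = "TEMP."
# Folder names taken from the list above.
TEMP_FOLDERS = {"tmp", "DR_TMP_DIR", "pending"}


def is_temporary(path: str) -> bool:
    """Return True if the file should be excluded from the backup."""
    p = PurePosixPath(path)
    # Rule 1: the file name itself carries the temporary prefix.
    if p.name.startswith(TEMP_FILE_PREFIX):
        return True
    # Rule 2: the file is located under one of the temporary folders.
    return any(part in TEMP_FOLDERS for part in p.parts[:-1])


# Hypothetical paths on an EFS mount.
candidates = [
    "/mnt/efs/agg/TEMP.session_17",
    "/mnt/efs/dupudr/tmp/batch_001.dat",
    "/mnt/efs/agg/session_17",
]
to_back_up = [c for c in candidates if not is_temporary(c)]
print(to_back_up)
```

Snapshot-based backups such as AWS Backup copy the whole file system, so a filter like this is mainly relevant for file-level tools or for verifying a restored file system.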

Creating a Backup Plan

1. Set Up Cold Backup Window

If EFS is used to store Aggregation, Duplicate UDR, or InterWorkflow data, backups of EFS should be taken as "cold backups". This means that workflow processing must be suspended for the duration of the backup by using the Cold Backups (4.3) functionality. Follow those instructions to set up a cold backup window at a specified time. During the cold backup window, no processing takes place in the system.

If no disk-based storage is used for Aggregation, Duplicate UDR, or InterWorkflow data, there is no need to configure cold backups.

2. Configure EFS Backup

If AWS Backup is available in your region, the recommendation is to back up the EFS file system using AWS Backup, see https://docs.aws.amazon.com/efs/latest/ug/awsbackup.html.

See https://docs.aws.amazon.com/aws-backup/latest/devguide/create-auto-backup.html for details.

Configure the following:

Schedule – When the backup occurs. Make sure to align this with the cold backup window defined in the previous step.
Backup window – The window of time in which the backup needs to start. Make sure to align this with the cold backup window defined in the previous step.
Lifecycle – When to move a recovery point to cold storage and when to delete it.
Backup vault – Used to organize recovery points created by the backup rule.
Backups should be taken as snapshot backups. That is, do not enable "continuous backups for point-in-time recovery (PITR)", as that can lead to unpredictable behaviour.
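The settings above map onto the fields of an AWS Backup plan rule. The following Python sketch builds such a plan document and checks two AWS Backup constraints: a start window of at least 60 minutes, and a retention that is at least 90 days longer than the transition to cold storage. The plan name, vault name, schedule, and retention values are hypothetical; the schedule is assumed to be in UTC and must be replaced with the time of your own cold backup window.

```python
import json


def build_efs_backup_plan(plan_name: str, vault_name: str, schedule_cron: str,
                          start_window_minutes: int,
                          cold_storage_after_days: int,
                          delete_after_days: int) -> dict:
    """Build a backup plan document with one snapshot-style rule."""
    if start_window_minutes < 60:
        raise ValueError("the start window must be at least 60 minutes")
    if delete_after_days < cold_storage_after_days + 90:
        raise ValueError("retention must be at least 90 days longer than the cold storage transition")
    return {
        "BackupPlanName": plan_name,
        "Rules": [{
            "RuleName": f"{plan_name}-daily",          # hypothetical rule name
            "TargetBackupVaultName": vault_name,       # Backup vault
            "ScheduleExpression": schedule_cron,       # Schedule
            "StartWindowMinutes": start_window_minutes,  # Backup window
            "Lifecycle": {                             # Lifecycle
                "MoveToColdStorageAfterDays": cold_storage_after_days,
                "DeleteAfterDays": delete_after_days,
            },
            "EnableContinuousBackup": False,           # snapshot backups only, no PITR
        }],
    }


# Example: daily at 02:00 UTC, assumed to match the cold backup window.
plan = build_efs_backup_plan("efs-cold-backup", "efs-vault",
                             "cron(0 2 * * ? *)", 60, 30, 120)
print(json.dumps(plan, indent=2))
```

A document of this shape can be passed as the BackupPlan argument of the boto3 create_backup_plan call or adapted for Infrastructure as Code. Assigning the EFS file system to the plan is a separate step and is not shown here.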

If AWS Backup is not available as a service in your region, the recommendation is to use EFS to EFS backup, see https://aws.amazon.com/solutions/implementations/efs-to-efs-backup-solution/.

3. Configure RDS Backup

The RDS service hosts the system database in PostgreSQL. RDS has built-in point-in-time backup functionality that is enabled by default. This allows the database to be restored to a point in time that is typically no more than 5 minutes old, as long as the backup data exists within the availability zone. See https://aws.amazon.com/rds/features/backup/ for details.

To protect against availability zone failures, a production setup should always be configured with a Standby Instance.

To protect against the unlikely case of a region failure, cross-region backup can be configured using AWS Backup. See https://aws.amazon.com/getting-started/hands-on/amazon-rds-backup-restore-using-aws-backup/. If AWS Backup is not available as a service in your region, you can instead rely on the built-in RDS backup mechanism and manually configure the backup in S3 to be replicated to another region.
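The two RDS recommendations above (automated backups enabled and a standby instance configured) can be verified against the instance description that RDS returns. The sketch below works on a dictionary shaped like one entry of a describe-db-instances response; the instance identifier and values are hypothetical, and the function itself is an illustration rather than part of the product.

```python
def check_rds_backup_settings(db_instance: dict) -> list[str]:
    """Return a list of findings; an empty list means both checks passed."""
    findings = []
    # A retention period of 0 means that automated backups are disabled.
    if db_instance.get("BackupRetentionPeriod", 0) < 1:
        findings.append("automated backups are disabled (BackupRetentionPeriod is 0)")
    # MultiAZ indicates whether a standby instance exists in another availability zone.
    if not db_instance.get("MultiAZ", False):
        findings.append("no standby instance (MultiAZ is false)")
    return findings


# Hypothetical instance description for illustration only.
instance = {
    "DBInstanceIdentifier": "platform-db",
    "BackupRetentionPeriod": 7,
    "MultiAZ": False,
}
for finding in check_rds_backup_settings(instance):
    print(f"{instance['DBInstanceIdentifier']}: {finding}")
```

In a real check, the dictionary would come from the RDS API or CLI output instead of being written by hand.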

4. Configure MemoryDB for Redis Backup

MemoryDB for Redis is an optional service used for aggregation and distributed storage. MemoryDB has built-in cross-availability zone replication with up to 5 replicas in different availability zones. This does not have to be configured.

To protect against failures that affect multiple availability zones (unlikely) or human errors (more likely), snapshot backups to a different region can be enabled. This is described in https://docs.aws.amazon.com/memorydb/latest/devguide/snapshots-automatic.html. Follow these instructions to configure the following:

Snapshot window – A period during each day when MemoryDB begins creating a snapshot. The minimum length for the snapshot window is 60 minutes. Although not required for consistency, it is recommended to align this with your cold backup window (if enabled) as configured in step 1.
Snapshot retention limit – The number of days the snapshot is retained in Amazon S3.
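The 60-minute minimum for the snapshot window can be validated before the settings are applied. The Python sketch below assumes that the window is written in the "HH:MM-HH:MM" form in UTC and that a retention limit of 0 would turn automatic snapshots off; the cluster name and values are hypothetical.

```python
from datetime import datetime, timedelta


def snapshot_window_minutes(window: str) -> int:
    """Return the length in minutes of a window written as 'HH:MM-HH:MM'."""
    start_text, end_text = window.split("-")
    start = datetime.strptime(start_text, "%H:%M")
    end = datetime.strptime(end_text, "%H:%M")
    if end <= start:
        # The window crosses midnight.
        end += timedelta(days=1)
    return int((end - start).total_seconds() // 60)


def build_snapshot_settings(cluster_name: str, window: str, retention_days: int) -> dict:
    """Build the snapshot settings for a cluster after validating them."""
    if snapshot_window_minutes(window) < 60:
        raise ValueError("the snapshot window must be at least 60 minutes long")
    if retention_days < 1:
        # Assumption: a limit of 0 would disable automatic snapshots.
        raise ValueError("the retention limit must be at least 1 day")
    return {
        "ClusterName": cluster_name,
        "SnapshotWindow": window,
        "SnapshotRetentionLimit": retention_days,
    }


# Example: window assumed to match a cold backup window starting at 02:00 UTC.
settings = build_snapshot_settings("aggregation-cluster", "02:00-03:00", 7)
print(settings)
```

The resulting keys follow the parameter names of the MemoryDB cluster update operation, so the dictionary can be adapted for the AWS SDK or for Infrastructure as Code.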

5. Kubernetes Managed Resources

Kubernetes managed resources are:

Usage Engine platform application
ECDs – Solution deployment descriptors
Supporting applications like Prometheus, Grafana, and Elasticsearch (note that the state and data held by these supporting applications are outside the scope of this document)

It is recommended to install all Kubernetes managed resources as Helm charts managed by automated CI/CD pipelines. The process to set up such a pipeline is described in Installation Procedure for CI Pipeline (4.3).

If Kubernetes resources are managed by CI/CD, there is no need to back up the deployed resources in the EKS cluster. If, for some reason, CI/CD is not set up, Kubernetes resources should be backed up so that they can be recovered in case of a disaster. To back up Kubernetes managed resources in EKS, the recommended tool is Velero. See the documentation at https://velero.io/ for information on how to set this up.
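Once Velero is installed in the cluster, an on-demand backup of selected namespaces is created with the velero backup create command. The Python sketch below only assembles that command line so it can be reviewed or scheduled; it assumes that the velero CLI is on the path and already configured for the cluster. The backup name and namespace are hypothetical.

```python
import shlex


def velero_backup_command(backup_name: str, namespaces: list[str],
                          ttl_hours: int = 720) -> list[str]:
    """Build the argument list for an on-demand Velero backup."""
    if not namespaces:
        raise ValueError("at least one namespace is required")
    return [
        "velero", "backup", "create", backup_name,
        "--include-namespaces", ",".join(namespaces),  # comma-separated namespace list
        "--ttl", f"{ttl_hours}h",                      # how long the backup is retained
    ]


# Hypothetical backup name and namespace.
command = velero_backup_command("platform-backup", ["uepe"])
print(shlex.join(command))
```

The printed command can be run manually, or the argument list can be passed to subprocess.run from a scheduled job.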

Backup Pricing

Pricing of backup data is based on volume and is defined at https://aws.amazon.com/backup/pricing/ and https://docs.aws.amazon.com/memorydb/latest/devguide/snapshots-costs.html.