This section describes the standard procedure for upgrading Usage Engine on Oracle Cloud Infrastructure. Use this procedure to ensure that configurations, persisted data (in the file system or database), and system properties are properly migrated. Workflows will be briefly interrupted when the ECDs restart. These instructions also describe how to control the timing of the restart of ECDs running non-scalable real-time workflows by enabling manual upgrade.
Preparations
Before doing anything to the running installation, the config file for the new installation should be prepared by following these steps:
Retrieve the values.yaml file that you have used previously or, if you want to start from scratch, extract it from the installation by running this command:
helm -n <namespace> get all <helm name>
For example:
helm -n uepe get all uepe
Where uepe is the helm name you have selected for your installation. You will see a list similar to the one below.
helm list
NAME           NAMESPACE  REVISION  UPDATED                                   STATUS    CHART                               APP VERSION
external-dns   uepe       1         2024-05-08 15:27:48.258978207 +0200 CEST  deployed  external-dns-7.2.0                  0.14.1
ingress-nginx  uepe       1         2024-05-08 16:18:43.919980224 +0200 CEST  deployed  ingress-nginx-4.10.0                1.10.0
uepe           uepe       3         2024-05-10 14:16:17.724426589 +0200 CEST  deployed  usage-engine-private-edition-4.0.0  4.0.0
Extract the values manually from the output. Copy the lines below “USER-SUPPLIED VALUES:” and stop at the blank line before “COMPUTED VALUES:”. Save the copied content to the config file valuesFromSystem.yaml.
Update the helm repository to get the latest helm chart versions by running the following commands:
helm repo list
helm repo update
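The manual copy of the user-supplied values can also be scripted. The following is a minimal sketch, assuming that the output of helm get all prints the “USER-SUPPLIED VALUES:” and “COMPUTED VALUES:” headings at the start of a line, as described above. The function name extract_user_values is made up for this illustration.

```shell
# Hypothetical helper: print only the user-supplied values from the
# output of "helm get all", which is read from standard input.
extract_user_values() {
  awk '
    /^USER-SUPPLIED VALUES:/ { copying = 1; next }   # start after this heading
    /^COMPUTED VALUES:/      { copying = 0 }         # stop at this heading
    !copying                 { next }
    /^[[:space:]]*$/         { pending = pending "\n"; next }  # hold blank lines back
    { printf "%s", pending; pending = ""; print }
  '
}

# Usage (release and namespace are the example names used in this section):
#   helm -n uepe get all uepe | extract_user_values > valuesFromSystem.yaml
```

Helm also provides helm get values <helm name> -n <namespace> -o yaml, which prints the user-supplied values directly. Whichever way you choose, review the resulting file before using it.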
Retrieve the new version from the repository by running the following command. Refer to Release Information for the Helm Chart version.
helm fetch <repo name>/usage-engine-private-edition --version <version> --untar
For example:
helm fetch digitalroute/usage-engine-private-edition --version 4.0.0 --untar
Next, check the file CHANGELOG.md inside the created folder to find out what may have changed in the new version when it comes to the values-file.
If you are uncertain about how to interpret the content of the file, see below for some examples of keys and how to interpret them:
The following values have been removed:
* ```mzOperator.clusterWide```
* ```mzOperator.experimental.performPeriodicWorkflowCleanup```
* ```jmx.remote```
* ```platform.debug.jmx```
means that in the values file these keys appear as:
mzOperator:
  clusterWide:
  experimental:
    performPeriodicWorkflowCleanup:
jmx:
  remote:
platform:
  debug:
    jmx:
Each part of the key does not necessarily follow directly after the previous one, but it always appears before any other “parent” on the same level. So in this example of a values.yaml file:
debug:
  script:
    enabled: false
  log:
    level:
      codeserver: info
      jetty: 'off'
      others: warn
an example of a key could be debug.log.level.jetty.
Make any necessary updates, based on changed fields that you may be using, in the valuesFromSystem.yaml file you got from the existing installation so that it matches the new version.
Take note of any fields that have been deprecated or removed since the last version so that any configuration of those fields can be replaced.
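To make the dotted key notation used in CHANGELOG.md easier to map onto the values file, the sketch below prints a dotted key as nested YAML keys. It is only an illustration; the function name dotted_to_yaml is hypothetical and not part of helm or Usage Engine.

```shell
# Hypothetical helper: print a dotted CHANGELOG key as nested YAML keys.
# The body runs in a subshell so IFS and the positional parameters
# of the calling shell are left untouched.
dotted_to_yaml() (
  set -f            # no filename expansion while splitting
  IFS=.             # split the argument on dots
  set -- $1
  indent=""
  for part in "$@"; do
    printf '%s%s:\n' "$indent" "$part"
    indent="$indent  "
  done
)

dotted_to_yaml debug.log.level.jetty
# prints:
# debug:
#   log:
#     level:
#       jetty:
```

In the actual values file, other keys may sit between these levels, so use the printed structure only to locate the key.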
Note!
Before proceeding with the upgrade, make sure that:
you are logged in and have access to the container registry.
you have a valid image pull secret that allows the Kubernetes cluster to pull the container images from the container registry.
you have updated the Image Pull Secret (if needed) in the valuesFromSystem.yaml file.
you have updated the License Key for the upgrade version in the valuesFromSystem.yaml file.
When you have updated the valuesFromSystem.yaml file, you can test it by running this command:
helm upgrade --install uepe digitalroute/usage-engine-private-edition --atomic --cleanup-on-fail --version 4.0.0 -n uepe -f valuesFromSystem.yaml --dry-run=server
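Because the actual upgrade reuses this command without the --dry-run=server flag, it can help to keep the arguments in one place. The following is a minimal sketch; the variable names and the function uepe_upgrade are hypothetical, and the values are the example values from this section.

```shell
# Values taken from the examples in this section; adjust to your installation.
RELEASE=uepe
NAMESPACE=uepe
CHART=digitalroute/usage-engine-private-edition
CHART_VERSION=4.0.0
VALUES_FILE=valuesFromSystem.yaml

# Hypothetical wrapper: any extra arguments are appended to the helm command.
uepe_upgrade() {
  helm upgrade --install "$RELEASE" "$CHART" \
    --atomic --cleanup-on-fail \
    --version "$CHART_VERSION" \
    -n "$NAMESPACE" \
    -f "$VALUES_FILE" "$@"
}

# Test run:       uepe_upgrade --dry-run=server
# Actual upgrade: uepe_upgrade
```

Using the same function for both runs reduces the risk that the tested command and the real command differ.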
Preparing ECDs
Before you start the actual upgrade, these steps are recommended to avoid issues in processing caused by the restarts during the upgrade:
Disable any batch workflow groups and let any running batch workflows finish their runs.
For real-time workflows, check which types of real-time workflows the ECs are running. If an ECD hosts workflows that allow for scaling and use an ingress for incoming traffic, the ECD will, by default, be upgraded through a rolling upgrade, which means that there will always be at least one workflow running even during the upgrade.
However, if the real-time workflow does not support scaling, for example, because it uses fixed ports or storage that is not shared, the EC will become unavailable for a certain time during the upgrade. To gain control over when the EC becomes unavailable, you can edit the ECD by setting manualUpgrade to true before the upgrade. With this setting, the ECD will keep running on the old version until the upgrade has been performed, and it can then be restarted on the new version in the EC Deployment Interface (4.3).
Example - Editing ECD to Manual Upgrade
Option 1
Run the following commands:
kubectl get ecd -n <namespace>
kubectl edit ecd <ecd-name> -n <namespace>
And change manualUpgrade to true:
spec:
  .....
  manualUpgrade: true
Option 2
Run the following command:
kubectl patch ecdeployment <ecd-name> -n <namespace> --type=merge -p $'spec:\n manualUpgrade: true'
When the upgrade is completed, the ECDs can be upgraded by editing the ECD in Desktop Online.
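If several ECDs need to be switched to manual upgrade, the patch from Option 2 can be applied in a loop. This is a sketch under two assumptions: that the short resource name ecd works as in Option 1, and that a JSON merge patch is acceptable in place of the YAML form shown above. The function name set_manual_upgrade is hypothetical.

```shell
# Hypothetical helper: set manualUpgrade to true on one or more ECDs.
# Usage: set_manual_upgrade <namespace> <ecd-name> [<ecd-name> ...]
set_manual_upgrade() {
  namespace=$1
  shift
  for ecd in "$@"; do
    kubectl patch ecd "$ecd" -n "$namespace" --type=merge \
      -p '{"spec":{"manualUpgrade":true}}'
  done
}

# Example: set_manual_upgrade uepe <ecd-name-1> <ecd-name-2>
```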
Backup and Database Upgrade
When all the running batch workflows have stopped you should make a backup so that the system can be restored in case of any issues during the upgrade.
Note!
Before proceeding with the backup you must shut down the platform. This is very important since otherwise the backup of the database may become corrupt.
The platform can be shut down in various ways; see the examples below.
Examples - Shutting Down the Platform
Option 1
Reduce the number of replicas (under “spec”) to 0 by running the following command:
kubectl edit statefulset platform -n uepe
where uepe is the namespace used.
Option 2
Run this command:
kubectl scale --replicas=0 sts/platform -n uepe
and then this command:
kubectl get pods -n uepe
Ensure that the pod platform-0 is no longer present.
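Instead of checking the pod list by hand, the scale-down and the check can be combined. The sketch below assumes the stateful set is named platform and the pod platform-0, as in the examples above. The function name stop_platform and the timeout of 300 seconds are illustrative choices.

```shell
# Hypothetical helper: scale the platform down and block until the
# pod platform-0 has been deleted, or until the timeout expires.
stop_platform() {
  namespace=$1
  kubectl scale --replicas=0 sts/platform -n "$namespace" &&
    kubectl wait --for=delete pod/platform-0 -n "$namespace" --timeout=300s
}

# Example: stop_platform uepe
```

A non-zero exit status means that the pod was still present when the timeout expired, in which case the backup should not be started yet.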
Back up the database by following these steps:
List the databases and locate the one used for Usage Engine:
oci psql db-system-collection list-db-systems --compartment-id <compartment_OCID> --all
Perform a backup of the database by running the following command:
oci psql backup create --compartment-id <compartment_OCID> --db-system-id <db_system_OCID> --display-name <backup_display_name>
Check if the backup was created successfully by running this command:
oci psql backup-collection list-backups --compartment-id <compartment_OCID> --display-name <backup_display_name> --all
Get details about a backup by running this command:
oci psql backup get --backup-id <backup_OCID>
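Since the upgrade should not start before the backup is complete, the check can be repeated until the backup is ready. The following is a sketch only. It assumes that the JSON returned by oci psql backup get exposes the state as data."lifecycle-state", and that the states are named ACTIVE and FAILED; verify both against the output of your CLI version. The function names and the default polling interval are hypothetical.

```shell
# Hypothetical helper: print the lifecycle state of a backup.
# Assumption: the state is found at data."lifecycle-state" in the JSON output.
backup_state() {
  oci psql backup get --backup-id "$1" \
    --query 'data."lifecycle-state"' --raw-output
}

# Hypothetical helper: poll until the backup is ACTIVE, fail on FAILED.
# Usage: wait_for_backup <backup_OCID> [<seconds between polls>]
wait_for_backup() {
  backup_id=$1
  interval=${2:-30}
  while :; do
    state=$(backup_state "$backup_id") || return 1
    case $state in
      ACTIVE) echo "backup is ACTIVE"; return 0 ;;
      FAILED) echo "backup FAILED" >&2; return 1 ;;
      *)      echo "backup state: $state"; sleep "$interval" ;;
    esac
  done
}
```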
The next step is to do a backup of the file system used.
Note!
If there are standalone ECs that are still running and writing their logs to the same file system, events happening after the backup has been initiated will not be included in the backup.
oci fs snapshot create --file-system-id <file_system_OCID> --name "<snapshot_name>" --expiration-time <2023-09-15T20:30Z>
After creation, snapshots are accessible under the root directory of the file system at .snapshot/<snapshot_name>.
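The value for --expiration-time can be computed instead of typed by hand. The sketch below prints a UTC timestamp a given number of days ahead in the same format as the example value in the snapshot command. It assumes GNU date (the -d option); the function name snapshot_expiry is hypothetical.

```shell
# Hypothetical helper: print a UTC timestamp <days> days from now,
# formatted like 2023-09-15T20:30Z. Requires GNU date.
snapshot_expiry() {
  date -u -d "+$1 days" +%Y-%m-%dT%H:%MZ
}

# Usage:
#   oci fs snapshot create --file-system-id <file_system_OCID> \
#     --name "<snapshot_name>" --expiration-time "$(snapshot_expiry 30)"
```

Choose an expiration that comfortably covers the period during which a rollback may still be needed.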
Upgrade
To perform the actual upgrade, use the same command as the test command described earlier minus the --dry-run=server flag, for example like this:
helm upgrade --install uepe digitalroute/usage-engine-private-edition --atomic --cleanup-on-fail --version 4.0.0 -n uepe -f valuesFromSystem.yaml
If the upgrade was successful, the output will look like this:
helm upgrade --install uepe digitalroute/usage-engine-private-edition --atomic --cleanup-on-fail --version 4.0.0 -n uepe -f valuesFromSystem.yaml
Release "uepe" has been upgraded. Happy Helming!
NAME: uepe
LAST DEPLOYED: Fri May 10 15:02:37 2024
NAMESPACE: uepe
STATUS: deployed
REVISION: 4
TEST SUITE: None
NOTES:
Usage Engine Private Edition 4.0.0 has been deployed successfully!
Check out the CHANGELOG.md in this chart for information about what has been changed, added and removed in this version.
Scale up the platform stateful set again so the platform starts back up using the following command:
kubectl scale --replicas=1 sts/platform -n uepe
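To confirm that the platform has actually started, the scale-up can be followed by a status check. This sketch assumes that the stateful set uses the default RollingUpdate update strategy, which kubectl rollout status requires. The function name start_platform and the timeout of 600 seconds are illustrative.

```shell
# Hypothetical helper: scale the platform up and block until the
# stateful set reports that its pod is ready, or the timeout expires.
start_platform() {
  namespace=$1
  kubectl scale --replicas=1 sts/platform -n "$namespace" &&
    kubectl rollout status sts/platform -n "$namespace" --timeout=600s
}

# Example: start_platform uepe
```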
After Upgrade
When the Usage Engine installation has been upgraded, ensure that any ECDs supporting rolling upgrade are still running as expected. If there are ECDs that have been configured for manual upgrade before the upgrade, see the section below.
Also make sure to enable any batch workflow groups again so that batch processing can resume.
Manual Upgrade of ECDs
If you configured any ECDs for manual upgrade before the upgrade, follow these steps to upgrade these ECDs when the regular upgrade is completed:
Log in to Desktop Online, see Desktop Online User Interface (4.3).
Go to the EC Deployment Interface (4.3) in the Manage view in Desktop Online.
You will see a warning symbol next to the relevant ECDs. Click on the ECD(s) to view the warnings. For each ECD that needs to be upgraded, you will see a message saying that it needs to be upgraded.
Go back to the list of ECDs, click on the three dots to the far right in the ECD row, and select the Upgrade option in the pop-up menu.
Rollback
The rollback procedure should only be carried out if you want to roll back to the previous version. A rollback consists of the following steps:
Restore database backup
Restore file system snapshot
Rollback Usage Engine Private Edition to pre-upgrade version
Restore Database Backup
To restore the backup, run this command:
oci psql db-system restore --db-system-id <db_system_OCID> --backup-id <backup_OCID>
Restore File System Snapshot
To restore a file system snapshot you need to access the snapshots via a Kubernetes Pod.
You must know the mount target IP address prior to mounting the file system onto the Pod. Before running the Pod, retrieve the mount target IP address from PV output:
kubectl get pv <pv-name> -o jsonpath='{.spec.csi.volumeHandle}'|awk -F: '{print $2}'
Make sure to save the IP address, as it will be used as input to mount the file system onto the Pod later.
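The lookup of the mount target IP address can be wrapped in a small function that also rejects unexpected output. The sketch below assumes that the volumeHandle has the form <file system OCID>:<mount target IP>:<path>, as implied by the awk command above, and that the address is IPv4. The function name mount_target_ip is hypothetical.

```shell
# Hypothetical helper: print the mount target IP address of a PV,
# taken from the second colon-separated field of its volumeHandle.
mount_target_ip() {
  ip=$(kubectl get pv "$1" -o jsonpath='{.spec.csi.volumeHandle}' |
    awk -F: '{ print $2 }')
  case $ip in
    "" | *[!0-9.]*)
      echo "unexpected volumeHandle format for PV $1" >&2
      return 1 ;;
  esac
  echo "$ip"
}

# Usage: MOUNT_TARGET_IP=$(mount_target_ip <pv-name>)
```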
To run a Kubernetes Pod, run this command:
kubectl run nfscli --rm --tty -i --restart='Never' --namespace uepe --image oraclelinux --privileged=true --command -- bash
You will see the command prompt if the Pod has been created and is running. The command prompt indicates that you have logged in to the Pod successfully.
user1@user1-MacBook-Pro ~ % kubectl run nfscli --rm --tty -i --restart='Never' --namespace uepe --image oraclelinux --privileged=true --command -- bash
If you don't see a command prompt, try pressing enter.
[root@nfscli /]#
[root@nfscli /]#
On the pod, download and install nfs-utils. These are libraries and executables needed to access the File System Storage.
[root@nfscli /]# yum -y install nfs-utils
On the pod, create the mount directory. This is the directory where the mounted file system is located.
[root@nfscli /]# mkdir -p /mnt/uepe
On the pod, mount the file system with the mount target IP address that you retrieved and saved prior to running the pod.
[root@nfscli /]# mount -o nolock <IP address of the mount target>:/uepe /mnt/uepe
On the pod, list the mount directory.
[root@nfscli /]# ls -al /mnt/uepe/
total 3
drwxrwsr-x. 9 root 6000  7 Aug 14 06:51 .
drwxr-xr-x. 1 root root 18 Aug 15 02:34 ..
drwxrwsr-x. 5 root 6000  3 Aug 15 02:34 .snapshot
drwxrwsr-x. 2 6000 6000  0 Aug 14 03:43 3pp
drwxrwsr-x. 3 6000 6000  1 Aug 15 02:00 backup
drwxrwsr-x. 2 6000 6000  0 Aug 14 03:43 jni
drwxrwsr-x. 2 root 6000  0 Aug 14 03:43 keys
drwxrwsr-x. 5 6000 6000  3 Aug 14 06:57 log
drwxrwsr-x. 3 6000 6000  1 Aug 14 05:12 pico-cache
drwxrwsr-x. 2 6000 6000  0 Aug 14 03:43 storage
On the pod, list the available snapshots. The directories listed under .snapshot/ correspond to the <snapshot_name> values provided during the backup of the file system in the previous section.
[root@nfscli /]# ls -al /mnt/uepe/.snapshot/
total 3
drwxrwsr-x. 5 root 6000 3 Aug 15 02:35 .
drwxrwsr-x. 9 root 6000 7 Aug 14 06:51 ..
drwxr-xr-x. 2 root 6000 1 Aug  8 07:26 snapshot-aug-1
drwxr-xr-x. 2 root 6000 2 Aug  8 10:01 snapshot-aug-2
drwxrwsr-x. 9 root 6000 7 Aug 14 06:51 snapshot-aug-3
To restore these snapshots within the Pod, copy these snapshots to the destination directory.
[root@nfscli /]# cp -rp /mnt/uepe/.snapshot/<snapshot_name>/* <destination_directory_name>
For example, to restore the “snapshot-aug-3” snapshot to the platform volume mount directory, follow these steps:
Clean up the existing platform volume mount directory.
[root@nfscli /]# rm -rf /mnt/uepe/*
Copy data from “snapshot-aug-3” to the platform volume mount directory.
Note!
Always double-check the file permissions when you copy or move these snapshots. The file permissions must be preserved when you copy or move the files.
[root@nfscli /]# cp -rp /mnt/uepe/.snapshot/snapshot-aug-3/* /mnt/uepe/
Unmount the file system once the restore has completed successfully.
[root@nfscli /]# umount /mnt/uepe/
Exit the Pod.
[root@nfscli /]# exit
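The clean-up and copy steps above delete data before restoring it, so a check that the snapshot exists before anything is removed can be worthwhile. The following is a sketch meant to be run inside the Pod while the file system is still mounted. The function name restore_snapshot is hypothetical. As with the commands above, the * pattern does not match hidden entries, so .snapshot itself is left untouched.

```shell
# Hypothetical helper: replace the content of a mounted file system
# with the content of one of its snapshots, preserving permissions.
# Usage: restore_snapshot <mount directory> <snapshot name>
restore_snapshot() {
  mount_dir=$1
  snapshot_dir="$mount_dir/.snapshot/$2"
  if [ ! -d "$snapshot_dir" ]; then
    echo "snapshot not found: $snapshot_dir" >&2
    return 1            # nothing has been deleted at this point
  fi
  # ${mount_dir:?} aborts if the variable is empty, so "/*" is never removed.
  rm -rf "${mount_dir:?}"/*
  cp -rp "$snapshot_dir"/* "$mount_dir"/
}

# Example: restore_snapshot /mnt/uepe snapshot-aug-3
```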
Rollback Usage Engine Private Edition to pre-upgrade version
To roll back to the pre-upgrade version, first check the history to see the revision numbers:
helm history uepe -n uepe
Roll back to the pre-upgrade version using revision <pre-upgrade-revision-number>:
helm rollback uepe <pre-upgrade-revision-number> -n uepe