This section describes the standard procedure for upgrading Usage Engine on Microsoft Azure. Follow this procedure to ensure that configurations, persisted data (in the file system or database), and system properties are properly migrated. Workflows will be interrupted briefly when the ECDs restart. These instructions also describe how to control the timing of the restart of ECDs running non-scalable real-time workflows by enabling manual upgrade.

Preparations

Before doing anything to the running installation, prepare the config file for the new installation by following these steps:

1. Retrieve the values.yaml file that you have used previously or, if you want to start from scratch, extract it from the installation by running this command:

    helm -n <namespace> get all <helm name>

   E.g:

    helm -n uepe get all uepe

   Where uepe is the helm name you have selected for your installation. To look up the helm name, run helm list. You will see a list similar to the one below.

    helm list
    NAME           NAMESPACE  REVISION  UPDATED                                   STATUS    CHART                               APP VERSION
    external-dns   uepe       1         2024-05-08 15:27:48.258978207 +0200 CEST  deployed  external-dns-7.2.0                  0.14.1
    ingress-nginx  uepe       1         2024-05-08 16:18:43.919980224 +0200 CEST  deployed  ingress-nginx-4.10.0                1.10.0
    uepe           uepe       3         2024-05-10 14:16:17.724426589 +0200 CEST  deployed  usage-engine-private-edition-4.0.0  4.0.0

   Extract the values manually from the output. Copy the lines below "USER-SUPPLIED VALUES:" and stop at the blank line before "COMPUTED VALUES:". Save the copied content to the config file valuesFromSystem.yaml.

2. Update the helm repository to get the latest helm chart versions by running the following commands:

    helm repo list
    helm repo update

3. Retrieve the new version from the repository by running the following command. Refer to Release Information for the Helm Chart version.

    helm fetch <repo name>/usage-engine-private-edition --version <version> --untar

   For example:

    helm fetch digitalroute/usage-engine-private-edition --version 4.0.0 --untar

4. Next, check the file CHANGELOG.md inside the created folder to find out what may have changed in the values file in the new version.

   If you are uncertain about how to interpret the content of the file, see below for some examples of keys and how to interpret them:

    The following values have been removed:
    * ```mzOperator.clusterWide```
    * ```mzOperator.experimental.performPeriodicWorkflowCleanup```
    * ```jmx.remote```
    * ```platform.debug.jmx```

   means that in the values file they should be entered as:

    mzOperator:
      clusterWide:
      experimental:
        performPeriodicWorkflowCleanup:
    jmx:
      remote:
    platform:
      debug:
        jmx:

   Each part of the key does not necessarily follow directly after the previous one, but it always appears before any other "parent" on the same level. So in this example of a values.yaml file:

    debug:
      script:
        enabled: false
      log:
        level:
          codeserver: info
          jetty: 'off'
          others: warn

   an example of a key could be debug.log.level.jetty.

5. Make any necessary updates, based on changed fields you may be using, in the valuesFromSystem.yaml file you got from the existing installation so that it matches the new version.

6. Take note of any fields that have been deprecated or removed since the last version so that any configuration of those fields can be replaced.

Note!
Before proceeding with the upgrade, make sure that:
* you are logged in and have access to the container registry.
* you have a valid image pull secret that allows the Kubernetes cluster to pull the container images from the container registry.

7. Update the Image Pull Secret (if needed) in the valuesFromSystem.yaml file.

8. Update the License Key for the upgrade version in the valuesFromSystem.yaml file.

9. When you have updated the valuesFromSystem.yaml file, you can test it by running this command:

    helm upgrade --install uepe digitalroute/usage-engine-private-edition --atomic --cleanup-on-fail --version 4.0.0 -n uepe -f valuesFromSystem.yaml --dry-run=server
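The manual copy of the user-supplied values from the helm get all output in step 1 can also be scripted. The following is a minimal sketch, not part of the official procedure: the function name extract_user_values is hypothetical, and it assumes that the "USER-SUPPLIED VALUES:" and "COMPUTED VALUES:" markers each start a line in the output.

```shell
#!/bin/sh
# Sketch: print only the user-supplied values from `helm get all` output
# read on stdin. extract_user_values is a hypothetical helper name.
# Assumption: both section markers appear at the start of a line.
extract_user_values() {
  awk '
    /^USER-SUPPLIED VALUES:/ { copying = 1; next }  # start after the header
    /^COMPUTED VALUES:/      { exit }               # stop before computed values
    !copying                 { next }               # skip everything before the header
    /^[[:space:]]*$/         { pending++; next }    # hold blank lines back
    {
      # Emit held blank lines only when more values follow, so the
      # blank line before the next section is dropped.
      while (pending > 0) { print ""; pending-- }
      print
    }
  '
}

# Hypothetical usage against the example installation (release "uepe"):
#   helm -n uepe get all uepe | extract_user_values > valuesFromSystem.yaml
```

Helm also provides the helm get values command, which prints the user-supplied values of a release directly. Compare its output with the extracted file before relying on either.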
Before you start the actual upgrade, these steps are recommended to avoid processing issues caused by the restarts during the upgrade:

1. Disable any batch workflow groups and let any running batch workflows finish their runs.
2. For real-time workflows, check which types of real-time workflows the ECs are running.

If an ECD hosts workflows that allow for scaling and use an ingress for incoming traffic, the ECD will, by default, be upgraded through a rolling upgrade, which means that there will always be at least one workflow running even during the upgrade.

However, if the real-time workflow does not support scaling, for example, because it uses fixed ports or storage that is not shared, the EC will become unavailable for a certain time during the upgrade. To gain control over when the EC becomes unavailable, you can edit the ECD and set manualUpgrade to true before the upgrade. With this setting, the ECD keeps running on the old version until the upgrade has been performed, and it can then be restarted on the new version in the EC Deployment Interface (4.3).

Example - Editing ECD to Manual Upgrade

Option 1

Run the following commands:

    kubectl get ecd -n <namespace>
    kubectl edit ecd <ecd-name> -n <namespace>

And change manualUpgrade to true:

    spec:
      .....
      manualUpgrade: true

Option 2

Run the following command:

    kubectl patch ecdeployment <ecd-name> -n <namespace> --type=merge -p $'spec:\n manualUpgrade: true'

When the upgrade is completed, the ECDs can be upgraded by editing the ECD in Desktop Online.
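When several ECDs need the manualUpgrade setting, the patch command from Option 2 can be wrapped in a loop. This is a sketch under assumptions: the function name set_manual_upgrade and the KUBECTL override variable are hypothetical, and the patch is written in JSON form, which is intended to be equivalent to the YAML patch used in Option 2.

```shell
#!/bin/sh
# Sketch: set manualUpgrade to true on several ECDs in one go.
# KUBECTL is a hypothetical override; setting KUBECTL=echo previews the
# commands instead of running them.
KUBECTL="${KUBECTL:-kubectl}"

# set_manual_upgrade <namespace> <ecd-name>...
set_manual_upgrade() {
  ns="$1"
  shift
  for ecd in "$@"; do
    # Same merge patch as Option 2, expressed as JSON.
    "$KUBECTL" patch ecdeployment "$ecd" -n "$ns" --type=merge \
      -p '{"spec":{"manualUpgrade":true}}' || return 1
  done
}

# Hypothetical usage (the ECD names are placeholders):
#   set_manual_upgrade uepe my-ecd-1 my-ecd-2
```

Running it once with KUBECTL=echo prints the commands that would be executed, which is a cheap way to check the ECD names before touching the cluster.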
Backup and Database Upgrade

When all the running batch workflows have stopped, make a backup so that the system can be restored in case of any issues during the upgrade.

Note!
Before proceeding with the backup, you must shut down the platform. This is very important, since otherwise the backup of the database may become corrupt.

The platform can be shut down in various ways, see the examples below.

Examples - Shutting Down the Platform

Option 1

Reduce the number of replicas (under "spec") to 0 by running the following command:

    kubectl edit statefulset platform -n uepe

where uepe is the namespace used.

Option 2

Run this command:

    kubectl scale --replicas=0 sts/platform -n uepe

and then this command:

    kubectl get pods -n uepe

and ensure that the pod platform-0 is no longer present.
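Instead of checking the pod list by hand, Option 2 can be combined with kubectl wait so that the script blocks until platform-0 is gone. This is a sketch, not the documented procedure: the function name stop_platform, the KUBECTL override, and the 300s default timeout are assumptions.

```shell
#!/bin/sh
# Sketch: scale the platform down and wait for its pod to be deleted.
# KUBECTL is a hypothetical override (KUBECTL=echo previews the commands).
KUBECTL="${KUBECTL:-kubectl}"

# stop_platform <namespace> [timeout]
stop_platform() {
  ns="$1"
  timeout="${2:-300s}"   # assumed default, adjust to your environment
  "$KUBECTL" scale --replicas=0 sts/platform -n "$ns" || return 1
  # Blocks until pod platform-0 has been deleted or the timeout expires.
  "$KUBECTL" wait --for=delete pod/platform-0 -n "$ns" --timeout="$timeout"
}

# Hypothetical usage:
#   stop_platform uepe && echo "platform is down, safe to back up"
```

Depending on the kubectl version, the wait command may report an error if the pod is already gone when it starts, so a final kubectl get pods check is still worthwhile.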
Note!
The instructions for backup and upgrade of the database below are only relevant if you are using Azure Database for PostgreSQL - Flexible Server as platform database. If the platform database used is Derby, the backup of the Azure Files Storage covers the database as well (assuming persistent storage of the platform is enabled).
For database backup, please refer to https://learn.microsoft.com/en-us/azure/backup/backup-azure-database-postgresql-flex for guidance.
The next step is to do a backup of the file system used.
Note!
If there are standalone ECs that are still running and writing their logs to the same file storage, anything they write after the backup has been initiated will not be included in the backup.
To create an Azure File share backup, see https://learn.microsoft.com/en-us/azure/backup/backup-azure-files?tabs=backup-center or https://learn.microsoft.com/en-us/azure/backup/backup-afs-cli for instructions.
The section below contains an example of how to create a backup vault, followed by enabling an Azure File share backup protection and performing an on-demand backup via the command line.
export RESOURCE_GROUP=PT_Stratus
export LOCATION="Southeast Asia"
export STORAGE_ACCOUNT_NAME=uepeaks
export STORAGE_ACCOUNT_KEY=$(az storage account keys list --account-name "$STORAGE_ACCOUNT_NAME" --query "[0].value" | tr -d '"')
export STORAGE_ACCOUNT_ID=$(az storage account show --resource-group "$RESOURCE_GROUP" --name "$STORAGE_ACCOUNT_NAME" --query "id" | tr -d '"')
export SUBSCRIPTION_ID=$(az account subscription list --query "[0].subscriptionId" | tr -d '"')
export FILE_SHARE=$(az storage share list --account-name "$STORAGE_ACCOUNT_NAME" --account-key "$STORAGE_ACCOUNT_KEY" --query "[0].name" | tr -d '"')
export FILE_BACKUP_VAULT=azurefilesvault
export FILE_BACKUP_POLICY=FileBackupPolicy

# Create new file backup vault (LOCATION is quoted because it contains a space)
az backup vault create --resource-group "$RESOURCE_GROUP" --name "$FILE_BACKUP_VAULT" --location "$LOCATION" --output table
az backup vault list --query "[].{Name:name}"

# Create new file backup policy
# References:
# https://learn.microsoft.com/en-us/azure/backup/manage-afs-backup-cli#create-policy
# https://learn.microsoft.com/en-us/azure/templates/microsoft.recoveryservices/vaults/backuppolicies?pivots=deployment-language-bicep#property-values
cat <<-EOF > "$FILE_BACKUP_POLICY.json"
{
  "eTag": null,
  "id": "/Subscriptions/$SUBSCRIPTION_ID/resourceGroups/$RESOURCE_GROUP/providers/Microsoft.RecoveryServices/vaults/$FILE_BACKUP_VAULT/backupPolicies/$FILE_BACKUP_POLICY",
  "location": null,
  "name": "$FILE_BACKUP_POLICY",
  "properties": {
    "backupManagementType": "AzureStorage",
    "protectedItemsCount": 0,
    "retentionPolicy": {
      "dailySchedule": {
        "retentionDuration": {
          "count": 30,
          "durationType": "Days"
        },
        "retentionTimes": [
          "2024-07-19T03:00:00+00:00"
        ]
      },
      "monthlySchedule": null,
      "retentionPolicyType": "LongTermRetentionPolicy",
      "weeklySchedule": null,
      "yearlySchedule": null
    },
    "schedulePolicy": {
      "schedulePolicyType": "SimpleSchedulePolicy",
      "scheduleRunDays": null,
      "scheduleRunFrequency": "Daily",
      "scheduleRunTimes": [
        "2024-07-19T03:00:00+00:00"
      ],
      "scheduleWeeklyFrequency": 0
    },
    "timeZone": "UTC",
    "workLoadType": "AzureFileShare"
  },
  "resourceGroup": "$RESOURCE_GROUP",
  "tags": null,
  "type": "Microsoft.RecoveryServices/vaults/backupPolicies"
}
EOF
az backup policy list --resource-group "$RESOURCE_GROUP" --vault-name "$FILE_BACKUP_VAULT" --query "[].{Name:name}"
az backup policy create --policy "$FILE_BACKUP_POLICY.json" --resource-group "$RESOURCE_GROUP" --vault-name "$FILE_BACKUP_VAULT" --name "$FILE_BACKUP_POLICY" --backup-management-type AzureStorage
az backup policy show --resource-group "$RESOURCE_GROUP" --vault-name "$FILE_BACKUP_VAULT" --name "$FILE_BACKUP_POLICY"

# Enable Azure File share backup protection
az backup protection enable-for-azurefileshare --vault-name "$FILE_BACKUP_VAULT" --resource-group "$RESOURCE_GROUP" --policy-name "$FILE_BACKUP_POLICY" --storage-account "$STORAGE_ACCOUNT_NAME" --azure-file-share "$FILE_SHARE" --output table
# Command output as below:
# Name                                  ResourceGroup
# ------------------------------------  ---------------
# 2b85d01d-9a27-4a5a-aa9d-cbdad082cac2  PT_Stratus

# Track job status (use the job name from the output above)
az backup job show --name 2b85d01d-9a27-4a5a-aa9d-cbdad082cac2 --resource-group "$RESOURCE_GROUP" --vault-name "$FILE_BACKUP_VAULT"

# Retrieve container registered to the Recovery services vault and export as env variable
export CONTAINER_NAME=$(az backup container list --resource-group "$RESOURCE_GROUP" --vault-name "$FILE_BACKUP_VAULT" --backup-management-type AzureStorage --query "[0].name" | tr -d '"')

# Retrieve backed up item and export as env variable
export ITEM_NAME=$(az backup item list --resource-group "$RESOURCE_GROUP" --vault-name "$FILE_BACKUP_VAULT" --query "[0].name" | tr -d '"')

# Perform on-demand backup
az backup protection backup-now --vault-name "$FILE_BACKUP_VAULT" --resource-group "$RESOURCE_GROUP" --container-name "$CONTAINER_NAME" --item-name "$ITEM_NAME" --retain-until 20-01-2025 --output table
# Command output as below:
# Name                                  Operation    Status      Item Name               Backup Management Type    Start Time UTC                    Duration
# ------------------------------------  -----------  ----------  ----------------------  ------------------------  --------------------------------  --------------
# 23300e34-b1e0-409c-804e-c247d4587f8f  Backup       InProgress  uepe-aks-storage-share  AzureStorage              2024-07-19T11:01:07.436164+00:00  0:00:02.178697
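The on-demand backup above starts with status InProgress, so the upgrade should not begin until the job has finished. The sketch below polls a status command until the job leaves the InProgress state. The helper names wait_for_job and job_status are hypothetical, and the query path properties.status and the success value "Completed" are assumptions about the az backup job show output that should be verified against your CLI version.

```shell
#!/bin/sh
# Sketch: poll a status command until it reports something other than
# "InProgress". Succeeds only if the final status is "Completed".
# wait_for_job <interval-seconds> <max-attempts> <status-command> [args...]
wait_for_job() {
  interval="$1"
  attempts="$2"
  shift 2
  i=0
  while [ "$i" -lt "$attempts" ]; do
    status=$("$@") || return 2        # the status command itself failed
    if [ "$status" != "InProgress" ]; then
      printf '%s\n' "$status"         # report the final status
      [ "$status" = "Completed" ]     # assumed success value
      return
    fi
    i=$((i + 1))
    sleep "$interval"
  done
  echo "gave up waiting for the job" >&2
  return 1
}

# Hypothetical status command built on the variables exported above.
# Assumption: the job status is available under properties.status.
job_status() {
  az backup job show --name "$1" --resource-group "$RESOURCE_GROUP" \
    --vault-name "$FILE_BACKUP_VAULT" --query "properties.status" --output tsv
}

# Hypothetical usage, checking every 15 seconds for up to 10 minutes:
#   wait_for_job 15 40 job_status 23300e34-b1e0-409c-804e-c247d4587f8f
```

Keeping the polling loop separate from the az call makes it possible to reuse the same loop for the restore job during a rollback.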
Upgrade

To perform the actual upgrade, use the same command as the test command described earlier, minus the --dry-run=server flag, for example like this:

    helm upgrade --install uepe digitalroute/usage-engine-private-edition --atomic --cleanup-on-fail --version 4.0.0 -n uepe -f valuesFromSystem.yaml

If the upgrade was successful, the output will look like this:

    Release "uepe" has been upgraded. Happy Helming!
    NAME: uepe
    LAST DEPLOYED: Fri May 10 15:02:37 2024
    NAMESPACE: uepe
    STATUS: deployed
    REVISION: 4
    TEST SUITE: None
    NOTES:
    Usage Engine Private Edition 4.0.0 has been deployed successfully!
    Check out the CHANGELOG.md in this chart for information about what has been changed, added and removed in this version.

Scale up the platform stateful set again, so that the platform starts back up, using the following command:

    kubectl scale --replicas=1 sts/platform -n uepe
After Upgrade

When the Usage Engine installation has been upgraded, ensure that any ECDs supporting rolling upgrade are still running as expected. If there are ECDs that were configured for manual upgrade before the upgrade, see the section below. Also make sure to enable any batch workflow groups again so that batch processing can resume.

Manual Upgrade of ECDs

If you configured any ECDs for manual upgrade before the upgrade, follow these steps to upgrade these ECDs when the regular upgrade is completed:

1. Log in to Desktop Online, see Desktop Online User Interface (4.3).
2. Go to the EC Deployment Interface (4.3) in the Manage view in Desktop Online. You will see a warning symbol next to the relevant ECDs.
3. Click on the ECD(s) to view the warnings. For each ECD that needs to be upgraded, you will see a message saying that it needs to be upgraded.
4. Go back to the list of ECDs, click on the three dots to the far right in the ECD row, and select the Upgrade option in the pop-up menu.
Rollback

The rollback procedure should only be carried out if you want to roll back to the previous version. A rollback consists of the following steps:

1. Restore database backup
2. Restore file system snapshot
3. Rollback Usage Engine Private Edition to pre-upgrade version
Restore Database Backup
You can restore a database backup into Azure Blob Storage and use the native PostgreSQL tool pg_restore to restore the data to a new PostgreSQL flexible server database, see https://learn.microsoft.com/en-us/azure/backup/restore-azure-database-postgresql-flex for detailed steps.
Note!
The restored PostgreSQL flexible server is a new database instance and is not managed by Terraform. If you plan to destroy the cluster later, ensure that the new database instance is deleted first.
Restore File System Snapshot
To restore an Azure File share, follow the instructions from https://learn.microsoft.com/en-us/azure/backup/restore-afs?tabs=full-share-recovery or https://learn.microsoft.com/en-us/azure/backup/restore-afs-cli.
The section below contains an example of how to restore an Azure File backup using the command line. In this example the backup is restored to the existing File share. If you wish to restore to a new File share instance, you need to adjust accordingly.
export RESOURCE_GROUP=PT_Stratus
export LOCATION="Southeast Asia"
export STORAGE_ACCOUNT_NAME=uepeaks
export STORAGE_ACCOUNT_KEY=$(az storage account keys list --account-name "$STORAGE_ACCOUNT_NAME" --query "[0].value" | tr -d '"')
export STORAGE_ACCOUNT_ID=$(az storage account show --resource-group "$RESOURCE_GROUP" --name "$STORAGE_ACCOUNT_NAME" --query "id" | tr -d '"')
export SUBSCRIPTION_ID=$(az account subscription list --query "[0].subscriptionId" | tr -d '"')
export FILE_SHARE=$(az storage share list --account-name "$STORAGE_ACCOUNT_NAME" --account-key "$STORAGE_ACCOUNT_KEY" --query "[0].name" | tr -d '"')
export FILE_BACKUP_VAULT=azurefilesvault
export FILE_BACKUP_POLICY=FileBackupPolicy
export CONTAINER_NAME=$(az backup container list --resource-group "$RESOURCE_GROUP" --vault-name "$FILE_BACKUP_VAULT" --backup-management-type AzureStorage --query "[0].name" | tr -d '"')
export ITEM_NAME=$(az backup item list --resource-group "$RESOURCE_GROUP" --vault-name "$FILE_BACKUP_VAULT" --query "[0].name" | tr -d '"')

# Fetch recovery points
az backup recoverypoint list --vault-name "$FILE_BACKUP_VAULT" --resource-group "$RESOURCE_GROUP" --container-name "$CONTAINER_NAME" --backup-management-type azurestorage --item-name "$ITEM_NAME" --workload-type azurefileshare --output table
# Command output as below:
# Name            Time                       Consistency
# --------------  -------------------------  --------------------
# 68988215529834  2024-07-19T11:01:09+00:00  FileSystemConsistent

# Full restore snapshot to existing file share (use a recovery point name from the output above)
az backup restore restore-azurefileshare --vault-name "$FILE_BACKUP_VAULT" --resource-group "$RESOURCE_GROUP" --rp-name 68988215529834 --container-name "$CONTAINER_NAME" --item-name "$ITEM_NAME" --restore-mode originallocation --resolve-conflict overwrite --output table

# Track job status (use the job name printed by the restore command)
az backup job show --name 249c1bbb-da9f-4b3b-b612-f9917ea2cecd --resource-group "$RESOURCE_GROUP" --vault-name "$FILE_BACKUP_VAULT"
Rollback Usage Engine Private Edition to pre-upgrade version

To roll back to the pre-upgrade version, check the history to see the revision numbers:

    helm history uepe -n uepe

Roll back to the pre-upgrade version with revision <pre-upgrade-revision-number>:

    helm rollback uepe <pre-upgrade-revision-number> -n uepe
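Picking the revision number out of the history can be scripted when the upgrade was the most recent operation on the release. The sketch below prints the revision on the second-to-last row of the helm history table. The function name previous_revision is hypothetical, and the approach assumes that the first column of each data row is the revision number and that no other revisions (for example an automatic rollback triggered by --atomic) were created after the upgrade.

```shell
#!/bin/sh
# Sketch: read `helm history` table output on stdin and print the revision
# that precedes the latest one. previous_revision is a hypothetical helper.
# Assumption: the latest revision is the upgrade you want to undo.
previous_revision() {
  awk '
    $1 ~ /^[0-9]+$/ { prev = last; last = $1 }   # remember the last two revisions
    END {
      if (prev != "") print prev                 # second-to-last revision
      else exit 1                                # fewer than two revisions found
    }
  '
}

# Hypothetical usage with the example release "uepe":
#   rev=$(helm history uepe -n uepe | previous_revision) &&
#     helm rollback uepe "$rev" -n uepe
```

Always compare the printed number with the full helm history output before rolling back, since the assumption about the latest revision does not hold after a failed or repeated upgrade.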