Introduction

Include an image of an example workflow.

Workflow design guide

  1. Always start with a Batch Scaling Collection Workflow that collects from the original file source and forwards UDRs to Kafka.

  2. The Batch Scaling Processing Workflows can be one or a series of workflows. Batch Duplication Check and Aggregation can be part of the same workflow. There can only be ONE Aggregation Agent and ONE Duplication Agent per workflow.

  3. Decide the maximum number of workflows that can execute in parallel, that is, how you can shard your data efficiently and distribute it evenly across groups. Then decide which identifier or sorting parameter the workflow should use to distribute the UDRs. This is typically a field based on a record group, ID, or number. If there is no such field, use APL to create and populate one (for instance, round-robin among shards). UI Parameters:

Parameter           Comment
ID Field            Defines how to match a UDR to a partition.
Max Scale Factor    Number of partitions, which is the same as the maximum number of workflows that can execute in parallel.

Note!

If any of the parameters needs to be changed, it is considered a new configuration, and you must start with empty topics.

You can use the existing data, but you must use the standard Kafka Agents and migrate the data. Or do we even want to mention this?
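The relationship between the ID Field and the Max Scale Factor can be illustrated with a minimal Python sketch. It is not the algorithm Usage Engine uses internally, which this page does not describe. The hash function, the function names, and the value 12 are assumptions chosen for illustration only.

```python
import zlib
from itertools import count

MAX_SCALE_FACTOR = 12  # hypothetical value: the number of partitions


def partition_for(id_value: str, max_scale_factor: int = MAX_SCALE_FACTOR) -> int:
    """Map an ID field value to a partition number using a stable hash."""
    # crc32 is deterministic across runs, unlike Python's built-in hash() for str.
    return zlib.crc32(id_value.encode("utf-8")) % max_scale_factor


def round_robin_ids(max_scale_factor: int = MAX_SCALE_FACTOR):
    """Yield 0, 1, ..., max_scale_factor - 1, 0, 1, ... for records with no natural ID."""
    for n in count():
        yield n % max_scale_factor


# Records with the same ID value always map to the same partition.
records = [{"account": "A-1001"}, {"account": "A-1002"}, {"account": "A-1001"}]
partitions = [partition_for(r["account"]) for r in records]
```

In this sketch, changing `MAX_SCALE_FACTOR` changes the partition that a given ID maps to, which is one way to see why a parameter change is treated as a new configuration. A round-robin value spreads records evenly but does not keep related records in the same partition.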

Scale out/in Design

When creating a scalable batch workflow in Usage Engine, it’s important to ensure that all agents with storage capabilities are configured to use Kafka storage. Additionally, scalable workflows require scalable Inter Workflow Collection and Forwarding Agents, as regular Inter Workflow Agents are not compatible. Mixing agents with different storage types, such as a Data Aggregator agent configured with Kafka storage and another with file storage, within the same workflow is not supported.

Creating a scalable solution (example)

These are the high-level steps for creating a scalable batch solution in Usage Engine. The following example solution is made up of several profiles, including the newly created Partition Profile (4.3) and Scalable Inter Workflow Profile (4.3), and two workflow types: Batch Scaling Collection and Batch Scaling Processing.

  1. Decide on your scaling factor. This is the maximum number of workflows that can effectively cooperate to process a batch. It is an important choice and is difficult to change once your workflows are in production.

Warning!
Try to pick a Max Scale Factor that is divisible by many other numbers, like 6 or 12. You need to ensure that it is high enough to handle the data coming in, but not so high that you will overload resources.

  2. Choose one or more fields in your UDRs that will be used to partition data. These fields may be based on a record group, such as a customer ID or an account number.

  3. Create a Kafka Profile pointing to your cluster.

  4. Create a Partition Profile where you define your Max Scale Factor and your partitioning fields.

  5. Create the Aggregation, Duplicate UDR, and Scalable Inter Workflow profiles and link the Partition Profile created in the previous step to each.

  6. Create your workflows.

    • Standard workflows - prepare data for scaling by sending it to the Scalable InterWF Forwarder.

    • Scalable processing workflows - collect data with a Scalable InterWF Collector.

Warning!
When creating a scalable workflow, you must add the Kafka profile in the execution tab of the workflow properties.

Note!
You can include multiple Aggregation and Duplicate UDR agents within the same workflow. These agents can either share the same Partition Profile or use different Aggregation and Duplicate UDR Profiles. For instance, you might use different profiles if you need to apply a different ID field as the Key in storage.
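The advice to pick a Max Scale Factor that is divisible by many numbers can be checked with a short Python sketch. It assumes that partitions are spread as evenly as possible over the running workflows. This page does not specify how Usage Engine assigns partitions, so the function names and the assignment rule are illustrative assumptions.

```python
def partitions_per_workflow(max_scale_factor: int, workflows: int) -> list[int]:
    """Spread partitions over workflows as evenly as possible (assumed rule)."""
    base, extra = divmod(max_scale_factor, workflows)
    # The first `extra` workflows take one additional partition each.
    return [base + 1 if i < extra else base for i in range(workflows)]


def balanced_workflow_counts(max_scale_factor: int) -> list[int]:
    """Workflow counts at which every workflow gets the same number of partitions."""
    return [n for n in range(1, max_scale_factor + 1) if max_scale_factor % n == 0]


print(balanced_workflow_counts(12))    # [1, 2, 3, 4, 6, 12]
print(balanced_workflow_counts(7))     # [1, 7]
print(partitions_per_workflow(12, 5))  # [3, 3, 2, 2, 2]
```

Under this assumption, a factor of 12 stays balanced at six different workflow counts, while a factor of 7 is balanced only with one or seven workflows.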

Scaling Batch Workflows

Usage Engine scales out and in and rebalances automatically. You can also schedule scalable batch workflows, and you can schedule when a scale-out (and/or scale-in) starts.

Deploying a scale-out configuration with ECDs:

Use https://infozone.atlassian.net/wiki/x/IgMkEg with Dynamic Workflows to define how to package a scale-out:

  1. Collection WF scales with 1 (one) extra WF per ECD.

  2. Processing WF scales with 3 (three) extra WFs per ECD.

  3. Or combine the above into the same ECD.
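The per-ECD figures listed above can be turned into a small capacity calculation. The following Python sketch covers the combined packaging, where each extra ECD adds one collection workflow and three processing workflows. The function name, the base counts, and the idea of capping the useful processing workflows at the Max Scale Factor are assumptions for illustration.

```python
COLLECTION_WFS_PER_ECD = 1  # per-ECD figure from the list above
PROCESSING_WFS_PER_ECD = 3  # per-ECD figure from the list above


def workflows_after_scale_out(
    extra_ecds: int,
    max_scale_factor: int,
    base_collection: int = 1,   # hypothetical starting count
    base_processing: int = 1,   # hypothetical starting count
) -> dict[str, int]:
    """Count workflows after adding ECDs that package both workflow types."""
    collection = base_collection + extra_ecds * COLLECTION_WFS_PER_ECD
    processing = base_processing + extra_ecds * PROCESSING_WFS_PER_ECD
    return {
        "collection": collection,
        "processing": processing,
        # Assumption: processing workflows beyond the number of partitions have no data to work on.
        "processing_in_use": min(processing, max_scale_factor),
    }


print(workflows_after_scale_out(extra_ecds=2, max_scale_factor=6))
# {'collection': 3, 'processing': 7, 'processing_in_use': 6}
```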

  1. Automatic; the system will scale automatically based on metrics.

  2. Manual; schedule the ECD and WF start up (or stop).

Automatic Scaling

  • Based on Metric.

  • Should also have some “duration” of the metric to avoid oscillating behavior?

Manual Scaling

You can start up ECDs manually. See the tabs on https://infozone.atlassian.net/wiki/x/VgQkEg for more information.
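The question raised under automatic scaling, whether a metric should hold for some duration before it triggers scaling, can be illustrated with a minimal Python sketch. It is not a description of how Usage Engine evaluates metrics. The class name, the threshold, and the sample count are hypothetical.

```python
from collections import deque


class SustainedThreshold:
    """Signal a scale-out only after the metric exceeds a threshold for N consecutive samples."""

    def __init__(self, threshold: float, required_samples: int):
        self.threshold = threshold
        # Keep only the most recent `required_samples` comparison results.
        self.recent = deque(maxlen=required_samples)

    def observe(self, value: float) -> bool:
        """Record one metric sample and report whether the condition is sustained."""
        self.recent.append(value > self.threshold)
        return len(self.recent) == self.recent.maxlen and all(self.recent)


trigger = SustainedThreshold(threshold=100, required_samples=3)
decisions = [trigger.observe(v) for v in [150, 90, 150, 150, 150]]
print(decisions)  # [False, False, False, False, True]
```

A single spike (the first sample) does not trigger scaling in this sketch. The decision turns true only once three consecutive samples exceed the threshold, which damps oscillation at the cost of a slower reaction.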