
Use this guide when designing your batch scaling solution. Note that you cannot mix standard agents with scaling agents in the same workflow: workflows with standard agents save their state in Usage Engine, whereas workflows with batch scaling agents save their state in Kafka.

  1. Always start with a Batch Scaling Collection Workflow that collects from the original file source and forwards UDRs to Kafka.

  2. The Batch Scaling Processing Workflows can be one or a series of workflows. Batch Duplication Check and Aggregation can be part of the same workflow. There can only be ONE Aggregation Agent and ONE Duplication Agent per workflow.

  3. Decide the maximum number of workflows that can execute in parallel, i.e. how you can efficiently shard your data (distribute it evenly) into different groups. Then decide which identifier / sorting parameter the workflow should use to distribute the UDRs. This is typically a field based on record group / ID / number etc. If there is no such field, use APL to create and populate one (for instance, round-robin among the shards).
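The sharding decision in step 3 can be sketched as follows. This is an illustration in Python, not APL, and not how Usage Engine maps UDRs to partitions internally: the names `partition_for` and `make_round_robin`, the value of `MAX_SCALE_FACTOR`, and the use of a CRC32 hash are all assumptions made for the example.

```python
import zlib
from itertools import count

MAX_SCALE_FACTOR = 4  # hypothetical number of partitions


def partition_for(id_value: str, partitions: int = MAX_SCALE_FACTOR) -> int:
    """Map an ID field value to a partition using a stable hash.

    CRC32 is used because Python's built-in hash() for strings is
    randomized per process and so is not stable across runs.
    """
    return zlib.crc32(id_value.encode("utf-8")) % partitions


def make_round_robin(partitions: int = MAX_SCALE_FACTOR):
    """Fallback when no natural ID exists: return a function that hands
    out shard numbers 0, 1, ..., partitions - 1 in a repeating cycle."""
    counter = count()

    def next_shard() -> int:
        return next(counter) % partitions

    return next_shard
```

A stable hash keeps all UDRs with the same identifier in the same shard, which matters when a later workflow aggregates or checks duplicates per identifier. Round-robin gives the most even spread but carries no such grouping.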

UI Parameters

| Parameter | Comment |
|---|---|
| ID Field | Defines how to match a UDR to a partition. |
| Max Scale Factor | Number of partitions, which is the same as the maximum number of workflows that can execute in parallel. |

Note!

If any of these parameters needs to be changed, the result is considered a new configuration, and the workflows must start with empty topics.

You can keep using the existing data, but you must then migrate it using the standard Kafka agents.

Scaling design

Private Edition (PE) scales out and in, and rebalances, automatically. You can also schedule a scale-out (and a scale-in).

  1. “Packaging” a scale-out configuration:
    Use the regular ECD (Execution Context Deployment) definition with Dynamic Workflows to define how a scale-out is packaged. For instance:

    1. A Collection Workflow scales with 1 extra Workflow per ECD.

    2. A Processing Workflow scales with 3 extra Workflows per ECD.

    3. Or combine the above into the same ECD.

  2. Scheduling a scale-out configuration:

    1. Automatic; the system will scale automatically based on metrics.

    2. Manual; schedule the ECD and Workflow start up (or stop).
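The packaging example above (1 extra Collection Workflow and 3 extra Processing Workflows per ECD) implies some simple arithmetic. The sketch below is a planning aid only: `EcdPackage` and `workflows_added` are hypothetical names, and capping the processing workflows at Max Scale Factor is an assumption drawn from the parameter description rather than a documented rule.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EcdPackage:
    """How many extra workflows one ECD adds (values from the example above)."""
    collection_per_ecd: int = 1
    processing_per_ecd: int = 3


def workflows_added(ecds: int, package: EcdPackage, max_scale_factor: int) -> dict:
    """Count the workflows added by `ecds` ECDs.

    Assumption: at most `max_scale_factor` processing workflows can run
    in parallel, since that is the number of partitions.
    """
    collection = ecds * package.collection_per_ecd
    processing = ecds * package.processing_per_ecd
    return {
        "collection": collection,
        "processing": processing,
        "processing_in_parallel": min(processing, max_scale_factor),
    }
```

For example, under these assumptions three ECDs with a Max Scale Factor of 8 add 9 processing workflows, of which only 8 can execute in parallel.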

Automatic Scaling

  • Based on Metric.

  • Should also have some “duration” of the metric to avoid oscillating behavior?

Manual Scaling

  • You can start up ECDs manually.

  • We have a way of scheduling ECDs as well.
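The “duration” idea raised under Automatic Scaling can be illustrated with a sustained-threshold check: a scale-out is triggered only after the metric has stayed above a threshold for a minimum time. This is a sketch of the general technique, not a description of how the system behaves. The class name `SustainedThreshold` and its parameters are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class SustainedThreshold:
    """Signal a scale-out only when the metric stays above `threshold`
    for at least `hold_seconds` without interruption."""
    threshold: float
    hold_seconds: float
    _above_since: Optional[float] = field(default=None, init=False)

    def should_scale_out(self, value: float, now: float) -> bool:
        if value <= self.threshold:
            # Metric dropped back: forget the streak so brief spikes are ignored.
            self._above_since = None
            return False
        if self._above_since is None:
            # First sample above the threshold starts the streak.
            self._above_since = now
        return now - self._above_since >= self.hold_seconds
```

With `threshold=100` and `hold_seconds=60`, a metric that spikes to 150 for 30 seconds and then drops does not trigger a scale-out. Only an uninterrupted 60 seconds above 100 does.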
