...
Usage Engine Private Edition now supports batch scaling, which makes it possible to increase or decrease processing capacity as needed without manual intervention. As a general concept, batch scaling speeds up processing by splitting the workload between multiple “workers” or resources so that they complete tasks in parallel rather than sequentially. Usage Engine’s solution consists of three new agents: a Scalable File Collection agent, a Scalable InterWF Forwarder agent, and a Scalable InterWF Collector agent. A new profile has also been created, the Partition Profile. The feature also uses the existing Data Aggregator and Deduplication agents, which have been updated to include a Kafka storage profile. Kafka must be configured for all storage within your batch scaling solution. Add something here about recommended use cases as per the note above?
...
This section contains the following subsections:
...
...
...
Child pages (Children Display)
Info |
---|
From ChatGPT re: Topics - For draft purposes only: In comparison, a cache is a direct storage solution intended for fast access to data. Topics, on the other hand, are about managing and distributing data streams efficiently across systems. |
Info |
---|
From chat with Michal: How does the new solution differ from what users can configure now? The information on Automatic Scale Out and Rebalancing (4.3) is not related to batch scaling. It refers to Kafka doing some partitioning work based on what is configured in the Kafka agent. DR's new batch scaling solution does the partitioning work within the inter-WF agents. How does the new solution know when to scale? Is it based on the number of raw data files collected at any one time? Right now you have to manually configure your ECD to scale based on a known metric, i.e. if the number of data files is over 1000 |
...
then… Look at the example image from the doc: is it the File Collection workflow that creates the partitions? Not really; it is more the Scalable InterWF Forwarder agent or, as |
...
Michal says, any agent using the Partition Profile. Does it create the partitions based on the Max Scale Factor parameter? True, says Michal; this also sets the maximum number of parallel workflows. Where is the Max Scale Factor parameter located? In the Partition Profile configuration. Our example shows 3 workflows. Does there have to be exactly 3 workflows in a solution? Is there a minimum or maximum number of workflows needed to create a working solution? There is no minimum or maximum number of workflows required. Are there any prerequisites for configuring batch scaling using Kafka storage? Yes. Your workflow has to be designed in a way that can process batch workflows; for example, there has to be at least one common denominator in the data that links individual records. |
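
For draft purposes, the two points in the notes above (records need a common denominator, and Max Scale Factor caps the number of partitions) can be illustrated with a minimal Python sketch. This is not Usage Engine code and does not show how the Partition Profile works internally. The function names, the `subscriber_id` field, and the use of CRC32 hashing are all assumptions chosen only to show the general idea of key-based partitioning.

```python
import zlib
from collections import defaultdict


def partition_for(key: str, max_scale_factor: int) -> int:
    """Map a record key to a partition index in [0, max_scale_factor).

    Hypothetical helper: CRC32 is used here only because it is stable
    across runs; the real product may assign partitions differently.
    """
    if max_scale_factor < 1:
        raise ValueError("max_scale_factor must be at least 1")
    return zlib.crc32(key.encode("utf-8")) % max_scale_factor


def split_batch(records, key_field, max_scale_factor):
    """Group records by partition so each group can go to its own worker."""
    partitions = defaultdict(list)
    for record in records:
        index = partition_for(record[key_field], max_scale_factor)
        partitions[index].append(record)
    return dict(partitions)


# Illustrative data: "subscriber_id" plays the role of the common
# denominator that links individual records (an assumed field name).
records = [
    {"subscriber_id": "A", "bytes": 10},
    {"subscriber_id": "B", "bytes": 20},
    {"subscriber_id": "A", "bytes": 30},
    {"subscriber_id": "C", "bytes": 40},
    {"subscriber_id": "B", "bytes": 50},
]

partitions = split_batch(records, "subscriber_id", max_scale_factor=3)
for index, group in sorted(partitions.items()):
    print(index, [r["subscriber_id"] for r in group])
```

Because the partition index depends only on the key, all records for one subscriber end up in the same partition. Steps such as aggregation or deduplication can then run in parallel without needing records from another partition, and at most `max_scale_factor` partitions are ever produced.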