Kafka Batch Collection Agent Configuration

To open the Kafka collection agent configuration dialog from a workflow configuration, click Build → New Configuration and select Workflow in the Configurations dialog. When prompted to Select workflow type, select Batch. Click Add agent and select Kafka from the Collection tab in the Agent Selection dialog. Then double-click the agent in the workflow template.

Kafka batch collection agent configuration

Setting

Description

Consumer Group

The name of the consumer group, as defined by the group.id property.

Batch Size

The number of messages to collect per batch. Note that the workflow remains in the running state even after all data has been consumed. You must trigger a stop yourself, using mzcli, the operations REST interface, or the workflow configuration, for example with a MIM such as Estimated Lag or Workflow Throughput.

The Batch Size must be greater than 0.

Note!

For performance reasons, it is important to set a reasonable batch size. A batch size that is too low has a negative impact on performance.

Assignment

Messages can be collected from one or several topics. Define how the topics are identified by selecting one of the following options:

Topic Pattern
Enter a regular expression that matches the names of the topics you want to collect data from. For the regular expression syntax, see the Java Pattern documentation: https://docs.oracle.com/en/java/javase/15/docs/api/java.base/java/util/regex/Pattern.html

Topic Names
Select this option to display a list and an Add button. Add one or several topic names that you want to collect data from. The exact names must be entered. Regular expressions cannot be used.

Topic Partitions
Select this option to display a list and an Add button. Add one or several topic names and partitions that you want to collect data from. The exact names must be entered. Regular expressions cannot be used.

Below are a few examples of valid partition declarations:

Example of collection from partition 0:

Partitions: 0

Example of collection from the three partitions 0, 8 and 12:

Partitions: 0,8,12

Example of collection from the six partitions 0, 3, 4, 5, 6, and 7:

Partitions: 0,3-7
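
The partition notation in the examples above can be read as a comma-separated list of single partitions and inclusive ranges. The following Python sketch shows one way to expand such a declaration into individual partition numbers. It only illustrates the notation and is not the agent's actual parser; the function name expand_partitions is hypothetical, and the error handling for malformed input is an assumption.

```python
def expand_partitions(declaration: str) -> list:
    """Expand a declaration such as "0,3-7" into a sorted list of partition numbers.

    Illustrative only: how the agent itself validates input is not documented here.
    """
    partitions = set()
    for part in declaration.split(","):
        part = part.strip()
        if not part:
            raise ValueError(f"empty entry in {declaration!r}")
        if "-" in part:
            # A range such as "3-7" includes both end points.
            first, last = (int(p) for p in part.split("-", 1))
            if first > last:
                raise ValueError(f"range {part!r} is reversed")
            partitions.update(range(first, last + 1))
        else:
            partitions.add(int(part))
    return sorted(partitions)


print(expand_partitions("0"))       # [0]
print(expand_partitions("0,8,12"))  # [0, 8, 12]
print(expand_partitions("0,3-7"))   # [0, 3, 4, 5, 6, 7]
```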

Note!

If you select Topic Partitions, automatic rebalancing will not take place, and you will have to handle potential rebalancing manually if needed.
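
Before entering a regular expression for the Topic Pattern option, it can help to check which topic names the expression would select. The sketch below does this in Python for illustration. The topic names are made up, the helper matching_topics is hypothetical, and it assumes that the expression must match the whole topic name. The agent itself expects Java regular expression syntax; the simple constructs used here behave the same way in both languages, but more advanced constructs may differ.

```python
import re


def matching_topics(pattern: str, topic_names):
    """Return the topic names that the pattern matches in full (assumed behavior)."""
    compiled = re.compile(pattern)
    return [name for name in topic_names if compiled.fullmatch(name)]


# Hypothetical topic names, used only for this example.
topics = ["billing.eu", "billing.us", "billing.us.retry", "audit"]

print(matching_topics(r"billing\.(eu|us)", topics))
# ['billing.eu', 'billing.us']
print(matching_topics(r"billing\..*", topics))
# ['billing.eu', 'billing.us', 'billing.us.retry']
```

Note that the dot is escaped in both expressions. An unescaped dot matches any character, which can select more topics than intended.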

Note!

You must configure a Kafka Profile in the Execution tab in Workflow Properties for the workflow configuration to be valid.
