Take into account the following behaviors when using the Aggregation profile:
...
The contents of the buttons in the menu bar may change depending on which configuration type has been opened in the currently active tab. The Aggregation profile uses the standard buttons that are visible for all configurations, and these are described in Build View.
Session Tab
In the Session tab you can browse and select a Session UDR Type and configure the Storage selection settings.
...
Setting | Description | ||
---|---|---|---|
Session UDR Type | Click on the Browse... button and select the Session UDR Type. The Session UDR is defined in Ultra. For further information, see Session UDR Type. | ||
Storage | Select the type of storage for aggregation sessions. The available settings are File Storage, Couchbase Storage, Redis Storage, Elasticsearch Storage, SQL Storage, and Memory Only. File Storage and Memory Only can be used in batch and real-time workflows. Elasticsearch Storage and SQL Storage can only be used in batch workflows. Couchbase Storage and Redis Storage can only be used in real-time workflows. These storage types allow highly available systems with geographic redundancy. The session data that is replicated within the storage is available across workflows, EC Groups, and systems. This serves to minimize data loss in failover scenarios.
|
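The workflow-type restrictions described for the Storage setting can be summarized as a lookup table. The following Python sketch is only an illustration of those rules. The names `STORAGE_WORKFLOW_TYPES`, `storage_allowed`, and `storages_for` are hypothetical and are not part of the product.

```python
# Compatibility between storage types and workflow types, as stated in the
# Storage setting description. All names in this sketch are hypothetical.
STORAGE_WORKFLOW_TYPES = {
    "File Storage": {"batch", "real-time"},
    "Memory Only": {"batch", "real-time"},
    "Elasticsearch Storage": {"batch"},
    "SQL Storage": {"batch"},
    "Couchbase Storage": {"real-time"},
    "Redis Storage": {"real-time"},
}


def storage_allowed(storage, workflow_type):
    """Return True if the storage type may be used in the given workflow type."""
    return workflow_type in STORAGE_WORKFLOW_TYPES[storage]


def storages_for(workflow_type):
    """Return the storage types usable in a workflow type, sorted by name."""
    return sorted(
        name
        for name, kinds in STORAGE_WORKFLOW_TYPES.items()
        if workflow_type in kinds
    )
```

For example, `storage_allowed("Redis Storage", "batch")` evaluates to `False`, because Redis Storage is limited to real-time workflows.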
...
The Storage tab contains settings that are specific for File Storage, Couchbase Storage, Redis Storage, Elasticsearch Storage, and SQL Storage.
File Storage
...
Setting | Description | ||||||||
---|---|---|---|---|---|---|---|---|---|
Storage Host | Select a Storage Host from the drop-down list. For storage of aggregation sessions, select either a specific EC Group or Automatic. If you select Automatic, the EC Group used by the running workflow is applied. Alternatively, if the Aggregation Session Inspector is used, a storage host is selected automatically. For further information, see Aggregation Session Inspector.
| ||||||||
Directory | Enter the directory on the Storage Host where you want the aggregation data to be stored.
If this field is greyed out with a stated directory, it means that the directory path has been hard-coded using the
| ||||||||
Partial File Count | Enter the maximum number of partial files that you want to store. Consider the following: Startup: All the files are read at startup, which takes longer if there are many partial files. Transaction commitment: When transactions are committed, many small files (a large Partial File Count) increase performance. In a batch workflow, use this variable to tune performance.
| ||||||||
Max Cached Sessions | Enter the maximum number of sessions to keep in the memory cache. This is a performance-tuning parameter that determines the memory usage of the Aggregation agent. Set this value to be low enough so that there is still enough space for the cache in memory, but not too low, as this will cause performance to deteriorate. For further information see the section below, Performance Tuning with File Storage. | ||||||||
Enable Separate Storage Per Workflow | This option enables each workflow to have a separate session storage. Multiple workflows are allowed to run simultaneously using the same Aggregation profile. If this checkbox is selected, a workflow will never see a session from another workflow. |
...
Setting | Description | ||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Profile | Select an SQL profile. This profile is used to access the storage for aggregation sessions.
| ||||||||||||||||
Index Fields | Click the Add button to select the UDR type. | ||||||||||||||||
Table SQL Script | This text box will generate the SQL statements for the selected UDRs' table schema and indexes for Id, TxId. The schema will be generated based on the number of UDRs in the UDR Type Mapping table.
|
...
Elasticsearch Storage
...
For Elasticsearch Storage, you can modify the properties shown above in the Advanced tab.
SQL Storage
...
For SQL Storage, you can modify the properties shown above in the Advanced tab.
Note!
When using Couchbase or Redis aggregation storage, it is important to take note of the concept of locking mechanisms when configuring workflows. Locking mechanisms are of two types: Pessimistic and Optimistic.
Redis aggregation storage only has an Optimistic lock whereas Couchbase aggregation storage has both Optimistic and Pessimistic locks.
Pessimistic Lock
When a workflow thread is working on a session, it is considered to be fully locked. No other thread can work on that particular session. Once the first thread is finished, the lock is released and another thread can take the lock and work on the session.
Optimistic Lock
Instead of acquiring a traditional lock for a session, a workflow thread obtains a CAS (Compare And Swap) value for that session. The CAS serves as a kind of hash or fingerprint of the session data. When the consume block is done and the session is ready to be updated, an error occurs if the CAS no longer matches. When multiple threads have made updates to the same session, only the changes from the first thread to complete its work are accepted. Any other thread attempting to update fails and must restart its work from the beginning. This ensures that only changes from one workflow at a time can be committed, which is the same guarantee that pessimistic locking gives.
There is one key distinction: because of this retry mechanism, the consume and sessionInit blocks may be invoked multiple times. It is therefore advisable to avoid using global variables within the aggregation APL. The udrRoute function, however, can be used safely within these blocks, since it is executed only when the Optimistic lock succeeds. If global variables are necessary, they can be moved to an analysis agent and updated through the udrRoute function.
It is important to note that the threads referred to in the lock descriptions above may run in multiple processes on multiple machines.
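The read, work, compare, and retry cycle of the Optimistic lock can be sketched in a few lines of Python. This is a minimal in-memory illustration under stated assumptions, not the product's implementation. `SessionStore`, `CasMismatch`, and `update_session` are hypothetical names, and the CAS is modeled as a simple version counter rather than a hash of the session data.

```python
import threading
from dataclasses import dataclass


class CasMismatch(Exception):
    """Raised when the stored CAS no longer matches the one that was read."""


@dataclass
class Entry:
    data: dict
    cas: int = 0


class SessionStore:
    """Toy in-memory stand-in for a CAS-capable session storage (hypothetical)."""

    def __init__(self):
        self._entries = {}
        self._guard = threading.Lock()  # protects the dict itself, not sessions

    def read(self, key):
        """Return a copy of the session data together with its current CAS."""
        with self._guard:
            entry = self._entries.setdefault(key, Entry(data={}))
            return dict(entry.data), entry.cas

    def write(self, key, data, expected_cas):
        """Store data only if nobody else has written since expected_cas was read."""
        with self._guard:
            entry = self._entries[key]
            if entry.cas != expected_cas:
                raise CasMismatch(key)
            entry.data = dict(data)
            entry.cas += 1


def update_session(store, key, consume, max_attempts=10):
    """Run consume(session) until the CAS-checked write succeeds.

    Returns the number of attempts, that is, how many times consume ran.
    """
    for attempt in range(1, max_attempts + 1):
        session, cas = store.read(key)
        consume(session)  # may run several times, so avoid external side effects
        try:
            store.write(key, session, cas)
            return attempt
        except CasMismatch:
            continue  # another writer committed first: restart from the beginning
    raise RuntimeError("gave up after %d attempts" % max_attempts)


store = SessionStore()
calls = []


def consume(session):
    calls.append(1)
    if len(calls) == 1:
        # Simulate a competing thread that commits during our first attempt.
        other, other_cas = store.read("s1")
        other["count"] = other.get("count", 0) + 1
        store.write("s1", other, other_cas)
    session["count"] = session.get("count", 0) + 1


attempts = update_session(store, "s1", consume)
```

In this run the first attempt is rejected because the simulated competitor changed the CAS, so `consume` executes twice before the write is accepted. The `calls` list stands in for a global variable: it records two invocations for one committed update, which is why state outside the session is unreliable inside blocks that can be retried.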