Duplicate UDR SQL Storage Setup Guide
This page is a general guide to considerations that may affect the performance of your Duplicate UDR SQL storage setup. Setups vary, so treat these as recommendations rather than strict rules that apply to every Duplicate UDR SQL storage setup.
File vs SQL Storage
SQL storage generally incurs some performance overhead compared to File storage, but depending on your needs, its benefits may make that overhead acceptable. This guide offers several recommendations that you can apply to minimize the performance overhead of your SQL storage setup.
Managing Latency
It is recommended that you deploy your database server in the same network as the EC server where workflows with Duplicate UDR agents will be running.
If the EC and the database server cannot reside in the same network, minimize the network overhead between the two servers as much as possible.
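To judge whether network latency is a concern, it can help to time a trivial query from the EC host. The following is a minimal sketch, not part of the product: the helper name `measure_round_trip` is hypothetical, and the in-memory SQLite connection is only a stand-in so that the example runs anywhere. Replace it with a connection to your actual database server.

```python
import sqlite3
import statistics
import time


def measure_round_trip(run_query, samples=20):
    """Time repeated calls to run_query and return (median_ms, max_ms)."""
    if samples < 1:
        raise ValueError("samples must be at least 1")
    timings_ms = []
    for _ in range(samples):
        start = time.perf_counter()
        run_query()
        timings_ms.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(timings_ms), max(timings_ms)


if __name__ == "__main__":
    # Stand-in connection (assumption): swap in a connection to your
    # real database server to measure the actual network round trip.
    conn = sqlite3.connect(":memory:")
    median_ms, max_ms = measure_round_trip(
        lambda: conn.execute("SELECT 1").fetchall()
    )
    print(f"median {median_ms:.3f} ms, max {max_ms:.3f} ms")
    conn.close()
```

Running the same measurement from a host inside and outside the database network gives a rough comparison of the overhead added per query.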
Optimizing Read And Write Operations
Performance may be better when the "Enable Separate Storage Per Workflow" option is disabled.
SAP Hana
In our performance tests of SAP Hana SQL storage specifically, column-based tables showed strong advantages over row-based tables in both performance and disk space footprint.
We encountered unexpected issues during our tests when the machine hosting SAP Hana had too little memory. We experienced minimal issues when the SAP Hana database was installed on a machine with at least 64 GB of memory.
Cache Size and Segmentation
The Indexing Field of incoming UDRs should preferably be either an increasing sequence number or a timestamp with good locality to preserve performance of workflow executions.
For large cache sizes, it may be a good idea to split the cache across multiple workflows in order to preserve performance.
For large cache sizes, splitting batches of incoming UDRs into multiple transactions may improve performance of workflow runs.
Minimizing the cache size may improve the performance of the duplicate checks by the Duplicate UDR agents and the speed of Search and Delete operations using the Duplicate UDR Inspector.
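The idea of splitting a batch of incoming UDRs into multiple transactions can be sketched as follows. This is only an illustration of the general technique under stated assumptions: the table `udr_keys`, its column `udr_key`, and the helpers `chunked` and `insert_in_transactions` are hypothetical and do not reflect the actual Duplicate UDR schema or agent behavior. SQLite is used as a stand-in database so that the example is self-contained.

```python
import sqlite3
from itertools import islice


def chunked(records, chunk_size):
    """Yield successive lists of at most chunk_size records."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    iterator = iter(records)
    while True:
        chunk = list(islice(iterator, chunk_size))
        if not chunk:
            return
        yield chunk


def insert_in_transactions(conn, keys, chunk_size):
    """Insert keys, committing once per chunk; return the commit count."""
    commits = 0
    for chunk in chunked(keys, chunk_size):
        # Hypothetical table and column names, for illustration only.
        conn.executemany(
            "INSERT INTO udr_keys (udr_key) VALUES (?)",
            [(key,) for key in chunk],
        )
        conn.commit()  # one transaction per chunk instead of one per batch
        commits += 1
    return commits


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE udr_keys (udr_key TEXT PRIMARY KEY)")
    batch = [f"udr-{n}" for n in range(10)]
    commits = insert_in_transactions(conn, batch, chunk_size=4)
    print(f"{len(batch)} keys written in {commits} transactions")
    conn.close()
```

The chunk size is a tuning parameter: smaller transactions hold locks for less time but add commit overhead, so a suitable value is best found by measuring against your own setup.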