...
The Duplicate UDR cache is partitioned into containers keyed by the indexing field. The key from the indexing field is used to locate the correct container in the cache; if an entry with the same checksum is found in that container, the UDR is classified as a duplicate.
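The container lookup can be sketched as follows. This is a minimal illustration of the described behavior, not the product's actual implementation; the class and field names, and the use of MD5 as the checksum, are assumptions for the example:

```python
import hashlib
from collections import OrderedDict

class DuplicateCache:
    """Minimal sketch: a duplicate cache partitioned into containers."""

    def __init__(self):
        # One container (a set of checksums) per indexing-field key.
        self.containers = OrderedDict()

    def is_duplicate(self, index_key, udr_payload):
        # The key from the indexing field locates the right container.
        container = self.containers.setdefault(index_key, set())
        checksum = hashlib.md5(udr_payload).hexdigest()
        if checksum in container:
            return True          # same checksum in this container: duplicate
        container.add(checksum)  # first occurrence: remember the checksum
        return False

cache = DuplicateCache()
cache.is_duplicate("2024-01-01", b"record-1")  # first time: not a duplicate
cache.is_duplicate("2024-01-01", b"record-1")  # same container, same checksum: duplicate
```

Note that the lookup is scoped to one container: the same payload under a different indexing key lands in a different container and is not flagged.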
...
Note!
If a previously processed file is encountered again, all of its UDRs are treated as duplicates. However, if the cache has reached its configured capacity when the file is about to be reprocessed, the system starts pruning, beginning with the UDRs in the oldest container. This frees space in the cache so that new UDRs can be added, and the UDRs corresponding to the cleared entries are treated as non-duplicates. If the file contains a large number of UDRs, inserting all of them into ECS may be time-consuming.
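The pruning behavior described in the note can be sketched as follows. The capacity semantics (a total checksum count, oldest container evicted first) are assumptions for the example, not the agent's actual configuration model:

```python
from collections import OrderedDict

class PruningDuplicateCache:
    """Sketch: containers pruned oldest-first when capacity is reached."""

    def __init__(self, capacity):
        self.capacity = capacity         # max checksums held in total (assumed)
        self.size = 0
        self.containers = OrderedDict()  # insertion order approximates age

    def add(self, index_key, checksum):
        """Return True if the checksum was already cached (duplicate)."""
        container = self.containers.setdefault(index_key, set())
        if checksum in container:
            return True
        # Cache full: prune whole containers, starting with the oldest,
        # until there is room for the new entry.
        while self.size >= self.capacity and self.containers:
            oldest_key, oldest = self.containers.popitem(last=False)
            self.size -= len(oldest)
            if oldest_key == index_key:
                # Our own container was pruned; recreate it.
                container = self.containers.setdefault(index_key, set())
        container.add(checksum)
        self.size += 1
        return False
```

Once a container is pruned, its entries are gone, so re-encountering those UDRs yields non-duplicates again, which mirrors the behavior described in the note.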
Having a Duplicate Batch agent
...
before the Duplicate UDR agent will only make the problem worse. The Duplicate Batch agent does not detect that the batch is a duplicate until the end of the batch. At that point, all UDRs have already passed the Duplicate UDR agent and have been inserted into ECS as duplicates. Since the Duplicate Batch agent flags the batch as a duplicate, the batch is removed from the stream, forcing the Duplicate UDR agent to remove all of its UDRs from ECS as well.
Prerequisites
The reader of this information should be familiar with:
UDR structure and content
Subsections
This section contains the following subsections: