This section describes the Duplicate UDR Detection profile and agent. The agent is a processing agent for batch workflow configurations. It provides duplication control on incoming UDRs: each new UDR is compared with the UDRs that are already stored to determine whether it is a duplicate.

...

Duplication comparison is not based on the content of a complete UDR but on the content of the fields selected by the user. When a UDR arrives, the agent calculates two values:

  • Key from indexing field
  • Checksum based on all the fields to check and the indexing field

The key from the indexing field is used to find the right "container" in the cache. If an entry with the same checksum is found in that container, the UDR is classified as a duplicate.
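The lookup described above can be illustrated with a minimal sketch. This is not the agent's implementation: the class name `DuplicateCache`, the field names, the use of the raw indexing field value as the key, and the choice of SHA-256 as the checksum are all assumptions made for illustration only.

```python
import hashlib
from collections import defaultdict


class DuplicateCache:
    """Hypothetical model of the key/checksum lookup (not the product API)."""

    def __init__(self, index_field, check_fields):
        self.index_field = index_field
        self.check_fields = list(check_fields)
        # One "container" per key; each container holds the checksums seen so far.
        self.containers = defaultdict(set)

    def _key(self, udr):
        # Assumption: the key is simply the value of the indexing field.
        return udr[self.index_field]

    def _checksum(self, udr):
        # Checksum over all the fields to check plus the indexing field.
        names = self.check_fields + [self.index_field]
        payload = "\x1f".join(f"{name}={udr[name]}" for name in names)
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def is_duplicate(self, udr):
        container = self.containers[self._key(udr)]
        checksum = self._checksum(udr)
        if checksum in container:
            return True
        container.add(checksum)  # remember the UDR for later comparisons
        return False


cache = DuplicateCache("timestamp", ["a_number", "b_number"])
first = {"timestamp": 1700000000, "a_number": "111", "b_number": "222", "duration": 30}
retry = dict(first, duration=45)      # differs only in a field that is not checked
other = dict(first, b_number="333")   # differs in a checked field
results = [cache.is_duplicate(udr) for udr in (first, retry, other)]
```

In this sketch `retry` is reported as a duplicate even though its `duration` differs, because only the selected fields and the indexing field contribute to the checksum, while `other` lands in the same container but has a different checksum.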

Note!

If the same file is reprocessed, all of its UDRs will be considered duplicates, unless the cache is full. In that case, a part of the cache will be cleared and the corresponding number of UDRs will be considered non-duplicates. If the file contains a considerable number of UDRs, inserting all of them in ECS may be time-consuming.

Placing a Duplicate Batch agent before the Duplicate UDR Detection agent will only make the problem worse. The Duplicate Batch agent does not detect that the batch is a duplicate until the end of the batch. At that point, all UDRs have already passed the Duplicate UDR Detection agent and have been inserted, as duplicates, into ECS. When the Duplicate Batch agent then flags the batch as a duplicate, the batch is removed from the stream, which forces the Duplicate UDR Detection agent to remove all of those UDRs from ECS again.

Prerequisites

The reader of this information should be familiar with:

  • UDR structure and content

The section contains the following subsections:

Child pages (Children Display)