Definition
...
Transaction Safety Overview
The Usage Engine service defines a transaction as a unit of data, such as a file, being processed by a stream. A transaction is complete when the file is processed by the stream without errors. Individual transactions are limited to the specified data source. Transactions are separated into two categories:
...
The data correction feature creates transactions by itself, and they are regarded as separate processes.
...
The behavior of transaction safety depends on the types of functions used in the stream, and it is supported only in functions that have a state.
...
If the stream is restarted, a rollback is triggered to clean up incomplete transactions. Execution then resumes after the last successfully processed transaction in the stream. For example, consider a stream that is processing 10 files. If the first 3 files are processed successfully and an error occurs while processing the 4th file, the stream is aborted. Because progress is saved, Transaction Safety ensures that when the stream is resumed, processing continues after the last successful transaction, that is, from the 4th file.
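The 10-file scenario can be illustrated with a small checkpointing sketch. This is not how Usage Engine is implemented; the names `run_stream`, `checkpoint`, and `flaky_process` are hypothetical and exist only to show the idea of committing progress after each successful transaction and resuming from the first uncommitted one.

```python
def run_stream(files, process, checkpoint):
    """Process files in order, skipping those committed in an earlier run."""
    for index, name in enumerate(files):
        if index < checkpoint["next_index"]:
            continue  # already processed successfully before the restart
        process(name)  # may raise; in that case nothing is committed for this file
        checkpoint["next_index"] = index + 1  # commit only after success
    return checkpoint


files = [f"file_{i}.csv" for i in range(1, 11)]
checkpoint = {"next_index": 0}  # hypothetical saved progress
processed = []


def flaky_process(name):
    # Simulate an error on the 4th file during the first run only.
    if name == "file_4.csv" and not flaky_process.fixed:
        raise RuntimeError("simulated failure")
    processed.append(name)


flaky_process.fixed = False

try:
    run_stream(files, flaky_process, checkpoint)
except RuntimeError:
    pass  # the stream is aborted; files 1 to 3 remain committed

index_after_failure = checkpoint["next_index"]

# Restart: processing continues with the 4th file, not the 1st.
flaky_process.fixed = True
run_stream(files, flaky_process, checkpoint)
```

After the restart, each of the 10 files appears in `processed` exactly once, because the first three are skipped rather than reprocessed.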
The temporary state created during an ongoing transaction will be persisted for up to 40 days until it expires and is deleted from storage. This means that a stream needs to be restarted after a failed execution within 40 days in order to recover.
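As a rough illustration of the 40-day window, the check below decides whether a restart could still recover. It is a sketch under assumptions: the function name `can_recover` is hypothetical, the window is assumed to be measured from the time the temporary state was created, and the boundary is assumed to be inclusive.

```python
from datetime import datetime, timedelta, timezone

STATE_RETENTION = timedelta(days=40)  # retention period stated in the text


def can_recover(state_created_at, now):
    """Return True if the temporary state is assumed to still exist."""
    # Assumption: the state expires once it is older than the retention period.
    return now - state_created_at <= STATE_RETENTION


created = datetime(2024, 1, 1, tzinfo=timezone.utc)
within_window = can_recover(created, created + timedelta(days=39))
after_window = can_recover(created, created + timedelta(days=41))
```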
Transaction safety comes in three types: At-most-once, At-least-once, and Exactly-once.
Usage Engine typically uses transaction safety of the type Exactly-once to ensure that the data or file is processed only once during the execution of the stream. However, some functions are designed to include duplicates and therefore act like transaction safety of the type At-least-once.
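The difference between At-least-once and Exactly-once can be sketched with a batch that is delivered twice, for example after a retry. This is a generic illustration, not Usage Engine code: the function names and the `id` field are assumptions, and tracking already-seen identifiers is only one common way to obtain an exactly-once result.

```python
def deliver_at_least_once(records, sink):
    """Append every delivered record; redelivery produces duplicates."""
    for record in records:
        sink.append(record)


def deliver_exactly_once(records, sink, seen_ids):
    """Append a record only the first time its id is seen."""
    for record in records:
        if record["id"] in seen_ids:
            continue  # duplicate from a redelivery, skip it
        seen_ids.add(record["id"])
        sink.append(record)


batch = [{"id": 1, "value": "a"}, {"id": 2, "value": "b"}]

at_least_once_sink = []
exactly_once_sink = []
seen = set()

# Deliver the same batch twice to simulate a retry after a failure.
for _ in range(2):
    deliver_at_least_once(batch, at_least_once_sink)
    deliver_exactly_once(batch, exactly_once_sink, seen)
```

With the retry, the At-least-once sink holds four records and the Exactly-once sink holds two.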
...
Note: This type is not used in Usage Engine.
At-least-once
The result is generated but duplicate results are possible due to multiple deliveries. The following functions use this method:
Exactly-once
The result is generated only once. No duplicates are produced.
...
- Amazon S3 collector
- SFTP collector
- Count
- Amazon S3 forwarder
- SFTP forwarder
- Interconnect collector
- Interconnect forwarder
- Data Aggregator
- Deduplicate
- Data Correction (routed from the Validate function only)
Transactions using Multiple Collectors
Streams using multiple collectors are handled in a way that ensures transaction safety. The collectors are handled one at a time, in an order determined by Usage Engine. Each collector joins a queue once it is ready to read its input.
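A minimal sketch of handling collectors in turn follows. The actual order is determined by Usage Engine and is not documented here, so first-in, first-out order is an assumption, as are the names `run_collectors` and `collect` and the reading that each collector places itself in the queue.

```python
from collections import deque


def run_collectors(ready_order, collect):
    """Handle collectors one at a time in the order they became ready."""
    queue = deque()
    for name in ready_order:
        queue.append(name)  # a collector joins the queue when it is ready to read

    results = []
    while queue:
        name = queue.popleft()  # assumption: first ready, first handled
        results.append((name, collect(name)))  # only one collector runs at a time
    return results


def collect(name):
    # Hypothetical stand-in for reading the collector's input.
    return f"data from {name}"


results = run_collectors(["sftp", "s3", "interconnect"], collect)
```

Handling one collector at a time means each collector's transaction can be completed or rolled back without interleaving with another collector's input.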
...