To stay competitive, businesses need to make decisions efficiently and quickly. To do so, they must process vast amounts of data that are crucial for making the right decisions at the right time. Producing relevant results requires collecting large volumes of high-quality, accurate data, and this is where data aggregation comes into the picture. Data aggregation is the process of collecting and summarizing data for statistical analysis.
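As a minimal illustration of the idea, the sketch below groups a handful of daily values by month and reduces each group to a sum; the record layout and values are invented purely for this example.

```python
from collections import defaultdict

# Hypothetical daily records: (date string, value) pairs.
daily_records = [
    ("2023-01-05", 2),
    ("2023-01-19", 1),
    ("2023-02-02", 4),
    ("2023-02-23", 3),
]

# Aggregate: group by month and summarize each group with a sum.
monthly_totals = defaultdict(int)
for day, value in daily_records:
    month = day[:7]             # "YYYY-MM" key
    monthly_totals[month] += value

print(dict(monthly_totals))     # {'2023-01': 3, '2023-02': 7}
```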
In this example, we build a simple stream that summarizes the number of games a user downloads each month and writes the result to a log; a plain-Python sketch of the equivalent logic follows the function list below.
Functions used in this stream and their purpose:
Count - Counter that triggers the stream to run as many times as specified in the configuration.
Simulate Data (Script) - Simulates data as daily records of the number of games downloaded by a user. This step stands in for a real data input.
Data Aggregator - Summarizes the number of games downloaded per month.
Monthly Record (Field Organizer) - Filters the data down to the fields required for business processing.
Log - Writes the received summary event to a log. This step stands in for sending the data to billing.
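The stream itself is assembled from the functions listed above, but the following plain-Python sketch outlines roughly what the pipeline does end to end. It is not the tool's actual implementation: the user ID, date range, field names, and download counts are assumptions made for illustration, and each helper simply mirrors one step of the stream.

```python
import logging
import random
from collections import defaultdict
from datetime import date, timedelta

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("monthly-downloads")

def simulate_data(days=60):
    """Simulate Data (Script): one record per day with the number of
    games a user downloaded; stands in for a real data input."""
    start = date(2023, 1, 1)
    for offset in range(days):
        day = start + timedelta(days=offset)
        yield {"user": "user-1", "date": day.isoformat(),
               "downloads": random.randint(0, 5)}

def aggregate_monthly(records):
    """Data Aggregator: summarize the number of downloads per month."""
    totals = defaultdict(int)
    for record in records:
        month = record["date"][:7]          # "YYYY-MM"
        totals[(record["user"], month)] += record["downloads"]
    return totals

def organize_fields(totals):
    """Monthly Record (Field Organizer): keep only the fields needed
    for downstream processing."""
    for (user, month), downloads in totals.items():
        yield {"user": user, "month": month, "downloads": downloads}

# Count: run the pipeline once here; the stream would repeat it as
# many times as its configuration specifies.
for summary in organize_fields(aggregate_monthly(simulate_data())):
    # Log: write the summary event; stands in for sending data to billing.
    log.info("monthly summary: %s", summary)
```

Running the sketch prints one summary event per simulated month, which is the same shape of output the Log step would receive at the end of the stream.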