Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

...

Due to the restriction on modifying rows in the streaming buffer, the GCP Bigquery agent does not modify the Data table at commit and rollback stage. Instead, the agent adopts a related design utilizing a Transaction ID, unique for each batch.

  1. The Data table must have a Transaction ID column.

  2. The Batch Status table must be created with a Transaction ID column and a Status column.

  3. At commit and rollback stage, the Batch Status is updated with a status code reflecting the current stage that can be used for auditing.

  4. Consumers of the loaded data are expected to always access that data through a view.

This view should join the two tables on Transaction ID where status = 0, for example:

Info
title

Example - View Joining Data Table and Batch Status Table

The following DDL query is used in the BigQuery Query Editor to create a view under the user_analytics Dataset with the table named view1.

Code Block
CREATE VIEW IF NOT EXISTS user_analytics.view1 AS
SELECT * FROM user_analytics.data_tbl1 AS t1
FULL JOIN user_analytics.batch_status_tbl1 AS t2
USING (id)  WHERE t2.status = 0;


...