4.7.5 Removing Runtime Data
After running a KPI Management system, you may have temporary runtime data at different processing stages in Kafka and in Spark. In some cases you may want to remove this data, for example:
- To process KDR UDRs with timestamps that belong to a closed period. These will normally be discarded.
- To perform multiple test runs that should generate identical results.
- To immediately remove data from Kafka topics that cannot be processed, for example, after restarting the Spark cluster.
Note!
As a prerequisite, the scripts must be prepared according to 4.2 Preparing and Creating Scripts for KPI Management.
The following steps are required to remove all existing runtime data:
- Stop the KPI Management workflows.
- Stop the Spark cluster:
$ stop.sh
- Stop the kpi-app and clean the checkpoint directory:
$ rm -fr $SPARK_APP_CHECKPOINT_DIR
- Remove data stored by Kafka, Zookeeper, and Spark:
Note!
Calculated KPIs for the current periods will be lost.
$ rm -rf $MZ_HOME/storage/kafka/*
$ rm -rf $MZ_HOME/storage/zookeeper/*
$ rm -rf <checkpoint directory>/*
The default checkpoint directory is mz_kpiapp/data/spark-checkpoint-dir/<spark application name>.
- Start Kafka and Zookeeper:
$ bin/zookeeper-server-start.sh config/zookeeper.properties &
$ bin/kafka-server-start.sh config/server.properties
- Start Spark:
$ start_master_workers.sh
- Recreate the Kafka topics that are specified in the service configuration. For further information, see 4.3.2 Spark, kafka and zookeeper.
- Run submit.sh kpiapp to submit the Spark application(s). See 4.7.2 Submitting a Spark Application.
- Start the KPI Management workflows.
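
The rm -rf commands in the removal step expand to paths under the file system root if MZ_HOME happens to be unset or empty. The following is a minimal sketch of a guard for that case. The function clear_dir_contents is a hypothetical helper introduced here for illustration; it is not part of the KPI Management scripts, and the paths passed to it are assumed to match your installation.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with the product): empties a
# subdirectory below a base directory, refusing unsafe arguments.
clear_dir_contents() {
    base="$1"   # base directory, for example the value of MZ_HOME
    sub="$2"    # path relative to the base, for example storage/kafka

    # An empty base would turn "$base/$sub" into a path under "/",
    # which is what happens when MZ_HOME is unset.
    if [ -z "$base" ] || [ "$base" = "/" ] || [ -z "$sub" ]; then
        echo "refusing to clear: base='$base' sub='$sub'" >&2
        return 1
    fi

    target="$base/$sub"
    if [ ! -d "$target" ]; then
        echo "nothing to clear, no such directory: $target" >&2
        return 0
    fi

    # Same effect as the documented "rm -rf <dir>/*": entries inside
    # the directory are removed, the directory itself is kept.
    rm -rf "$target"/*
}
```

With this helper, clear_dir_contents "$MZ_HOME" storage/kafka and clear_dir_contents "$MZ_HOME" storage/zookeeper could stand in for the first two rm commands. The helper returns a non-zero status instead of deleting anything when the base directory is missing from the environment.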