4.7.5 Removing Runtime Data

After running a KPI Management system, you may have temporary runtime data at different processing stages in Kafka and in Spark. In some cases you may need to remove this data, for example:

  • To process KDR UDRs with timestamps that belong to a closed period, which would normally be discarded.
  • To perform multiple test runs that should generate identical results.
  • To immediately remove data from Kafka topics that cannot be processed, for example, after restarting the Spark cluster.


Note!

As a prerequisite, the scripts must have been prepared according to 4.2 Preparing and Creating Scripts for KPI Management.

The following steps are required to remove all existing runtime data:

  1. Stop the KPI Management workflows.

  2. Stop the Spark cluster.

    $ stop.sh
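
    To verify that the cluster has stopped, you can, for example, list any remaining Spark daemon JVMs (this assumes a JDK with jps on the path; the Spark standalone daemons show up as Master and Worker):

    $ jps | grep -iE 'master|worker'

    No output means that no master or worker processes are left.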
  3. Stop the kpi-app and clean out the checkpoint directory:

    $ rm -fr $SPARK_APP_CHECKPOINT_DIR
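
    Since rm -fr removes its target recursively without prompting, you may want to confirm that the environment variable points to the intended checkpoint directory before running the command, for example:

    $ echo $SPARK_APP_CHECKPOINT_DIR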
  4. Remove data stored by Kafka, Zookeeper, and Spark.

    Note!

    Calculated KPIs for the current periods will be lost.

    $ rm -rf $MZ_HOME/storage/kafka/*
    $ rm -rf $MZ_HOME/storage/zookeeper/* 
    $ rm -rf <checkpoint directory>/*

    The default checkpoint directory is mz_kpiapp/data/spark-checkpoint-dir/<spark application name>.
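
    To confirm that the data is gone, you can, for example, list the directories again; empty output means that they contain no files:

    $ ls -A $MZ_HOME/storage/kafka $MZ_HOME/storage/zookeeper <checkpoint directory>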


  5. Start Kafka and Zookeeper.

    $ bin/zookeeper-server-start.sh config/zookeeper.properties &
    $ bin/kafka-server-start.sh config/server.properties
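
    Once the broker is up, you can check that it is reachable, for example by listing the topics, which should be empty at this point. The bootstrap address below assumes a default single-broker setup on localhost:9092; on older Kafka versions you may have to use --zookeeper localhost:2181 instead of --bootstrap-server:

    $ bin/kafka-topics.sh --bootstrap-server localhost:9092 --list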
  6. Start Spark.

    $ start_master_workers.sh
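
    To confirm that the master is running, you can, for example, query the Spark master web UI, which listens on port 8080 by default (adjust the host and port if your scripts configure them differently):

    $ curl -sf http://localhost:8080 > /dev/null && echo "Spark master UI is up"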
  7. Recreate the Kafka topics that are specified in the service configuration. For further information, see 4.3.2 Spark, Kafka and Zookeeper.
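
    The topic names, partition counts, and replication factors must match your service configuration; as a sketch, a topic could be recreated like this, where kpi-input is only a placeholder name:

    $ bin/kafka-topics.sh --bootstrap-server localhost:9092 --create --topic kpi-input --partitions 1 --replication-factor 1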

  8. Run submit.sh kpiapp to submit the Spark application(s). See 4.7.2 Submitting a Spark Application.
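
    For example:

    $ submit.sh kpiapp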

  9. Start the KPI Management workflows.