Step-by-Step Instructions
Configure the service model.
The service model describes your data, which KPIs (Key Performance Indicators) to generate, and how to calculate them. The model is described in JSON and includes the following top-level objects:
- dimension
- tree
- metric
- kpi
- threshold (optional)
Start with the dimension and tree objects. The dimensions describe the fields of your data that are used for grouping, and the tree describes the relation between them. The identifying fields in the input data are region and country. A region has one or more countries. The data type is sales. In the dimension object, we specify each of our identifying fields as separate objects, with the data type and field in the body.
Code Block "dimension": { "Region": { "sales": "region" }, "Country": { "sales": "country" } }, "tree": { "tree1": { "Region": { "Country": { } } } }
- Define the metrics using the amount field in the input data:
  - totalSales - For total sales, sum up the amount for each record by using the sum function on the expression expr, which contains the amount field.
  - avgSales - For average sales, use the avg function instead of sum.
  - numSales - To count the number of records, use the function isSet in the expression. This function evaluates to 1 if there is a value in amount, or 0 if there is no value. Use the sum function to sum up the 1s and 0s.
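The corresponding metric object (it also appears in the full model further down) looks like this:
Code Block
"metric": {
    "totalSales": {
        "fun": "sum",
        "expr": { "sales": "amount" }
    },
    "avgSales": {
        "fun": "avg",
        "expr": { "sales": "amount" }
    },
    "numSales": {
        "fun": "sum",
        "expr": { "sales": "isSet(amount)" }
    }
}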
- Define the KPIs. The expected output is the total sales, average sales, and number of sales per region and country in 60-second periods. Use the node property to describe where in the topology the KPI should be calculated, and windowSize to set the period length. Use the names of the metrics defined above in the expr property.
Code Block title KPI
"kpi": {
    "Region.TotalSales": {
        "node": [ "tree1", "Region" ],
        "windowSize": 60,
        "expr": "totalSales"
    },
    "Region.AvgSales": {
        "node": [ "tree1", "Region" ],
        "windowSize": 60,
        "expr": "avgSales"
    },
    "Region.NumberOfSales": {
        "node": [ "tree1", "Region" ],
        "windowSize": 60,
        "expr": "numSales"
    },
    "Country.TotalSales": {
        "node": [ "tree1", "Region", "Country" ],
        "windowSize": 60,
        "expr": "totalSales"
    },
    "Country.AvgSales": {
        "node": [ "tree1", "Region", "Country" ],
        "windowSize": 60,
        "expr": "avgSales"
    },
    "Country.NumberOfSales": {
        "node": [ "tree1", "Region", "Country" ],
        "windowSize": 60,
        "expr": "numSales"
    }
}
Combine all the objects above for a complete representation of the model. Below is an example containing all types.
Code Block title Full model
{
    "dimension": {
        "Region": {
            "sales": "region"
        },
        "Country": {
            "sales": "country"
        }
    },
    "tree": {
        "tree1": {
            "Region": {
                "Country": {
                }
            }
        }
    },
    "metric": {
        "totalSales": {
            "fun": "sum",
            "expr": { "sales": "amount" }
        },
        "avgSales": {
            "fun": "avg",
            "expr": { "sales": "amount" }
        },
        "numSales": {
            "fun": "sum",
            "expr": { "sales": "isSet(amount)" }
        }
    },
    "kpi": {
        "Region.TotalSales": {
            "node": [ "tree1", "Region" ],
            "windowSize": 60,
            "expr": "totalSales"
        },
        "Region.AvgSales": {
            "node": [ "tree1", "Region" ],
            "windowSize": 60,
            "expr": "avgSales"
        },
        "Region.NumberOfSales": {
            "node": [ "tree1", "Region" ],
            "windowSize": 60,
            "expr": "numSales"
        },
        "Country.TotalSales": {
            "node": [ "tree1", "Region", "Country" ],
            "windowSize": 60,
            "expr": "totalSales"
        },
        "Country.AvgSales": {
            "node": [ "tree1", "Region", "Country" ],
            "windowSize": 60,
            "expr": "avgSales"
        },
        "Country.NumberOfSales": {
            "node": [ "tree1", "Region", "Country" ],
            "windowSize": 60,
            "expr": "numSales"
        }
    }
}
Open the Desktop and paste the service model into a KPI profile. Save the profile with the name SalesModel in the folder kpisales.
Configure Kafka and Zookeeper.
KPI Management reads its data from and writes its data to Kafka. For this to work, you need to install and configure both Kafka and Zookeeper. More information can be found on the page 4.3.2 Spark, kafka and zookeeper as well as /wiki/spaces/MD83/pages/5966832. Kafka depends on Zookeeper (which is included in the Kafka installation folder), and you need to ensure that Zookeeper is started first.
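For reference, with the default configuration files shipped in the Kafka distribution, Kafka locates Zookeeper through the zookeeper.connect property. The snippet below shows the relevant default settings; your values may differ:
Code Block title Default Kafka/Zookeeper connection settings
# config/zookeeper.properties - port that Zookeeper listens on
clientPort=2181
# config/server.properties - address Kafka uses to reach Zookeeper
zookeeper.connect=localhost:2181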
- Install and configure Spark. The Spark cluster will be running a so-called "app" that performs the KPI calculations. First you need to install Spark for Scala (spark-3.5.0-bin-hadoop3-scala2.13). More information can be found in the Spark documentation, https://spark.apache.org/docs/3.5.0/. For further information about Spark-related properties, see 4.3.2 Spark, kafka and zookeeper. Note that spark-defaults.conf in the Spark installation needs to contain the parameters mentioned in 4.2 Preparing and Creating Scripts for KPI Management for this to work.
The Spark slave node will have one worker that is assigned four cores. The cores are split between the executors and the Spark driver, which means there will be three executors running in parallel. The property SPARK_DEFAULT_PARALLELISM in kpi_params.sh is set to match this value. The property MZ_KPI_PROFILE_NAME needs to match the folder and configuration name of the KPI profile that was created in step 1.
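As an illustration only (the exact contents and syntax of kpi_params.sh may differ in your installation), the two properties discussed above could be set along these lines, assuming the kpisales/SalesModel profile from step 1 and the three executors described above:
Code Block title kpi_params.sh (sketch)
# Number of Spark executors reading in parallel; must match the number of
# partitions on the Kafka topics created below.
SPARK_DEFAULT_PARALLELISM=3
# Folder and configuration name of the KPI profile created in step 1
# (assumed naming format; adjust to how your installation references the profile).
MZ_KPI_PROFILE_NAME=kpisales.SalesModel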
Start up Zookeeper, Kafka, and Spark.
Code Block title Set up environment variables
$ export SPARK_HOME=/opt/spark-3.5.0-bin-hadoop3-scala2.13
$ export KAFKA_HOME=/opt/kafka_2.13-3.3.2
$ export PATH=$KAFKA_HOME/bin:$PATH:/opt/mz_kpiapp/bin
Then, while located in $KAFKA_HOME, execute:
Code Block title Start Zookeeper and Kafka
$ bin/zookeeper-server-start.sh config/zookeeper.properties &
$ bin/kafka-server-start.sh config/server.properties
Run the following command to start Spark:
Code Block title Start Spark
$ start_master_worker.sh
Create the Kafka topics that are required by the KPI app. Each of the Spark executors needs to read from a separate Kafka partition, so each of the topics needs three partitions, i.e. the number of partitions for each topic must be identical to the value of the property SPARK_DEFAULT_PARALLELISM in kpi_params.sh.
Code Block title Create Kafka Topics
$ bin/kafka-topics.sh --create --topic kpi-input --bootstrap-server localhost:9092 --partitions 3
$ bin/kafka-topics.sh --create --topic kpi-output --bootstrap-server localhost:9092 --partitions 3
$ bin/kafka-topics.sh --create --topic kpi-error --bootstrap-server localhost:9092 --partitions 3
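To verify that the topics were created with the expected number of partitions, you can describe them with the standard kafka-topics.sh command (same broker address as above):
Code Block title Describe Kafka Topics
$ bin/kafka-topics.sh --describe --topic kpi-input --bootstrap-server localhost:9092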
Create the real-time workflow. In this guide we will use Pulse agents to simulate sales data coming from three different sources, EMEA, AMERICAS, and APAC.
- Add three Pulse agents and an Analysis agent.
Workflow - Pulse Agents
Configure the Pulse agents as follows:
- AMERICAS will send 1000 TPS - Set Time Unit to MILLISECONDS and Interval to 1
- EMEA will send 500 TPS - Set Time Unit to MILLISECONDS and Interval to 2
- APAC will send 250 TPS - Set Time Unit to MILLISECONDS and Interval to 4
To be able to identify the data, set the data to the region name.
Pulse agent configuration
The Pulse agents only send a simple event containing the name of the region; the other data used in the KPI calculations is generated in the connected Analysis agent.
The APL code below creates the input to KPI Management.
APL code
- Create a Kafka profile for the Kafka Producer agent. This agent will write to the kpi-input topic.
Kafka profile configuration - kpi-input
- Add a KPI Cluster In agent.
Workflow - KPI Cluster In agent
KPI Cluster In agent configuration
Configure it to use the KPI profile that you created in step 1, and add the Kafka profile that the agent will use to write to the kpi-input topic. This topic will be read by the KPI Management Spark application. The Analysis agent is added because the KPI Forwarding agent will send out KafkaExceptionUDR in case of errors in the Kafka communication (if the Route On Error option is selected). This example does not cover handling of those errors.
Workflow - Kafka agents
- Create a Kafka profile for the Kafka Collector agent. This agent will read from the kpi-output topic.
Kafka profile configuration - kpi-output
- Add a KPI Output Agent and configure it as follows. This agent will provide the KPI output:
Kafka Collector agent configuration
- Add a KPI Cluster Out agent.
Workflow - KPI Cluster Out agent
Configure the agent to use the KPI profile that you created in step 1.
KPI Cluster Out agent configuration
- Add another Analysis agent for debugging of the KPIs.
Final workflow configuration
Add the APL code below to the Analysis agent.
APL Code
Submit the Spark application to the cluster.
Code Block language bash title Submitting the Spark application
$ submit.sh
Open the Spark UI at http://localhost:8080/. You should see that
spark-kpi-app1
is running.
Spark UI
Click on the application and then Streaming at the top of the UI. You will see that the Input Rate is 0 records per second.
Spark UI - Streaming (no data)
Open the workflow configuration in the Workflow Monitor. Enable debugging and select events for the KPI Cluster Out agent and the Analysis agent that produces the debug output.
Start the workflow.
Switch back to the Spark UI and refresh the page. The streaming statistics should indicate incoming data.
Spark UI - Streaming
The calculated KPIs will be displayed in the debug output in the Workflow Monitor.
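If you want to inspect the raw KPI output outside the Workflow Monitor, you can also read the kpi-output topic directly with the standard Kafka console consumer (run from $KAFKA_HOME; adjust the broker address if yours differs):
Code Block title Read the kpi-output topic
$ bin/kafka-console-consumer.sh --topic kpi-output --bootstrap-server localhost:9092 --from-beginning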
Note! It will take a minute before the output is displayed due to the configuration of the
windowSize
property in the service model.