Spark applications must be configured with a set of Kafka topics that are either shared between multiple applications or dedicated to specific applications. The assigned topics must be created before you submit an application to the Spark cluster. Before you can create the topics, you must start the Zookeeper and Kafka services.
Starting Services
The prerequisites for starting services are the following:
...
Prepare scripts according to Preparing and Creating Scripts for KPI Management
...
...
Starting
...
Start the Spark cluster:
$ start_master_workers.sh ...
Submit the app:
$ submit.sh kpiapp ...
You can now confirm the status of the Spark cluster. Open a browser and go to http://<master host>:8080.
Spark UI
Creating Kafka Topics
Clusters
To start a cluster, follow these steps:
Start Zookeeper and Kafka
To start Zookeeper, run the following:
$ bin/zookeeper-server-start.sh config/zookeeper.properties
To start Kafka, run the following:
$ bin/kafka-server-start.sh config/server.properties
Create Kafka topics and partitions using
...
the scripts included in the Kafka installation.
The names of the topics must correspond to the Spark application configuration.
Note |
---|
In order for the Spark service KPI Application to work, the required number of partitions for each topic must be equal to the setting of the property |
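A partition count can be checked after the topic exists. The following is a minimal sketch, assuming that the summary line printed by `kafka-topics.sh --describe` contains a `PartitionCount:` field; the exact layout of that line can differ between Kafka versions, and the helper names `partition_count` and `check_partitions` are hypothetical. The expected value would be the setting of the property referred to in the note.

```shell
#!/bin/sh
# Extract the PartitionCount value from `kafka-topics.sh --describe` output.
# Assumption: the summary line contains "PartitionCount: <n>".
partition_count() {
    sed -n 's/.*PartitionCount:[[:space:]]*\([0-9][0-9]*\).*/\1/p'
}

# Compare the partition count read from stdin with the expected value ($1).
check_partitions() {
    expected="$1"
    actual=$(partition_count)
    if [ "$actual" = "$expected" ]; then
        echo "ok: $actual partitions"
    else
        echo "mismatch: expected $expected, found ${actual:-none}"
        return 1
    fi
}

# Illustrative input line only. With a running broker, pipe in the real output:
#   ./bin/kafka-topics.sh --describe --topic kpi-input \
#       --bootstrap-server localhost:9092 | check_partitions 6
printf 'Topic: kpi-input\tPartitionCount: 6\tReplicationFactor: 1\n' | check_partitions 6
```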
Use a replication factor that is greater than one (1) to make sure that data is replicated between Kafka brokers. This decreases the risk of losing data in case of issues with the brokers. This is how to create topics, assuming the current working directory is the Kafka software folder:
$ ./bin/kafka-topics.sh --create --topic <input topic> --bootstrap-server \
localhost:9092 --partitions <number of partitions> --replication-factor <number of replicas>
$ ./bin/kafka-topics.sh --create --topic <output topic> --bootstrap-server \
localhost:9092 --partitions <number of partitions> --replication-factor <number of replicas>
$ ./bin/kafka-topics.sh --create --topic <alarm topic> --bootstrap-server \
localhost:9092 --partitions <number of partitions> --replication-factor <number of replicas>
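Because the input, output, and alarm topics are created with the same options, the commands can also be generated in a loop. The following is a minimal sketch, not a supported script: the topic names, the partition count 6, and the replication factor 2 are placeholders, and the `RUN` and `BOOTSTRAP_SERVER` variables and both function names are hypothetical. By default it only prints the commands.

```shell
#!/bin/sh
# Sketch: print (or run) one kafka-topics.sh --create command per topic.
# All values below are placeholders for illustration.

BOOTSTRAP_SERVER="${BOOTSTRAP_SERVER:-localhost:9092}"

# Build the create command for one topic as a single line of text.
build_create_cmd() {
    topic="$1"; partitions="$2"; replicas="$3"
    printf './bin/kafka-topics.sh --create --topic %s --bootstrap-server %s --partitions %s --replication-factor %s\n' \
        "$topic" "$BOOTSTRAP_SERVER" "$partitions" "$replicas"
}

# Print the commands for all given topics; with RUN=1, execute them instead.
create_topics() {
    partitions="$1"; replicas="$2"; shift 2
    for topic in "$@"; do
        cmd=$(build_create_cmd "$topic" "$partitions" "$replicas")
        if [ "${RUN:-0}" = "1" ]; then
            sh -c "$cmd"
        else
            echo "$cmd"
        fi
    done
}

# Same partition count and replication factor for every topic.
create_topics 6 2 kpi-input kpi-output kpi-alarm
```

Run it from the Kafka software folder with `RUN=1` to execute the commands. A replication factor of 2 requires at least two brokers.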
Example - Creating Kafka topics, overriding retention settings
$ mzsh mzadmin/<password> kafka --service-key kafka1 \
./bin/kafka-topics.sh --create --topic kpi-output --partitions 6 --replication-factor 1 --config retention.ms=86400000
$ mzsh mzadmin/<password> kafka --service-key kafka1 \
./bin/kafka-topics.sh --create --topic kpi-input --partitions 6 --replication-factor 1 --config retention.ms=86400000
$ mzsh mzadmin/<password> kafka --service-key kafka1 \
./bin/kafka-topics.sh --create --topic kpi-alarm --partitions 6 --replication-factor 1 --config retention.ms=86400000
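The `retention.ms` value is given in milliseconds, so the 86400000 used in the example corresponds to 24 hours. The following small sketch shows the conversion for other retention periods; the helper name `hours_to_ms` is hypothetical.

```shell
#!/bin/sh
# Convert a retention period in hours to the milliseconds that retention.ms expects.
hours_to_ms() {
    echo $(( $1 * 60 * 60 * 1000 ))
}

# 24 hours, as in the example above.
retention_ms=$(hours_to_ms 24)
echo "--config retention.ms=${retention_ms}"
```

For a seven-day retention, `hours_to_ms 168` gives the value to pass instead.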
Run the following command to start Spark:
$ start_master_workers.sh ...
Submit the app to the Spark cluster:
$ submit.sh kpiapp ...
You can now confirm the status of the Spark cluster. Open a browser and go to http://<master host>:8080.
...
Spark UI
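The same check can be made from a terminal. The following is a minimal sketch, assuming that `curl` is installed and that the master web UI listens on port 8080 as described above; the `MASTER_HOST` variable and the function names are hypothetical.

```shell
#!/bin/sh
# Build the Spark master web UI address; the port defaults to 8080.
master_ui_url() {
    echo "http://$1:${2:-8080}"
}

# Succeed if the UI answers with a successful HTTP status within 5 seconds.
ui_is_up() {
    curl --silent --fail --max-time 5 --output /dev/null "$(master_ui_url "$1")"
}

# MASTER_HOST is a placeholder; set it to the actual master host.
MASTER_HOST="${MASTER_HOST:-localhost}"
if ui_is_up "$MASTER_HOST"; then
    echo "Spark master UI reachable at $(master_ui_url "$MASTER_HOST")"
else
    echo "Spark master UI not reachable at $(master_ui_url "$MASTER_HOST")"
fi
```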