Spark applications must be configured with a set of Kafka topics that are either shared between multiple applications or dedicated to specific applications. The assigned topics must be created before you submit an application to the Spark cluster. Before you can create the topics, you must start ZooKeeper and Kafka.
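Once ZooKeeper and Kafka are running (see below), you can check which topics already exist before creating new ones. The sketch assumes a single broker listening on localhost:9092:

Code Block
$ ./bin/kafka-topics.sh --list --bootstrap-server localhost:9092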

Starting Services

The prerequisites for starting services are the following:

...

Spark, ZooKeeper, and Kafka are installed.

...

Prerequisites:

...

...

Starting

...

  1. Start the Spark cluster:

    $ start_master_workers.sh ...

  2. Submit the app:

    $ submit.sh kpiapp ...
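As a quick sanity check after startup, assuming a standalone deployment on the local machine, the JDK's jps tool should list the Master and Worker JVMs of the Spark cluster (the PIDs below are illustrative):

Code Block
$ jps
12345 Master
12346 Worker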

...

 
Spark UI 

...

Clusters

To start a cluster, follow these steps:

  1. Start ZooKeeper and Kafka.
    To start ZooKeeper, run the following:

    bin/zookeeper-server-start.sh config/zookeeper.properties

    To start Kafka, run:

    bin/kafka-server-start.sh config/server.properties

  2. Create Kafka topics and partitions using the scripts included in the Kafka installation.
    The names of the topics must correspond to the Spark application configuration.

Note

For the Spark KPI Application to work, the number of partitions for each topic must be equal to the value of the property spark.default.parallelism in the Spark application configuration.
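For example, if each topic is created with 6 partitions, as in the examples below, the application configuration needs a matching value. Where the property is set depends on your deployment; a typical spark-defaults.conf entry (shown for illustration, an assumption about your setup) looks like this:

Code Block
# conf/spark-defaults.conf - must match the partition count of the Kafka topics
spark.default.parallelism   6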

...

Code Block
$ ./bin/kafka-topics.sh --create --topic <input topic> --bootstrap-server \
localhost:9092 --partitions <number of partitions> --replication-factor <number of replicas>
$ ./bin/kafka-topics.sh --create --topic <output topic> --bootstrap-server \
localhost:9092 --partitions <number of partitions> --replication-factor <number of replicas>
$ ./bin/kafka-topics.sh --create --topic <alarm topic> --bootstrap-server \
localhost:9092 --partitions <number of partitions> --replication-factor <number of replicas>
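To verify that a topic was created with the number of partitions required by the Note above, you can describe it. The broker address matches the commands above:

Code Block
$ ./bin/kafka-topics.sh --describe --topic <input topic> --bootstrap-server localhost:9092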

...

Example - Creating Kafka Topics


Code Block
./bin/kafka-topics.sh --create --topic kpi-output --bootstrap-server localhost:9092 --partitions 6 --replication-factor 1
./bin/kafka-topics.sh --create --topic kpi-input --bootstrap-server localhost:9092 --partitions 6 --replication-factor 1
./bin/kafka-topics.sh --create --topic kpi-alarm --bootstrap-server localhost:9092 --partitions 6 --replication-factor 1

...

Example - Creating Kafka topics with mzsh

...

Code Block
$ mzsh mzadmin/<password> kafka --service-key kafka1 \
--create --topic kpi-output --partitions 6 --replication-factor 1
$ mzsh mzadmin/<password> kafka --service-key kafka1 \
--create --topic kpi-input --partitions 6 --replication-factor 1
$ mzsh mzadmin/<password> kafka --service-key kafka1 \
--create --topic kpi-alarm --partitions 6 --replication-factor 1

...

Example - Creating Kafka topics, overriding retention settings

...

Code Block
$ mzsh mzadmin/<password> kafka --service-key kafka1 \
--create --topic kpi-output --partitions 6 --replication-factor 1 --config retention.ms=86400000
$ mzsh mzadmin/<password> kafka --service-key kafka1 \
--create --topic kpi-input --partitions 6 --replication-factor 1 --config retention.ms=86400000
$ mzsh mzadmin/<password> kafka --service-key kafka1 \
--create --topic kpi-alarm --partitions 6 --replication-factor 1 --config retention.ms=86400000
The equivalent commands using the Kafka scripts directly:

Code Block
./bin/kafka-topics.sh --create --topic kpi-output --bootstrap-server localhost:9092 --partitions 6 --replication-factor 1 --config retention.ms=86400000
./bin/kafka-topics.sh --create --topic kpi-input --bootstrap-server localhost:9092 --partitions 6 --replication-factor 1 --config retention.ms=86400000
./bin/kafka-topics.sh --create --topic kpi-alarm --bootstrap-server localhost:9092 --partitions 6 --replication-factor 1 --config retention.ms=86400000
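Here, retention.ms=86400000 corresponds to 24 hours (24 x 60 x 60 x 1000 ms). If a topic already exists, the same setting can be applied afterwards with the standard kafka-configs.sh tool; the broker address localhost:9092 is an assumption:

Code Block
$ ./bin/kafka-configs.sh --bootstrap-server localhost:9092 --alter \
--entity-type topics --entity-name kpi-input --add-config retention.ms=86400000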

  3. Run the following command to start Spark:

    $ start_master_workers.sh ...

  4. Submit the app to the Spark cluster:

    $ submit.sh kpiapp ...

  5. Confirm the status of the Spark cluster. Open a browser and go to http://<master host>:8080, or query the master from the command line as sketched below.
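A minimal command-line check, assuming the standalone master's web UI is on its default port 8080 (the Spark standalone master also serves its status as JSON at /json):

Code Block
$ curl -s http://<master host>:8080/json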

...

 
Spark UI