Apache Storm starter

Run WordCountTopology in Local Mode and (Local) Cluster Mode. Version: 1.2.2. Env: VMs running CentOS 7. Example: https://github.com/apache/storm/blob/v1.2.2/examples/storm-starter/src/jvm/org/apache/storm/starter/WordCountTopology.java


Download Storm

Make sure you have the storm-starter code available on your machine. If you have already downloaded Storm from http://storm.apache.org/downloads.html, you will find the storm-starter code under your apache-storm-<version>/examples/ directory. Alternatively, Git/GitHub beginners can clone the Storm repository to get the latest storm-starter code and change into the directory that contains it, but make sure the code matches the version of Storm you are running.
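A clone pinned to the matching release could look like the sketch below. It assumes release tags are named v<version>, as the v1.2.2 in the example URL suggests. The function name and the storm-<version> target directory are arbitrary choices for illustration.

```shell
# Clone only the release tag that matches the installed Storm version,
# then change into the storm-starter example directory.
# Assumption: tags follow the v<version> pattern (e.g. v1.2.2).
clone_storm_starter() {
  version="$1"
  git clone --branch "v${version}" --depth 1 \
    https://github.com/apache/storm.git "storm-${version}" &&
    cd "storm-${version}/examples/storm-starter"
}
```

Calling clone_storm_starter 1.2.2 should leave the shell inside storm-1.2.2/examples/storm-starter, ready for the Maven build.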

first step

cd /apache-storm-1.2.2/examples/storm-starter
mvn clean install -DskipTests=true

Local Mode

CMD: mvn compile exec:java -Dstorm.topology=org.apache.storm.starter.WordCountTopology

fixed: No module named storm and AttributeError: ‘module’ object has no attribute ‘BasicBolt’

Reason: the Python script that SplitSentence runs cannot find its dependency. At first I thought "import storm" could be satisfied with 'pip install storm', but that turns out to be a different package. What finally worked: download https://github.com/apache/storm/blob/master/storm-multilang/python/src/main/resources/resources/storm.py and put it in any folder you want, for example /apache-storm-1.2.2/examples/storm-starter/multilang/resources/. You may find another storm.py at /apache-storm-1.2.2/bin/storm.py, but don't use that one: it doesn't contain BasicBolt. I'm not sure what its purpose is.
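Since two different files are both called storm.py, a quick check can tell them apart before they go on the Python path. This is only a sketch: it assumes the right file declares BasicBolt as a top-level class, and the function name has_basic_bolt is made up for illustration.

```shell
# Succeeds only if the given file declares a top-level BasicBolt class,
# which is what the SplitSentence script needs from "import storm".
has_basic_bolt() {
  grep -q '^class BasicBolt' "$1"
}
```

For example, has_basic_bolt /apache-storm-1.2.2/examples/storm-starter/multilang/resources/storm.py && echo "usable" should print "usable" for the multilang file and nothing for bin/storm.py.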

then change the main function to:

 public static void main(String[] args) throws Exception {

    // Point the Python subprocess at the folder that contains the multilang storm.py.
    SplitSentence pythonSplit = new SplitSentence();
    Map<String, String> env = new HashMap<>();
    env.put("PYTHONPATH", "/apache-storm-1.2.2/examples/storm-starter/multilang/resources/");
    // Without this call the map above is never handed to the bolt.
    pythonSplit.setEnv(env);

    TopologyBuilder builder = new TopologyBuilder();

    builder.setSpout("spout", new RandomSentenceSpout(), 5);

    builder.setBolt("split", pythonSplit, 8).shuffleGrouping("spout");
    builder.setBolt("count", new WordCount(), 12).fieldsGrouping("split", new Fields("word"));

    Config conf = new Config();

    if (args != null && args.length > 0) {
      // Cluster mode: args[0] is the topology name.
      StormSubmitter.submitTopologyWithProgressBar(args[0], conf, builder.createTopology());
    } else {
      // Local mode: run in-process, wait 10 minutes, then shut down.
      LocalCluster cluster = new LocalCluster();
      cluster.submitTopology("word-count", conf, builder.createTopology());

      Thread.sleep(10 * 60 * 1000);
      cluster.shutdown();
    }
  }

Basically, we just need to set PYTHONPATH to the folder that contains 'storm.py'. I also changed Thread.sleep(10000) from 10 seconds to 10 minutes, because the LocalCluster terminated before the spout and bolts had started. I guess that's because I tested in VirtualBox, where the cluster starts up very slowly.

(Local) Cluster Mode

config: /apache-storm-1.2.2/conf/storm.yaml

storm.local.dir: "storm-local"
storm.log4j2.conf.dir: "log4j2"
storm.zookeeper.servers:
 - "localhost"
storm.zookeeper.port: 2181
storm.zookeeper.root: "/storm"
storm.zookeeper.session.timeout: 20000
storm.zookeeper.connection.timeout: 15000
storm.zookeeper.retry.times: 5
storm.zookeeper.retry.interval: 1000
storm.zookeeper.retry.intervalceiling.millis: 30000
storm.zookeeper.auth.user: null
storm.zookeeper.auth.password: null

nimbus.seeds: ["localhost"]
ui.port: 8080

then run:

storm dev-zookeeper 
storm nimbus
storm supervisor
storm ui
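The commands above only start the daemons; the topology itself still has to be submitted, and that step is not shown here. A sketch under assumptions: the jar path target/storm-starter-1.2.2.jar is what the Maven build would be expected to produce, the topology name wordcount is inferred from the worker log directory name, and submit_wordcount is a made-up helper name.

```shell
# Submit WordCountTopology to the running cluster.
# $1: path to the storm-starter jar (assumed default below)
# $2: topology name, passed to main() as args[0]
submit_wordcount() {
  jar="${1:-target/storm-starter-1.2.2.jar}"
  name="${2:-wordcount}"
  storm jar "$jar" org.apache.storm.starter.WordCountTopology "$name"
}
```

Run it from the storm-starter directory after the build. Because a topology name is passed, main() takes the StormSubmitter branch instead of starting a LocalCluster.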

Open the Storm UI portal and click on the port number to debug. The full link it generates obviously isn't correct, so I just extracted the relative path and checked the log content manually: vim /apache-storm-1.2.2/logs/workers-artifacts/wordcount-1-1558059690/6702/worker.log. There I found an exception: Error on initialization of server mk-worker
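Rather than opening each worker log by hand, the logs can be searched for the message directly. A minimal sketch, assuming the logs live under logs/workers-artifacts as in the path above; find_worker_errors is a made-up helper name.

```shell
# Print every worker.log under the given directory that contains the pattern.
# $1: log root, e.g. /apache-storm-1.2.2/logs/workers-artifacts
# $2: fixed string to search for (defaults to the error seen here)
find_worker_errors() {
  log_root="$1"
  pattern="${2:-Error on initialization of server mk-worker}"
  find "$log_root" -type f -name worker.log -exec grep -lF -- "$pattern" {} +
}
```

For example, find_worker_errors /apache-storm-1.2.2/logs/workers-artifacts lists only the worker logs that hit the initialization error.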

fixed: Error on initialization of server mk-worker

Cause: a folder permission error on storm.local.dir: "storm-local". Solved by: chmod 777 /apache-storm-1.2.2/storm-local/
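To confirm that permissions are the problem before or after changing them, the directory can be tested from the account that starts the supervisor. A small sketch; check_local_dir is a made-up helper name.

```shell
# Report whether the current user can enter and write to the directory.
check_local_dir() {
  dir="$1"
  if [ -d "$dir" ] && [ -w "$dir" ] && [ -x "$dir" ]; then
    echo "ok: $dir"
  else
    echo "not usable by $(id -un): $dir" >&2
    return 1
  fi
}
```

Running check_local_dir /apache-storm-1.2.2/storm-local/ as the user that runs the Storm daemons should print "ok" once the fix is in place.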