Apache storm starter
run WordCountTopology in Local Mode and (Local) Cluster Mode version: 1.2.2 env: vms centos7 example:https://github.com/apache/storm/blob/v1.2.2/examples/storm-starter/src/jvm/org/apache/storm/starter/WordCountTopology.java
Prerequisites
Download storm
make sure you have the storm-starter code available on your machine. If you have already downloaded storm from http://storm.apache.org/downloads.html then you will find the storm-starter code under your apache-storm-<version>/examples/ directory. Alternatively, Git/GitHub beginners may want to use the following command to download the latest storm-starter code and change to the new directory that contains the downloaded code, but make sure you have the same version of storm running.
first step
cd /apache-storm-1.2.2/examples/storm-starter
mvn clean install -DskipTests=true
Local Mode
CMD:
mvn compile exec:java -Dstorm.topology=org.apache.storm.starter.WordCountTopology
fixed: No module named storm and AttributeError: ‘module’ object has no attribute ‘BasicBolt’
reason: the python script which SplitSentence uses cannot find dependency, and initially I thought the “import storm” simply using ‘pip install storm’, but it turns out not the one it’s using, finally solved by: Download https://github.com/apache/storm/blob/master/storm-multilang/python/src/main/resources/resources/storm.py put it into any folder you want, for example: /apache-storm-1.2.2/examples/storm-starter/multilang/resources/ , you may find another storm.py in /apache-storm-1.2.2/bin/storm.py, but don’t use that one, I’m not sure the purpose of it, it doesn’t contain BasicBolt
then change the main function to:
public static void main(String[] args) throws Exception {
SplitSentence pythonSplit = new SplitSentence();
Map env = new HashMap();
env.put("PYTHONPATH", "/apache-storm-1.2.2/examples/storm-starter/multilang/resources/");
pythonSplit.setEnv(env);
TopologyBuilder builder = new TopologyBuilder();
builder.setSpout("spout", new RandomSentenceSpout(), 5);
builder.setBolt("split",pythonSplit, 8).shuffleGrouping("spout");
builder.setBolt("count", new WordCount(), 12).fieldsGrouping("split", new Fields("word"));
Config conf = new Config();
conf.setDebug(true);
if (args != null && args.length > 0) {
conf.setNumWorkers(3);
StormSubmitter.submitTopologyWithProgressBar(args[0], conf, builder.createTopology());
}
else {
conf.setMaxTaskParallelism(3);
LocalCluster cluster = new LocalCluster();
cluster.submitTopology("word-count", conf, builder.createTopology());
Thread.sleep(600000);
cluster.shutdown();
}
}
basically we just need to set python path point to the folder of ‘storm.py’ I’ve also changed Thread.sleep(10000) 10sec to 10mins, because I find the localcluster auto terminated before spout and bolt starting, I guess that’s because I tested it using virtualbox, so the cluster startup very slow.
(Local) Cluster Mode
config: /apache-storm-1.2.2/conf/storm.yaml
storm.local.dir: "storm-local"
storm.log4j2.conf.dir: "log4j2"
storm.zookeeper.servers:
- "localhost"
storm.zookeeper.port: 2181
storm.zookeeper.root: "/storm"
storm.zookeeper.session.timeout: 20000
storm.zookeeper.connection.timeout: 15000
storm.zookeeper.retry.times: 5
storm.zookeeper.retry.interval: 1000
storm.zookeeper.retry.intervalceiling.millis: 30000
storm.zookeeper.auth.user: null
storm.zookeeper.auth.password: null
nimbus.seeds: ["localhost"]
ui.host: 0.0.0.0
ui.port: 8080
then run::
storm dev-zookeeper
storm nimbus
storm supervisor
storm ui
screenshot open storm ui portal and click on the port number to debug, obviously the full link of it isn’t correct, so I just extract the relative path and check it manually check the log content vim /apache-storm-1.2.2/logs/workers-artifacts/wordcount-1-1558059690/6702/worker.log then I found an exception ::Error on initialization of server mk-worker
fixed: Error on initialization of server mk-worker
folder permission error for storm.local.dir: “storm-local”
solved by
chmod 777 /apache-storm-1.2.2/storm-local/
Comments