Hey all, I recently finished upgrading our production Storm environment to Storm 1.0.2, with some troubling results.
The same topology that idled at ~60% CPU on 0.10.0 now takes ~300% (8-core machines). This topology runs on a single machine (to avoid any network hops) with 800+ active tasks. It should finish processing in about 1 s, and up until now we had no problem meeting that goal.

The first thing I noticed was that the "topology.disruptor.wait.strategy" configuration was removed. Since our topology targets low latency, we were using the BlockingWaitStrategy for the disruptors. I thought that with the config property removed this was no longer an option, until I looked at the source and found that I can still get the blocking strategy by setting topology.disruptor.wait.timeout.millis to 0; with that value the disruptor automatically falls back to a blocking wait. But even after setting this, our topology still consumes far more CPU than before.

Is anyone else seeing similar effects after upgrading to a >= 1.0.0 version? I also tried raising topology.disruptor.batch.timeout.millis above its default of 1 ms, with no significant effect on CPU usage. By the way, should that really default to 1 ms? I've pasted a sketch of how we set these options below my signature, for reference.

Thanks,
Re'em
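For reference, this is roughly how we set the two options when submitting (a sketch, not our real code: the class name, topology name, and buildTopology() are placeholders; the config keys are set by their literal string names as mentioned above):

import org.apache.storm.Config;
import org.apache.storm.StormSubmitter;
import org.apache.storm.generated.StormTopology;

public class LowLatencySubmit {
    public static void main(String[] args) throws Exception {
        Config conf = new Config();

        // With the topology.disruptor.wait.strategy option gone in 1.x,
        // a wait timeout of 0 makes the disruptor fall back to a
        // blocking wait strategy (per the 1.0.2 source).
        conf.put("topology.disruptor.wait.timeout.millis", 0);

        // Defaults to 1 ms; we tried larger values here with no
        // significant effect on CPU usage.
        conf.put("topology.disruptor.batch.timeout.millis", 1);

        // buildTopology() stands in for the wiring of our 800+ tasks.
        StormTopology topology = buildTopology();
        StormSubmitter.submitTopology("our-topology", conf, topology);
    }

    private static StormTopology buildTopology() {
        // ... spout/bolt wiring elided ...
        throw new UnsupportedOperationException("placeholder");
    }
}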
