Hi All, I have a simple stateless transformation using Dstreams (stuck with the old API for one of the Application). The pseudo code is rough like this
dstream.map().reduce().forEachRdd(rdd -> { rdd.collect(),forEach(); // Is this necessary ? Does execute fine but a bit slow }) I understand collect collects the results back to the driver but is that necessary? can I just do something like below? I believe I tried both and somehow the below code didn't output any results (It can be issues with my env. I am not entirely sure) but I just would like some clarification on .collect() since it seems to slow things down for me. dstream.map().reduce().forEachRdd(rdd -> { rdd.forEach(() -> {} ); // }) Thanks!