Sorry, this was meant to go to the Spark users list. Please ignore this thread. On Fri, Jan 23, 2015 at 2:25 PM, Chen Song <chen.song...@gmail.com> wrote:
> I am running into some problems with Spark Streaming when reading from
> Kafka. I used Spark 1.2.0 built on CDH5.
>
> The example is based on:
> https://github.com/apache/spark/blob/master/examples/scala-2.10/src/main/scala/org/apache/spark/examples/streaming/KafkaWordCount.scala
>
> * It works with the default implementation:
>
> val topicMap = topics.split(",").map((_, numThreads.toInt)).toMap
> val lines = KafkaUtils.createStream(ssc, zkQuorum, group, topicMap).map(_._2)
>
> * However, when I changed it to parallel receiving, as shown below:
>
> val topicMap = topics.split(",").map((_, 1)).toMap
> val parallelInputs = (1 to numThreads.toInt) map { _ =>
>   KafkaUtils.createStream(ssc, zkQuorum, group, topicMap)
> }
> ssc.union(parallelInputs)
>
> After the change, the job stages just hang and never finish. It looks
> like no action is triggered on the streaming job. When I check the
> "Streaming" tab, it shows the message below:
>
> Batch Processing Statistics
> No statistics have been generated yet.
>
> Am I doing anything wrong in the parallel receiving part?
>
> --
> Chen Song
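For reference, a minimal self-contained sketch of the parallel-receiver pattern described above might look like the following. The object name, the 2-second batch interval, and the word-count output are illustrative assumptions, not taken from the thread, and this is not a confirmed diagnosis of the hang. It does cover two details the quoted snippet omits, either of which matches the "No statistics have been generated yet" symptom: the unioned DStream still needs an output operation (print, saveAs*, foreachRDD, ...) before any work is scheduled, and each receiver permanently occupies one executor core, so the application must be given more cores than it has receivers.

import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.StreamingContext._  // pair-DStream implicits on Spark 1.2
import org.apache.spark.streaming.kafka.KafkaUtils

// Hypothetical driver object; name and batch interval are illustrative.
object ParallelKafkaWordCount {
  def main(args: Array[String]): Unit = {
    val Array(zkQuorum, group, topics, numThreads) = args

    val sparkConf = new SparkConf().setAppName("ParallelKafkaWordCount")
    val ssc = new StreamingContext(sparkConf, Seconds(2))

    // One thread per topic inside each receiver; parallelism comes from
    // creating numThreads separate receivers below.
    val topicMap = topics.split(",").map((_, 1)).toMap

    // Each createStream call registers one receiver, and each receiver
    // pins one executor core for the lifetime of the job.
    val parallelInputs = (1 to numThreads.toInt).map { _ =>
      KafkaUtils.createStream(ssc, zkQuorum, group, topicMap)
    }

    // Union the receivers into a single DStream of (key, message) pairs
    // and keep only the message payload.
    val lines = ssc.union(parallelInputs).map(_._2)

    // Without at least one output operation, nothing is ever scheduled
    // and the Streaming tab shows no batch statistics.
    lines.flatMap(_.split(" ")).map((_, 1L)).reduceByKey(_ + _).print()

    ssc.start()
    ssc.awaitTermination()
  }
}

For example, submitting with a master of local[4] and numThreads = 2 leaves two cores free for batch processing; with local[2] and two receivers, every core is held by a receiver and batches queue up indefinitely.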