Thanks. Sorry, the last section was supposed to be:
streams.par.foreach { case (name, stream) =>
  stream.foreachRDD { rdd =>
    val df = sqlContext.jsonRDD(rdd)
    df.insertInto(name)
  }
}
ssc.start()
On Fri, Jul 24, 2015 at 10:39 AM, Dean Wampler wrote:
You don't need the "par" (parallel) versions of the Scala collections,
actually. Recall that you are building a pipeline in the driver; it
doesn't start running cluster tasks until ssc.start() is called, at which
point Spark will figure out the task parallelism. In fact, you might as
well do th
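
To illustrate the point above: a plain foreach over the streams should work just as well, since this loop only registers output operations on the driver and Spark parallelizes the real work across the cluster after ssc.start(). A minimal sketch, assuming the same streams collection, sqlContext, and ssc from the snippet earlier (jsonRDD and insertInto are the Spark 1.x APIs used above):

```scala
// Plain (non-parallel) iteration; each pass just wires up an output
// operation on the driver, so .par buys nothing here.
streams.foreach { case (tableName, stream) =>
  stream.foreachRDD { rdd =>
    // Spark 1.x: parse each batch's JSON records into a DataFrame
    val df = sqlContext.jsonRDD(rdd)
    // Append the batch into the table named for this stream
    df.insertInto(tableName)
  }
}

// Cluster execution begins here, not in the loop above.
ssc.start()
ssc.awaitTermination()
```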