Thanks. Sorry, the last section was supposed to be:
streams.par.foreach { case (name, stream) =>
  stream.foreachRDD { rdd =>
    val df = sqlContext.jsonRDD(rdd)
    df.insertInto(name)
  }
}
ssc.start()
On Fri, Jul 24, 2015 at 10:39 AM, Dean Wampler wrote:
You don't need the "par" (parallel) versions of the Scala collections,
actually. Recall that you are building a pipeline in the driver; it
doesn't start running cluster tasks until ssc.start() is called, at which
point Spark will figure out the task parallelism. In fact, you might as
well do th
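
To illustrate the point above: a plain foreach over the streams should work just as well, since this loop only registers output operations on the driver and Spark parallelizes the real work across the cluster after ssc.start(). A minimal sketch, assuming the same streams collection, sqlContext, and ssc from the snippet earlier (jsonRDD and insertInto are the Spark 1.x APIs used above):

```scala
// Plain (non-parallel) iteration; each pass just wires up an output
// operation on the driver, so .par buys nothing here.
streams.foreach { case (tableName, stream) =>
  stream.foreachRDD { rdd =>
    // Spark 1.x: parse each batch's JSON records into a DataFrame
    val df = sqlContext.jsonRDD(rdd)
    // Append the batch into the table named for this stream
    df.insertInto(tableName)
  }
}

// Cluster execution begins here, not in the loop above.
ssc.start()
ssc.awaitTermination()
```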