Thanks for your response! I found that too, and it does the trick! Here's the
refined code:
val inputDStream = ...
val keyedDStream = inputDStream.map(...) // use sensorId as key
val partitionedDStream = keyedDStream.transform(rdd =>
  rdd.partitionBy(new MyPartitioner(...)))
val stateDStream = partitionedDStream.updateStateByKey(...)
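For what it's worth, the custom partitioner itself only needs deterministic key-to-partition logic. Below is a minimal sketch of that logic in plain Scala (the class name `SensorPartitioner` and the hashing scheme are illustrative assumptions, not the poster's actual `MyPartitioner`); in a real Spark job the class would extend `org.apache.spark.Partitioner` and override these members:

```scala
// Hypothetical partitioner logic for sensor keys. In a Spark job this
// class would extend org.apache.spark.Partitioner; it is shown standalone
// here so the key-routing logic is visible without a cluster.
class SensorPartitioner(val numPartitions: Int) {
  require(numPartitions > 0, "need at least one partition")

  // Deterministic: the same key always maps to the same partition, which
  // is what keeps a given sensor's state on a stable partition across batches.
  def getPartition(key: Any): Int = {
    val raw = key.hashCode % numPartitions
    if (raw < 0) raw + numPartitions else raw // hashCode can be negative
  }
}
```

Because `getPartition` is deterministic, every batch routes a given sensorId to the same partition, so the state RDD and the new batch can be co-located without a shuffle of the state.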
Just a guess: updateStateByKey has overloaded variants that take a partitioner
as a parameter. Could that help?
-----Original Message-----
From: qihong [mailto:qc...@pivotal.io]
Sent: Tuesday, September 09, 2014 9:13 PM
To: u...@spark.incubator.apache.org
Subject: Re: how to setup steady state stream
Thanks for your response. I do have something like:
val inputDStream = ...
val keyedDStream = inputDStream.map(...) // use sensorId as key
val partitionedDStream = keyedDStream.transform(rdd =>
  rdd.partitionBy(new MyPartitioner(...)))
val stateDStream = partitionedDStream.updateStateByKey[...](updateFunc)
Using your own partitioner didn't work?
e.g.
YourRDD.partitionBy(new HashPartitioner(your number))
xj @ Tokyo
On Wed, Sep 10, 2014 at 12:03 PM, qihong wrote:
> I'm working on a DStream application. The input are sensors' measurements,
> the data format is
>
> There are 10 thousand sensors,