hi,
Here is the problem description. I wrote a custom NetworkReceiver to receive
image data from a camera, and I have confirmed that all the data is received correctly.
1) When data is received, only the node running the NetworkReceiver runs at full
speed, while the other nodes stay idle. My Spark cluster has 6 nodes.
2)And every image
I do some image matting in Spark Streaming, and I put the background images in
a broadcast variable:
RDD[String,Qimage] => a sorted Array[Qimage]
val qingbg = broadcastbg.value.collect.sortWith((a,b) => a._1.toInt <
b._1.toInt).map(data => data._2)
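
The sorting step above can be sketched on plain Scala collections, without Spark. This is only a minimal illustration: `Img` is a hypothetical stand-in for the image type (the real Qimage class is not shown in this thread), and `sortedBackgrounds` is a name made up for the example.

```scala
// Hypothetical stand-in for the image type used in the thread.
final case class Img(name: String)

// Sort (index, image) pairs by the numeric value of the string key,
// then keep only the images, as the one-liner above does.
def sortedBackgrounds(pairs: Seq[(String, Img)]): Array[Img] =
  pairs.sortBy { case (idx, _) => idx.toInt }.map(_._2).toArray

// The keys are strings, so "10" must sort after "2" (numeric, not lexicographic).
val pairs = Seq("10" -> Img("bg10"), "2" -> Img("bg2"), "1" -> Img("bg1"))
val sorted = sortedBackgrounds(pairs)
```

If the background set is small, it may be simpler to do this sort once on the driver and broadcast the resulting array, so that tasks only do an index lookup.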
When an image comes, I want to get its background imag
Thanks for the reply~~
I have solved the problem and found the cause: I was using the Master
node to upload files to HDFS, and this may take up a lot of the Master's
network resources. When I switched to uploading the files from another
computer outside the cluster, I got the correct result.
QingF
d them "move" / "rename" them into the monitored directory.
> That makes it "atomic". This is mentioned in the API docs of
> fileStream<http://spark.apache.org/docs/0.9.1/api/streaming/index.html#org.apache.spark.streaming.StreamingContext>.
>
> TD
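
The "write elsewhere, then move/rename" pattern TD describes can be sketched as follows. This is a minimal illustration on a local filesystem using java.nio, not the setup from this thread: the directory names, the file name, and the helper `publishAtomically` are all hypothetical. On HDFS the analogous step would presumably be a rename via the Hadoop FileSystem API from a staging directory into the monitored one.

```scala
import java.nio.file.{Files, Path, StandardCopyOption}

// Write the data into a staging directory first, then move it into the
// monitored directory in a single step, so a reader that lists the
// monitored directory never sees a half-written file.
def publishAtomically(bytes: Array[Byte], staging: Path, monitored: Path, name: String): Path = {
  val tmp = staging.resolve(name + ".tmp")
  Files.write(tmp, bytes)
  // ATOMIC_MOVE requires source and target to be on the same filesystem.
  Files.move(tmp, monitored.resolve(name), StandardCopyOption.ATOMIC_MOVE)
}

// Hypothetical directories, created under a temp dir for the example.
val root = Files.createTempDirectory("stream-demo")
val staging = Files.createDirectory(root.resolve("staging"))
val monitored = Files.createDirectory(root.resolve("monitored"))
val published = publishAtomically(Array[Byte](1, 2, 3), staging, monitored, "img-001.png")
```

The staging directory should not be the monitored one; otherwise the stream could still pick up the temporary file while it is being written.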
When I put 200 PNG files into HDFS, I found that Spark Streaming could detect
all 200 files, but the sum of rdd.count() is less than 200, always between 130
and 170. I don't know why... Is this a bug?
PS: When I put the 200 files into HDFS before the streaming job runs, it gets
the correct count and the right result.
Here is