Spark Streaming NetworkReceiver problems

2014-06-05 Thread zzzzzqf12345
Hi, here is the problem description. I wrote a custom NetworkReceiver to receive image data from a camera, and I have confirmed that all the data is received correctly. 1) When data is received, only the NetworkReceiver node runs at full speed, while the other nodes stay idle; my Spark cluster has 6 nodes. 2) And every image

rdd.map() can't pass parameters

2014-05-20 Thread zzzzzqf12345
I do some image matting on Spark Streaming, and I put the background images in a broadcast variable, RDD[String,Qimage] => a sorted Array[Qimage]: val qingbg = broadcastbg.value.collect.sortWith((a,b) => a._1.toInt < b._1.toInt).map(data => data._2) When an image comes in, I want to get its background imag

Re: streaming on HDFS detects all new files, but the sum of all the rdd.count() values does not equal the number detected

2014-05-13 Thread zzzzzqf12345
Thanks for the reply. I solved the problem and found the cause: I was using the Master node to upload the files to HDFS, and this may take up a lot of the Master's network resources. When I switched to uploading the files from another computer that is not part of the cluster, I got the correct result. QingF

Re: streaming on HDFS detects all new files, but the sum of all the rdd.count() values does not equal the number detected

2014-05-12 Thread zzzzzqf12345
d them "move" / "rename" them into the monitored directory. > That makes it "atomic". This is mentioned in the API docs of > fileStream <http://spark.apache.org/docs/0.9.1/api/streaming/index.html#org.apache.spark.streaming.StreamingContext>. > > TD
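
A minimal sketch of the "write elsewhere, then move/rename into the monitored directory" approach mentioned in the reply above. It is only an illustration under stated assumptions: it uses java.nio on a local filesystem rather than HDFS, and the object name AtomicPublish, the method publish, and the directory and file names are all hypothetical. On HDFS the equivalent step would presumably be a rename through the Hadoop FileSystem API instead of Files.move.

```scala
import java.nio.file.{Files, Path, StandardCopyOption}

// Hypothetical helper, not part of Spark or Hadoop.
object AtomicPublish {
  // Write the payload into a staging directory first, then move it into the
  // monitored directory in a single step, so a directory watcher never
  // observes a half-written file.
  // Assumption: both directories live on the same filesystem; otherwise
  // ATOMIC_MOVE fails with AtomicMoveNotSupportedException.
  def publish(stagingDir: Path, monitoredDir: Path, name: String, payload: Array[Byte]): Path = {
    Files.createDirectories(stagingDir)
    Files.createDirectories(monitoredDir)
    val staged = stagingDir.resolve(name)
    Files.write(staged, payload) // the slow part happens outside the monitored directory
    Files.move(staged, monitoredDir.resolve(name), StandardCopyOption.ATOMIC_MOVE)
  }
}
```

A call such as AtomicPublish.publish(staging, incoming, "frame-0001.png", bytes) returns the final path inside the monitored directory; the file only becomes visible there once it is complete.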

streaming on HDFS detects all new files, but the sum of all the rdd.count() values does not equal the number detected

2014-05-11 Thread zzzzzqf12345
When I put 200 PNG files into HDFS, I found that Spark Streaming could detect all 200 files, but the sum of rdd.count() is less than 200, always between 130 and 170. I don't know why. Is this a bug? PS: When I put the 200 files into HDFS before the streaming job runs, it gets the correct count and the right result. Here is