Re: Spark Streaming: Some issues (Could not compute split, block —— not found) and questions

2015-08-25 Thread Akhil Das
indow). > > But even when trying to do 5 minute windows, we have issues with "Could not > compute split, block —— not found". This is being run on a YARN cluster and > it seems like the executors are getting killed even though they should have > plenty of memory. > >

Spark Streaming: Some issues (Could not compute split, block —— not found) and questions

2015-08-19 Thread jlg
gates (there are a lot of repeated keys across this time frame, and we want to combine them all -- we do this using reduceByKeyAndWindow). But even when trying to do 5 minute windows, we have issues with "Could not compute split, block —— not found". This is being run on a YARN cluster an
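The pattern this poster describes (combining repeated keys across a window with `reduceByKeyAndWindow`) can be sketched roughly as below. This is not the poster's actual code: the input source, host name, paths, and intervals are assumptions. Two details matter for the "Could not compute split, block not found" symptom: window operations require a checkpoint directory, and a storage level that spills to disk and replicates keeps received blocks alive when an executor is killed.

```scala
import org.apache.spark.SparkConf
import org.apache.spark.storage.StorageLevel
import org.apache.spark.streaming.{Minutes, Seconds, StreamingContext}

object WindowedAggregation {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("WindowedAggregation")
    val ssc = new StreamingContext(conf, Seconds(10))
    // Windowed stateful operations require a checkpoint directory.
    ssc.checkpoint("hdfs:///tmp/streaming-checkpoint") // hypothetical path

    // Stand-in for the real input stream; a serialized, replicated storage
    // level lets blocks survive the loss of a single executor.
    val lines = ssc.socketTextStream("host", 9999, StorageLevel.MEMORY_AND_DISK_SER_2)

    // Combine repeated keys across a 5-minute window, sliding every 10 s.
    val counts = lines.map(w => (w, 1L))
      .reduceByKeyAndWindow(_ + _, _ - _, Minutes(5), Seconds(10))
    counts.print()

    ssc.start()
    ssc.awaitTermination()
  }
}
```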

Re: "Could not compute split, block not found" in Spark Streaming Simple Application

2015-04-13 Thread Saiph Kappa
Whether I use 1 or 2 machines, the results are the same. Here are the results I got using 1 and 2 receivers with 2 machines: 2 machines, 1 receiver: sbt/sbt "run-main Benchmark 1 machine1 1000" 2>&1 | grep -i "Total delay\|record" 15/04/13 16:41:34 INFO JobScheduler: Total delay: 0.15

Re: "Could not compute split, block not found" in Spark Streaming Simple Application

2015-04-09 Thread Tathagata Das
Are you running # of receivers = # machines? TD On Thu, Apr 9, 2015 at 9:56 AM, Saiph Kappa wrote: > Sorry, I was getting those errors because my workload was not sustainable. > > However, I noticed that, by just running the spark-streaming-benchmark ( > https://github.com/tdas/spark-streaming-

Re: "Could not compute split, block not found" in Spark Streaming Simple Application

2015-04-09 Thread Saiph Kappa
Sorry, I was getting those errors because my workload was not sustainable. However, I noticed that, by just running the spark-streaming-benchmark ( https://github.com/tdas/spark-streaming-benchmark/blob/master/Benchmark.scala ), I get no difference in the execution time, number of processed record

Re: "Could not compute split, block not found" in Spark Streaming Simple Application

2015-03-27 Thread Tathagata Das
If it is deterministically reproducible, could you generate full DEBUG-level logs from the driver and the workers and send them to me? Basically I want to trace through what is happening to the block that is not being found. And can you tell me which cluster manager you are using? Spark Standalone, Meso

"Could not compute split, block not found" in Spark Streaming Simple Application

2015-03-27 Thread Saiph Kappa
Hi, I am just running this simple example with machineA: 1 master + 1 worker machineB: 1 worker « val ssc = new StreamingContext(sparkConf, Duration(1000)) val rawStreams = (1 to numStreams).map(_ => ssc.rawSocketStream[String](host, port, StorageLevel.MEMORY_ONLY_SER)).toArray val uni
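The snippet above is cut off mid-line. A hedged reconstruction of its shape (the union step and the placeholder values are assumptions, loosely based on the Benchmark.scala file linked elsewhere in this thread) would be:

```scala
import org.apache.spark.SparkConf
import org.apache.spark.storage.StorageLevel
import org.apache.spark.streaming.{Duration, StreamingContext}

// Sketch only: numStreams, host and port are placeholders.
val sparkConf = new SparkConf().setAppName("Benchmark")
val numStreams = 1
val (host, port) = ("machine1", 9999)

val ssc = new StreamingContext(sparkConf, Duration(1000))
val rawStreams = (1 to numStreams).map(_ =>
  ssc.rawSocketStream[String](host, port, StorageLevel.MEMORY_ONLY_SER)).toArray
// Presumably the truncated line unions the receiver streams into one DStream:
val union = ssc.union(rawStreams)
```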

Re: Spark Streaming : Could not compute split, block not found

2014-10-09 Thread Tian Zhang
message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-Streaming-Could-not-compute-split-block-not-found-tp11186p16084.html Sent from the Apache Spark User List mailing list archive at Nabble.com.

Re: Spark Streaming : Could not compute split, block not found

2014-10-07 Thread Tian Zhang
://apache-spark-user-list.1001560.n3.nabble.com/Spark-Streaming-Could-not-compute-split-block-not-found-tp11186p15899.html

Re: Spark Streaming : Could not compute split, block not found

2014-09-02 Thread Tim Smith
I am seeing similar errors in my job's logs. TD - Are you still waiting for debug logs? If yes, can you please let me know how to generate debug logs? I am using Spark/Yarn and setting "NodeManager" logs to "DEBUG" level doesn't seem to produce anything but INFO logs. Thanks, Tim >Aaah sorry, I
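On the question of producing DEBUG logs under YARN: setting the NodeManager log level only affects YARN's own daemons, not the Spark containers. A hedged sketch (exact option names can vary across Spark versions) of shipping a DEBUG-level log4j.properties to the driver and executors:

```shell
# log4j.properties (set Spark's root logger to DEBUG):
#   log4j.rootCategory=DEBUG, console

# Ship it to the YARN containers and point the JVMs at it.
# MyStreamingApp and myapp.jar are placeholders.
spark-submit \
  --files log4j.properties \
  --conf "spark.driver.extraJavaOptions=-Dlog4j.configuration=log4j.properties" \
  --conf "spark.executor.extraJavaOptions=-Dlog4j.configuration=log4j.properties" \
  --class MyStreamingApp myapp.jar
```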

Re: Spark Streaming : Could not compute split, block not found

2014-08-04 Thread Tathagata Das
essage in context: > http://apache-spark-user-list.1001560.n3.nabble.com/Spark-Streaming-Could-not-compute-split-block-not-found-tp11186p11240.html

Re: Spark Streaming : Could not compute split, block not found

2014-08-01 Thread Kanwaldeep
http://apache-spark-user-list.1001560.n3.nabble.com/Spark-Streaming-Could-not-compute-split-block-not-found-tp11186p11240.html

Re: Spark Streaming : Could not compute split, block not found

2014-08-01 Thread Tathagata Das
001560.n3.nabble.com/Spark-Streaming-Could-not-compute-split-block-not-found-tp11186p11231.html

Re: Spark Streaming : Could not compute split, block not found

2014-08-01 Thread Kanwaldeep
Not at all. Don't have any such code. -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-Streaming-Could-not-compute-split-block-not-found-tp11186p11231.html

Re: Spark Streaming : Could not compute split, block not found

2014-08-01 Thread Tathagata Das
://apache-spark-user-list.1001560.n3.nabble.com/file/n11229/streaming.gz> > View this message in context: > http://apache-spark-user-list.1001560.n3.nabble.com/Spark-Streaming-Could-not-compute-split-block-not-found-tp11186p11229.html

Re: Spark Streaming : Could not compute split, block not found

2014-08-01 Thread Kanwaldeep
RDD using raw kafka data? Log File attached: streaming.gz <http://apache-spark-user-list.1001560.n3.nabble.com/file/n11229/streaming.gz> -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-Streaming-Could-not-compute-split-block-not-found-tp11186

Re: Spark Streaming : Could not compute split, block not found

2014-08-01 Thread Tathagata Das
> View this message in context: > http://apache-spark-user-list.1001560.n3.nabble.com/Spark-Streaming-Could-not-compute-split-block-not-found-tp11186p11209.html

Re: Spark Streaming : Could not compute split, block not found

2014-08-01 Thread Kanwaldeep
We are using Spark 1.0. I'm using DStream operations such as map, filter and reduceByKeyAndWindow, and doing a foreach operation on the DStream. -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-Streaming-Could-not-compute-split-block-not-

Re: Spark Streaming : Could not compute split, block not found

2014-08-01 Thread Tathagata Das
appen > every hour since last night. > > Any suggestions? > > Thanks > Kanwal > > > > -- > View this message in context: > http://apache-spark-user-list.1001560.n3.nabble.com/Spark-Streaming-Could-not-compute-split-block-not-found-tp11186.html

Spark Streaming : Could not compute split, block not found

2014-08-01 Thread Kanwaldeep
Could-not-compute-split-block-not-found-tp11186.html

Re: Could not compute split, block not found

2014-07-01 Thread Bill Jay
Hi Tathagata, Yes. The input stream is from Kafka and my program reads the data, keeps all the data in memory, processes the data, and generates the output. Bill On Mon, Jun 30, 2014 at 11:45 PM, Tathagata Das wrote: > Are you by any chance using only memory in the storage level of the input > s

Re: Could not compute split, block not found

2014-07-01 Thread Bill Jay
Hi Tobias, Your explanation makes a lot of sense. Actually, I tried to use partial data on the same program yesterday. It has been up for around 24 hours and is still running correctly. Thanks! Bill On Mon, Jun 30, 2014 at 5:53 PM, Tobias Pfeiffer wrote: > Bill, > > let's say the processing t

Re: Could not compute split, block not found

2014-06-30 Thread Tathagata Das
Are you by any chance using only memory in the storage level of the input streams? TD On Mon, Jun 30, 2014 at 5:53 PM, Tobias Pfeiffer wrote: > Bill, > > let's say the processing time is t' and the window size t. Spark does not > *require* t' < t. In fact, for *temporary* peaks in your streami
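The question above hints at the usual fix: with a memory-only storage level, Spark can drop received blocks under memory pressure, after which "Could not compute split, block not found" follows when a job tries to read them. A sketch of requesting a level that spills to disk and replicates, assuming a receiver-based Kafka input as in this thread (the ZooKeeper quorum, group, and topic values are placeholders):

```scala
import org.apache.spark.SparkConf
import org.apache.spark.storage.StorageLevel
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka.KafkaUtils

val ssc = new StreamingContext(new SparkConf().setAppName("KafkaInput"), Seconds(60))
// Placeholders for the poster's real Kafka settings.
val zkQuorum = "zk1:2181"
val topics = Map("events" -> 1)

// MEMORY_AND_DISK_SER_2 spills to disk and keeps a replica, so a lost
// executor does not take the only copy of an unprocessed block with it.
val stream = KafkaUtils.createStream(
  ssc, zkQuorum, "consumer-group", topics,
  StorageLevel.MEMORY_AND_DISK_SER_2)
```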

Re: Could not compute split, block not found

2014-06-30 Thread Tobias Pfeiffer
Bill, let's say the processing time is t' and the window size t. Spark does not *require* t' < t. In fact, for *temporary* peaks in your streaming data, I think the way Spark handles it is very nice, in particular since 1) it does not mix up the order in which items arrived in the stream, so items
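One concrete way to check whether the processing time t' stays below the batch interval t (not from this thread; a sketch using Spark's `StreamingListener` API against an assumed existing `StreamingContext` named `ssc`) is to watch the scheduling delay, which grows without bound when the job cannot keep up:

```scala
import org.apache.spark.streaming.scheduler.{StreamingListener, StreamingListenerBatchCompleted}

ssc.addStreamingListener(new StreamingListener {
  override def onBatchCompleted(batch: StreamingListenerBatchCompleted): Unit = {
    val info = batch.batchInfo
    val processing = info.processingDelay.getOrElse(0L) // t' in ms
    val scheduling = info.schedulingDelay.getOrElse(0L) // time spent queued
    println(s"batch=${info.batchTime} processing=${processing}ms scheduling=${scheduling}ms")
    // A steadily growing scheduling delay means t' > t on average.
  }
})
```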

Re: Could not compute split, block not found

2014-06-30 Thread Bill Jay
Tobias, Your suggestion is very helpful. I will definitely investigate it. Just curious. Suppose the batch size is t seconds. In practice, does Spark always require the program to finish processing the data of t seconds within t seconds' processing time? Can Spark begin to consume the new batch b

Re: Could not compute split, block not found

2014-06-29 Thread Bill Jay
Tobias, Thanks for your help. I think in my case, the batch size is 1 minute. However, it takes my program more than 1 minute to process 1 minute's data. I am not sure whether it is because the unprocessed data pile up. Do you have any suggestion on how to check it and solve it? Thanks! Bill On

Re: Could not compute split, block not found

2014-06-29 Thread Tobias Pfeiffer
Bill, were you able to process all information in time, or did maybe some unprocessed data pile up? I think when I saw this once, the reason seemed to be that I had received more data than would fit in memory, while waiting for processing, so old data was deleted. When it was time to process that

Could not compute split, block not found

2014-06-27 Thread Bill Jay
Hi, I am running a Spark Streaming job with 1 minute as the batch size. It ran for around 84 minutes and was killed because of an exception with the following information: *java.lang.Exception: Could not compute split, block input-0-1403893740400 not found* Before it was killed, it was able to cor