Re: spark streaming and the spark shell

2014-11-19 Thread Tian Zhang
I am hitting the same issue: after running for some time, if the Spark Streaming job loses its Kafka connection or the connection times out, it just starts returning empty RDDs. Is there a timeline for when this issue will be fixed so that I can plan accordingly? Thanks. Tian

Re: spark streaming and the spark shell

2014-03-28 Thread Diana Carroll
Thanks, Tathagata. This and your other reply on awaitTermination are very helpful. Diana On Thu, Mar 27, 2014 at 4:40 PM, Tathagata Das wrote: > Very good questions! Responses inline. > > TD > > On Thu, Mar 27, 2014 at 8:02 AM, Diana Carroll > wrote: > > I'm working with spark streaming using s

Re: spark streaming and the spark shell

2014-03-27 Thread Evgeny Shishkin
On 28 Mar 2014, at 01:37, Tathagata Das wrote: > I see! As I said in the other thread, no one reported these issues until now! > A good and not-too-hard fix is to add functionality for limiting the > data rate at which the receivers receive data. I have opened a JIRA. > Yes, actually you

Re: spark streaming and the spark shell

2014-03-27 Thread Tathagata Das
I see! As I said in the other thread, no one reported these issues until now! A good and not-too-hard fix is to add functionality for limiting the data rate at which the receivers receive data. I have opened a JIRA. TD On Thu, Mar 27, 2014 at 3:28 PM, Evgeny Shishkin wrote: > > On 28 Mar 2014
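
For readers of the archive, a rough sketch of what a receiver rate limit can look like in configuration. It assumes a Spark release that supports the `spark.streaming.receiver.maxRate` property, which postdates this message. The app name, master, rate, and batch interval are illustrative values and do not come from the thread. It is written for a standalone application rather than spark-shell, where a SparkContext already exists.

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

// Assumption: a Spark version in which spark.streaming.receiver.maxRate is available.
val conf = new SparkConf()
  .setAppName("RateLimitedStream")                  // hypothetical app name
  .setMaster("local[2]")                            // two threads: one for the receiver, one for processing
  .set("spark.streaming.receiver.maxRate", "1000")  // cap each receiver at 1000 records per second (illustrative)

// Batch interval of 5 seconds, chosen only for illustration.
val ssc = new StreamingContext(conf, Seconds(5))
```

The cap applies per receiver, so the total ingest rate scales with the number of receivers the job creates.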

Re: spark streaming and the spark shell

2014-03-27 Thread Evgeny Shishkin
On 28 Mar 2014, at 01:13, Tathagata Das wrote: > Seems like the configuration of the Spark worker is not right. Either the > worker has not been given enough memory or the allocation of the memory to > the RDD storage needs to be fixed. If configured correctly, the Spark workers > should not

Re: spark streaming and the spark shell

2014-03-27 Thread Tathagata Das
Seems like the configuration of the Spark worker is not right. Either the worker has not been given enough memory or the allocation of the memory to the RDD storage needs to be fixed. If configured correctly, the Spark workers should not get OOMs. On Thu, Mar 27, 2014 at 2:52 PM, Evgeny Shishkin
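
As a hedged illustration of the two knobs mentioned above (memory given to the worker's executors, and the share of it reserved for RDD storage), a minimal sketch using property names from Spark releases of that era. The values and app name are illustrative assumptions, not recommendations from the thread, and later Spark releases changed how storage memory is managed.

```scala
import org.apache.spark.SparkConf

// Assumption: a Spark 0.9/1.x-era deployment where these legacy properties apply.
val conf = new SparkConf()
  .setAppName("StreamingMemoryTuning")         // hypothetical app name
  .set("spark.executor.memory", "4g")          // heap given to each executor (illustrative)
  .set("spark.storage.memoryFraction", "0.6")  // fraction of the heap reserved for cached RDD blocks (illustrative)
```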

Re: spark streaming and the spark shell

2014-03-27 Thread Evgeny Shishkin
> >> 2. I notice that once I start ssc.start(), my stream starts processing and >> continues indefinitely...even if I close the socket on the server end (I'm >> using unix command "nc" to mimic a server as explained in the streaming >> programming guide .) Can I tell my stream to detect if it's

Re: spark streaming and the spark shell

2014-03-27 Thread Tathagata Das
Very good questions! Responses inline. TD On Thu, Mar 27, 2014 at 8:02 AM, Diana Carroll wrote: > I'm working with spark streaming using spark-shell, and hoping folks could > answer a few questions I have. > > I'm doing WordCount on a socket stream: > > import org.apache.spark.streaming.Streamin