I located the issue:
Having the following in the pool object seems to be necessary to make it
serializable:
private transient ConcurrentLinkedQueue<KafkaProducer<String, String>> pool;
However, this means open connections cannot be re-used in subsequent
micro-batches, as transient fields are not serialized. How can we go
a…
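For illustration, a minimal sketch of the transient-plus-lazy-rebuild approach
(class shape, names, and the KafkaProducer<String, String> element type are
assumptions, not the actual repo code). The field stays transient so the pool
object itself serializes, and the queue is rebuilt lazily on first use inside
each executor JVM. On its own this only avoids the serialization error; sharing
one pool per JVM still needs the singleton pattern discussed below.

import java.io.Serializable;
import java.util.Properties;
import java.util.concurrent.ConcurrentLinkedQueue;

import org.apache.kafka.clients.producer.KafkaProducer;

// Sketch: KafkaProducer holds live sockets and cannot be serialized, so the
// queue is transient and rebuilt lazily after deserialization.
public class KafkaProducerPool implements Serializable {

    // Arrives as null on the executor after deserialization.
    private transient ConcurrentLinkedQueue<KafkaProducer<String, String>> pool;

    private final Properties config; // java.util.Properties is Serializable

    public KafkaProducerPool(Properties config) {
        this.config = config;
    }

    // Lazily re-create the queue the first time it is touched in this JVM.
    private synchronized ConcurrentLinkedQueue<KafkaProducer<String, String>> pool() {
        if (pool == null) {
            pool = new ConcurrentLinkedQueue<>();
        }
        return pool;
    }

    // Hand out a free producer, or create one if all are in use.
    public KafkaProducer<String, String> borrow() {
        KafkaProducer<String, String> producer = pool().poll();
        return (producer != null) ? producer : new KafkaProducer<>(config);
    }

    // Return a producer for re-use by a later task or micro-batch.
    public void release(KafkaProducer<String, String> producer) {
        pool().offer(producer);
    }
}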
Ryan,
Apologies for coming back so late. I created a GitHub repo to resolve
this problem. On trying your solution of making the pool a singleton,
I get a null pointer exception in the worker.
Do you have any other suggestions, or a simpler mechanism for handling this?
I have put all the current…
Just to be clear, the pool object creation happens in the driver code,
not in any anonymous function that would be executed on the executor.
On Tue, Jan 31, 2017 at 10:21 PM Nipun Arora wrote:
> Thanks for the suggestion Ryan, I will convert it to a singleton and see if
> it solves the problem…
The KafkaProducerPool instance is created in the driver, right? What I
was saying is that when a Spark job runs, it will serialize KafkaProducerPool
and create a new instance on the executor side.
You can use the singleton pattern to make sure one JVM process has only one
KafkaProducerPool instance.
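A minimal sketch of that suggestion (holder class and method names are
hypothetical): a static field ensures each executor JVM lazily creates at
most one KafkaProducerPool, instead of deserializing a fresh, empty pool
with every task closure.

import java.util.Properties;

// Double-checked locking on a volatile static field: the standard
// thread-safe lazy singleton idiom in Java.
public final class KafkaProducerPoolHolder {

    private static volatile KafkaProducerPool instance;

    private KafkaProducerPoolHolder() {}

    public static KafkaProducerPool get(Properties config) {
        if (instance == null) {                           // fast path, no lock
            synchronized (KafkaProducerPoolHolder.class) {
                if (instance == null) {
                    instance = new KafkaProducerPool(config);
                }
            }
        }
        return instance;
    }
}

Note that get(...) must be called from code that runs on the executors (e.g.
inside foreachPartition). If the singleton is initialized only in the driver,
the static field is never set in the executor JVMs, which is one plausible
cause of the null pointer exception reported in the worker above.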
It's a producer pool; the borrow method takes an existing Kafka producer
object if one is free, or creates a new one if all are in use.
Shouldn't we re-use Kafka producer objects for writing to Kafka?
@ryan - can you suggest a good solution for writing a DStream to Kafka that
can be used in production?
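One production-style pattern, sketched under the assumption of the pool and
holder classes above (all names are illustrative): the driver only wires up
foreachRDD per micro-batch, and every producer-related step happens on the
executors via a writer like the hypothetical JavaRDDKafkaWriter sketched
further down this thread.

import java.util.Properties;

import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.function.VoidFunction;
import org.apache.spark.streaming.api.java.JavaDStream;

public final class DStreamKafkaWriter {
    public static void write(JavaDStream<String> stream,
                             final Properties config,   // Serializable config
                             final String topic) {
        // The explicit VoidFunction cast avoids the lambda overload
        // ambiguity of foreachRDD on Spark 1.6, where both Function<R, Void>
        // and VoidFunction<R> overloads exist.
        stream.foreachRDD((VoidFunction<JavaRDD<String>>) rdd ->
                JavaRDDKafkaWriter.write(rdd, config, topic));
    }
}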
Looks like you create KafkaProducerPool in the driver. So when the task is
running in the executor, it will always see a new, empty KafkaProducerPool
and create KafkaProducers. But nobody closes these KafkaProducers.
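One way to address the "nobody closes these KafkaProducers" part, assuming an
executor-side pool as sketched above (the helper name is hypothetical): have
the pool register a JVM shutdown hook that drains the queue and closes every
producer when the executor exits.

import java.util.concurrent.ConcurrentLinkedQueue;

import org.apache.kafka.clients.producer.KafkaProducer;

public final class PoolCleanup {
    // Register once, when the executor-side pool is first created.
    public static void closeOnShutdown(
            final ConcurrentLinkedQueue<KafkaProducer<String, String>> pool) {
        Runtime.getRuntime().addShutdownHook(new Thread(() -> {
            KafkaProducer<String, String> producer;
            while ((producer = pool.poll()) != null) {
                producer.close(); // frees the sockets/file descriptors
            }
        }));
    }
}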
On Tue, Jan 31, 2017 at 3:02 PM, Nipun Arora wrote:
>
> Sorry for not writing the patch number, it's Spark 1.6.1.
Sorry for not writing the patch number, it's Spark 1.6.1.
The relevant code is here inline.
Please have a look and let me know if there is a resource leak.
Please also let me know if you need any more details.
Thanks
Nipun
The JavaRDDKafkaWriter code is here inline:
import org.apache.spark.api…
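The inline code is truncated by the archive, so here is a hypothetical
reconstruction of a leak-free JavaRDDKafkaWriter, assuming the pool and
holder classes sketched earlier in this thread (this is illustrative, not
Nipun's actual code). The key point is that the pool is obtained inside
foreachPartition, i.e. in the executor JVM, never in the driver.

import java.util.Iterator;
import java.util.Properties;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.spark.api.java.JavaRDD;

public final class JavaRDDKafkaWriter {
    public static void write(JavaRDD<String> rdd,
                             final Properties config,
                             final String topic) {
        rdd.foreachPartition((Iterator<String> records) -> {
            // Executor side: get (or lazily create) the JVM-wide pool.
            KafkaProducerPool pool = KafkaProducerPoolHolder.get(config);
            KafkaProducer<String, String> producer = pool.borrow();
            try {
                while (records.hasNext()) {
                    producer.send(new ProducerRecord<>(topic, records.next()));
                }
            } finally {
                pool.release(producer); // return for re-use, do not close
            }
        });
    }
}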
Please also include the patch version, such as 1.6.0 or 1.6.1. Could you also
post the JavaRDDKafkaWriter code? It's also possible that it leaks
resources.
On Tue, Jan 31, 2017 at 2:12 PM, Nipun Arora wrote:
> It is Spark 1.6
>
> Thanks
> Nipun
>
> On Tue, Jan 31, 2017 at 1:45 PM Shixiong(Ryan) Zhu wrote:
It is Spark 1.6
Thanks
Nipun
On Tue, Jan 31, 2017 at 1:45 PM Shixiong(Ryan) Zhu wrote:
> Could you provide your Spark version please?
>
> On Tue, Jan 31, 2017 at 10:37 AM, Nipun Arora wrote:
>
> Hi,
>
> I get a resource leak, where the number of file descriptors in Spark
> Streaming keeps increasing…
Could you provide your Spark version please?
On Tue, Jan 31, 2017 at 10:37 AM, Nipun Arora wrote:
> Hi,
>
> I get a resource leak, where the number of file descriptors in Spark
> Streaming keeps increasing. We eventually end up with a "too many open
> files" error through an exception caused in:
Hi,
I get a resource leak, where the number of file descriptors in Spark
Streaming keeps increasing. We eventually end up with a "too many open
files" error through an exception caused in:
JavaRDDKafkaWriter, which is writing a Spark JavaDStream.
The exception is attached inline. Any help will be appreciated.