Re: Kafka Spark structured streaming latency benchmark.

2017-01-02 Thread Prashant Sharma
This issue was fixed in https://issues.apache.org/jira/browse/SPARK-18991. --Prashant On Tue, Dec 20, 2016 at 6:16 PM, Prashant Sharma wrote: > Hi Shixiong, > > Thanks for taking a look. I am running tests to see whether making > ContextCleaner run more frequently and/or making it non-blocking wil

Re: Kafka Spark structured streaming latency benchmark.

2016-12-20 Thread Prashant Sharma
Hi Shixiong, Thanks for taking a look. I am running tests to see whether making ContextCleaner run more frequently and/or making it non-blocking will help. --Prashant On Tue, Dec 20, 2016 at 4:05 AM, Shixiong(Ryan) Zhu wrote: > Hey Prashant. Thanks for your code. I did some investigation and it
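
For readers following along, the two knobs mentioned in the message above (how often the cleaner gets work, and whether its cleanup calls block) map onto documented Spark configuration properties. The following is only a sketch of how one might set them for an experiment, assuming spark-core is on the classpath. The application name and the "1min" interval are hypothetical values chosen for illustration, and nothing in this thread confirms that these settings resolve the latency problem.

```scala
import org.apache.spark.SparkConf

object CleanerTuningSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("cleaner-tuning-sketch") // hypothetical name, for illustration only
      // Let the cleaning thread issue cleanup requests without waiting for each
      // one to finish (the documented default is "true", i.e. blocking).
      .set("spark.cleaner.referenceTracking.blocking", "false")
      // Trigger a driver-side GC more often so that unreachable RDDs, shuffles
      // and broadcasts are handed to the cleaner sooner (documented default: 30min).
      // "1min" is an arbitrary value for experimentation, not a recommendation.
      .set("spark.cleaner.periodicGC.interval", "1min")

    // Print the effective settings so they can be checked before starting a job.
    conf.getAll.sorted.foreach { case (key, value) => println(s"$key=$value") }
  }
}
```

The same properties can be passed on the command line with `--conf key=value` on `spark-submit` instead of being set in code.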

Re: Kafka Spark structured streaming latency benchmark.

2016-12-20 Thread Jacek Laskowski
Hi, (What timing. I just reviewed CC yesterday!) In ALS they trigger the cleanup of shuffle map stages themselves, so if I understood the issue correctly, the streaming part could do it too. Jacek On 19 Dec 2016 11:35 p.m., "Shixiong(Ryan) Zhu" wrote: > Hey Prashant. Thanks for your code. I did some investig

Re: Kafka Spark structured streaming latency benchmark.

2016-12-19 Thread Shixiong(Ryan) Zhu
Hey Prashant. Thanks for your code. I did some investigation, and it turned out that ContextCleaner is too slow and its "referenceQueue" keeps growing. My hunch is that cleaning broadcasts is very slow since it's a blocking call. On Mon, Dec 19, 2016 at 12:50 PM, Shixiong(Ryan) Zhu < shixi...@databricks

Re: Kafka Spark structured streaming latency benchmark.

2016-12-19 Thread Shixiong(Ryan) Zhu
Hey, Prashant. Could you track the GC roots of the byte arrays in the heap? On Sat, Dec 17, 2016 at 10:04 PM, Prashant Sharma wrote: > Furthermore, I ran the same thing with 26 GB of memory, which would > mean 1.3 GB of memory per thread. My jmap >

Re: Kafka Spark structured streaming latency benchmark.

2016-12-17 Thread Prashant Sharma
Furthermore, I ran the same thing with 26 GB of memory, which would mean 1.3 GB of memory per thread. My jmap results and jstat results