Hi Mich, the batch interval is 10 seconds, and I don't use a sliding window. The typical message count per batch is ~100k.
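
For reference, here is a rough sketch of what those two figures imply for the load on the job. It only does the arithmetic on the numbers quoted above; `keeps_up` is a hypothetical helper added for illustration (it is not part of Spark), and it encodes the usual rule of thumb that a streaming job stays stable only while each batch finishes within one batch interval.

```python
# Figures quoted in the message above.
batch_interval_s = 10          # batch interval, in seconds
messages_per_batch = 100_000   # typical message count per batch (approximate)

# Average ingest rate the job has to sustain.
messages_per_second = messages_per_batch / batch_interval_s


def keeps_up(processing_time_s, interval_s=batch_interval_s):
    """Hypothetical helper: True while a batch finishes within one interval."""
    return processing_time_s <= interval_s


print(f"{messages_per_second:.0f} messages/second")  # prints "10000 messages/second"
```

So a doubling of batch processing time only turns into growing scheduling delay once the processing time exceeds the 10-second interval.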
--
John Simon

On Fri, Jun 10, 2016 at 10:31 AM, Mich Talebzadeh <mich.talebza...@gmail.com> wrote:

> Hi John,
>
> I did not notice anything unusual in your env variables.
>
> However, what are the batch interval, the window length, and the sliding
> window interval?
>
> Also, how many messages are sent by Kafka in a typical batch interval?
>
> HTH
>
> Dr Mich Talebzadeh
>
> LinkedIn
> https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>
> http://talebzadehmich.wordpress.com
>
> On 10 June 2016 at 18:21, john.simon <john.si...@tapjoy.com> wrote:
>
>> Hi all,
>>
>> I'm running Spark Streaming with Kafka Direct Stream, but after
>> running for a couple of days, the batch processing time almost doubles.
>> I didn't find any slowdown in the JVM GC logs, but I did find that the
>> Spark broadcast variable read time keeps increasing.
>> Initially it takes less than 10 ms, but after 3 days it takes more than
>> 60 ms. It's really puzzling, since I don't use broadcast variables at
>> all.
>>
>> My application needs to run 24/7, so I hope there's something I'm
>> missing that would correct this behavior.
>>
>> FYI, we're running on AWS EMR with Spark version 1.6.1, in YARN client
>> mode.
>> I've attached the Spark application environment settings file.
>>
>> --
>> John Simon
>>
>> *environment.txt* (7K)
>> <http://apache-spark-user-list.1001560.n3.nabble.com/attachment/27138/0/environment.txt>
>>
>> ------------------------------
>> View this message in context: Long Running Spark Streaming getting slower
>> <http://apache-spark-user-list.1001560.n3.nabble.com/Long-Running-Spark-Streaming-getting-slower-tp27138.html>