Long running Spark Streaming Job increasing executing time per batch

2014-06-19 Thread Skogberg, Fredrik
Hi, I’ve been trying for the last couple of weeks to create a Spark Streaming job that joins two streams on a common id, and then have another job run queries on the output of the joined streams. I’m using Spark 0.9.0 and the Ooyala job server to share the context between jobs. The flow is

Re: Long running Spark Streaming Job increasing executing time per batch

2014-06-19 Thread Skogberg, Fredrik
Hi TD,

> That’s quite odd. Yes, with checkpointing the lineage does not increase. Can you tell which stage in the processing of each batch is causing the increase in the processing time?

I haven’t been able to determine exactly which stage is causing the increase in processing time. Any poi