All,

Has anyone hit a memory leak with Spark Streaming and Spark SQL in Spark 1.5.1? I can see memory increasing steadily while running this simple sample:
val sc = new SparkContext(conf)
val sqlContext = new HiveContext(sc)
import sqlContext.implicits._

val ssc = new StreamingContext(sc, Seconds(1))
val s1 = ssc.socketTextStream("localhost", 9999)
  .map(x => (x, 1))
  .reduceByKey((x: Int, y: Int) => x + y)
s1.print
s1.foreachRDD(rdd => {
  rdd.foreach(_ => Unit)
  sqlContext.createDataFrame(rdd).registerTempTable("A")
  sqlContext.sql("""select * from A""").show(1)
})

After dumping the Java heap, I can see about 22K entries in SQLListener._stageIdToStageMetrics after 2 hours of running (the other maps in this SQLListener have only about 1K entries each). Is this a leak in SQLListener?

Thanks!
Terry
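To illustrate what I suspect is happening, here is a minimal, hypothetical sketch of the pattern (not Spark's actual SQLListener code): a listener records per-stage metrics on every stage submission but only prunes entries tied to a finished execution, so stages that are never associated with a cleaned-up execution accumulate without bound. The class and method names below are made up for illustration.

```scala
import scala.collection.mutable

// Hypothetical tracker mimicking the suspected SQLListener behavior:
// entries are added per stage, but removed only for stages explicitly
// tied to a completed execution.
class StageMetricsTracker {
  val stageIdToMetrics = mutable.Map[Int, String]()

  def onStageSubmitted(stageId: Int): Unit =
    stageIdToMetrics(stageId) = s"metrics-for-$stageId"

  // Cleanup runs only for stages belonging to a finished execution.
  def onExecutionEnd(stageIds: Seq[Int]): Unit =
    stageIds.foreach(stageIdToMetrics.remove)
}

object LeakDemo {
  def run(batches: Int): StageMetricsTracker = {
    val tracker = new StageMetricsTracker
    // Simulate a streaming job: two stages per batch, but only the
    // first is associated with a tracked execution and cleaned up.
    for (batch <- 0 until batches) {
      val s1 = batch * 2
      val s2 = batch * 2 + 1
      tracker.onStageSubmitted(s1)
      tracker.onStageSubmitted(s2)
      tracker.onExecutionEnd(Seq(s1)) // s2 is never removed: the leak
    }
    tracker
  }
}
```

Under this pattern, the map grows by one entry per batch, which would match a map that keeps growing for as long as the streaming job runs.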