Hi, I'm running a Spark Streaming job that pulls data from Kafka (using the direct, receiver-less approach) and pushes it into Elasticsearch. The job runs fine, but I was surprised when I opened JConsole to monitor it: heap usage climbs steadily until the GC triggers, then starts climbing again, over and over.
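For reference, the job is structured roughly like the following (a minimal sketch; I'm assuming the spark-streaming-kafka 0.8 direct API and the elasticsearch-hadoop connector here, and the broker address, topic name, and index/type are placeholders):

```scala
import kafka.serializer.StringDecoder
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka.KafkaUtils
import org.elasticsearch.spark.rdd.EsSpark

object KafkaToEs {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("kafka-to-es")
      .set("es.nodes", "localhost:9200") // placeholder Elasticsearch node

    val ssc = new StreamingContext(conf, Seconds(10))

    // Direct (receiver-less) stream: Spark tracks Kafka offsets itself,
    // mapping one Kafka partition to one RDD partition per batch
    val kafkaParams = Map("metadata.broker.list" -> "localhost:9092") // placeholder broker
    val stream = KafkaUtils.createDirectStream[String, String, StringDecoder, StringDecoder](
      ssc, kafkaParams, Set("events")) // placeholder topic

    // Index the message values (assumed to be JSON) into Elasticsearch
    stream.foreachRDD { rdd =>
      EsSpark.saveJsonToEs(rdd.map(_._2), "myindex/mytype") // placeholder index/type
    }

    ssc.start()
    ssc.awaitTermination()
  }
}
```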
I used a profiler to understand what is happening in the heap. All I found was a `byte[]` allocation that keeps growing, with no further detail. Is there an explanation for this? Is this behaviour inherent to Spark Streaming jobs? Thanks for your help.