What I've noticed is that heap memory ends up growing linearly with time 
indefinitely (past 24 hours) until it hits the roof of the allocated heap for 
the task manager, which leads me to believe I am leaking somewhere. All of my 
windows have an allowed lateness of 5 minutes, and my watermarks are pulled 
from time embedded in the records using 
BoundedOutOfOrdernessTimestampExtractors. My TumblingEventTimeWindows and 
SlidingEventTimeWindow all use AggregateFunctions, and my intervalJoins use 
ProcessJoinFunctions.

I expect this app to use a significant amount of memory at scale due to the 288 
5-minute intervals in 24 hours, and records being put in all 288 window states, 
and as the application runs for 24 hours memory would increase as all 
288(*unique key) windows build with incoming records, but then after 24 hours 
the memory should stop growing, or at least grow at a different rate?

Also of note, we are using a FsStateBackend configuration, and plan to move to 
RocksDBStateBackend, but from what I can tell, this would only reduce memory 
and delay hitting the heap memory capacity, not stall it forever?

Thanks
Chris


On 5/18/20, 7:29 AM, "Aljoscha Krettek" <aljos...@apache.org> wrote:

    On 15.05.20 15:17, Slotterback, Chris wrote:
    > My understanding is that while all these windows build their memory 
state, I can expect heap memory to grow for the 24 hour length of the 
SlidingEventTimeWindow, and then start to flatten as the t-24hr window frames 
expire and release back to the JVM. What is actually happening is when a 
constant data source feeds the stream, the heap memory profile grows linearly 
past the 24 hour mark. Could this be a result of a misunderstanding of how the 
window’s memory states are kept, or is my assumption correct, and it is more 
likely I have a leak somewhere?

    Will memory keep growing indefinitely? That would indicate a bug? What
    sort of lateness/watermark settings do you have? What window function do
    you use? ProcessWindowFunction, or sth that aggregates?

    Side note: with sliding windows of 24h/5min you will have a "write
    amplification" of 24*60/5=288, each record will be in 288 windows, which
    will each be kept in separate state?

    Best,
    Aljoscha


Reply via email to