Re: Kafka streams - runs out of memory

2018-08-26 Thread Guozhang Wang
Hi Ashok, Definitely, please feel free to edit the FAQ page: https://cwiki.apache.org/confluence/display/KAFKA/FAQ It's a wiki, so anyone can contribute to it :) Guozhang On Sat, Aug 25, 2018 at 7:30 PM, AshokKumar J wrote: > Hi Guozhang, > > Thanks for the input. Yes, confirmed that enabl

Re: Kafka streams - runs out of memory

2018-08-25 Thread AshokKumar J
Hi Guozhang, Thanks for the input. Yes, confirmed that enabling and overriding the RocksDB config setter class (with default parameters) alongside the Kafka Streams cache leads to unbounded memory usage. After removing the override, the application memory usage stays consistently within 24 GB. Can

Re: Kafka streams - runs out of memory

2018-08-22 Thread Guozhang Wang
Hi Ashok, Your implementation looks okay to me: I do not know how "handleTasks" is implemented, just that if you are iterating over the store, you'd need to close the iterator after using it. One thing I suspect is that your memory usage combining the streams cache plus RocksDB's own buffering may

Re: Kafka streams - runs out of memory

2018-08-19 Thread AshokKumar J
Hi Guozhang, Please find below. I have tried with the latest 2.0.0 libraries and no improvement observed. Kafka version - 1.0.1 Total Memory allocated - 24 GB Max Stream Cache - 8GB --- Processor class code: private KeyValueStore hourlyStore = null; // Loca

Re: Kafka streams - runs out of memory

2018-08-17 Thread Guozhang Wang
Hello AshokKumar, Which version of Kafka are you using? And could you share your code snippet so we can help investigate the issue (you can omit anything that involves your business logic; just a sketch of the code is fine). Guozhang On Fri, Aug 17, 2018 at 8:52 AM, AshokKumar J wr

Re: Kafka streams - runs out of memory

2018-08-17 Thread AshokKumar J
Hi, Any thoughts on the issue below? I think the behavior should be reproducible if we perform both a put and a get against the store (cache enabled) when processing each record from the topic, with a processing volume of 2-3 million records every 15 minutes, each JSON on average being 400 to 500 KB appr

Re: Kafka streams - runs out of memory

2018-08-15 Thread AshokKumar J
Disabling the stream cache prevents the unbounded memory usage; however, the throughput is low (with the RocksDB cache enabled). Can you please advise why the references to the cache objects are not released in time (for GC cleanup) and keep growing? On Tue, Aug 14, 2018 at 11:17 PM, AshokKumar J w

Kafka streams - runs out of memory

2018-08-14 Thread AshokKumar J
Hi, we have a stream application that uses the low-level API. We persist the data into the key-value state store. For each record that we retrieve from the topic, we perform a lookup against the store to see if it exists; if it does, we update the existing entry, otherwise we simply add the new record.