Hi Stefan,

Thank you for the clarification. Yes, with RocksDB I don't see any Full GC happening. I am using Flink 1.2.0 and have set the state backend to rocksdb in the flink-conf.yaml file. Does this enable asynchronous checkpointing by default, or do I have to specify it at the job level?
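For reference, a minimal sketch of the cluster-wide setting described above. It assumes the configuration key names used by Flink 1.2; the checkpoint URI is a made-up placeholder, not a value from this setup.

```yaml
# flink-conf.yaml (sketch; key names assumed from the Flink 1.2 configuration)

# Use RocksDB as the default state backend for every job on this cluster.
state.backend: rocksdb

# Directory where checkpoint snapshots are written.
# Placeholder URI for illustration only.
state.backend.fs.checkpointdir: hdfs:///flink/checkpoints
```

This fragment only selects the backend. It does not by itself show whether snapshots are taken asynchronously, which is the open question here.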

Regards,
Vinay Patil

On Fri, Feb 10, 2017 at 4:16 PM, Stefan Richter [via Apache Flink User Mailing List archive.] <ml-node+s2336050n11565...@n4.nabble.com> wrote:

> Hi,
>
> FSStateBackend operates completely on-heap, and only the snapshots for
> checkpoints go to the file system. This is why the backend is typically
> faster for small states, but it can become problematic for larger states.
> If your state exceeds a certain size, you should strongly consider using
> RocksDB as the backend. In particular, RocksDB also offers asynchronous
> snapshots, which are very valuable for keeping stream processing running
> with large state. RocksDB works on native memory/disk, so there is no GC
> to observe. For cases in which your state fits in memory but GC is a
> problem, you could try the G1 garbage collector, which offers better
> performance for the FSStateBackend than the default.
>
> Best,
> Stefan
>
> On 10.02.2017 at 11:16, Vinay Patil <[hidden email]> wrote:
>
> Hi,
>
> I am running a performance test for my pipeline with FSStateBackend, and I
> have observed frequent Full GCs after processing 20M records.
>
> When I did a memory analysis using MAT, it showed that many objects
> maintained by Flink state are live.
>
> Flink keeps the state in memory even after checkpointing. When does this
> state get removed / garbage collected? (I am using a window operator in
> which the DTO comes as input.)
>
> Also, why does Flink keep the state in memory after checkpointing?
>
> P.S. Using RocksDB is not causing Full GC at all.
>
> Regards,
> Vinay Patil

--
View this message in context: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Frequent-Full-GC-s-in-case-of-FSStateBackend-tp11564p11568.html
Sent from the Apache Flink User Mailing List archive at Nabble.com.