Hi Ashwin,

> What is the exact difference between checkpoint and state backend?

I can answer the first question you asked.

A checkpoint is the mechanism that makes your program fault tolerant. Flink implements checkpoints using distributed snapshots of the job's state. That raises the question of where the state of your program is stored, and this is where the state backend comes in: it determines whether your state is kept in memory, on a filesystem, or in RocksDB. The default is the memory state backend.

Please see [1] and [2] for more details.

Cheers,
Minglei

[1] https://ci.apache.org/projects/flink/flink-docs-release-1.5/internals/stream_checkpointing.html#introduction
[2] https://ci.apache.org/projects/flink/flink-docs-release-1.5/ops/state/state_backends.html

> On June 25, 2018, at 3:05 AM, Ashwin Sinha <ashwin.si...@go-mmt.com> wrote:
>
> Hi,
>
> We are using flink1.3.2 and trying to explore rocksdb state backend and
> checkpointing. Data source is Kafka and checkpointing enabled in Flink.
> We have few doubts regarding the same:
>
> What is the exact difference between checkpoint and state backend?
>
> Is the data stored in rocksdb checkpoints incremental(it keeps all past data
> also in newer file)? New checkpoint is created after defined interval and
> does it contains the previous checkpoint's data? Our use case demands all the
> checkpoint data to be in a single db, but when we manually restart the job
> it's id changes and new directory gets created(new metadata file in case of
> savepoints).
>
> What data does rocksdb stores inside in case of checkpoints? We are
> interested in knowing whether it stores actual aggregations or it stores the
> offsets metadata for an aggregation window?
>
> If we run aggregations on past data, then will it take help of state backend
> to not run aggregations again and give results by querying the state backend,
> saving the processing time?
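
P.S. To make the checkpoint versus state backend distinction concrete, here is a toy sketch in plain Java. It does not use Flink's API: ToyStateBackend, ToyMemoryBackend and ToyCounter are hypothetical names I made up for illustration, and real Flink snapshots are distributed and written to durable storage rather than held in a field. The only point is the split of roles: the backend decides where the working state lives, while a checkpoint is a point-in-time copy of that state used for recovery.

```java
import java.util.HashMap;
import java.util.Map;

public class CheckpointVsStateBackend {

    /** Hypothetical stand-in for a state backend: decides where working state lives. */
    interface ToyStateBackend {
        void put(String key, long value);
        long get(String key);
        Map<String, Long> snapshot();              // point-in-time copy of the state
        void restore(Map<String, Long> snapshot);  // replace the state with a copy
    }

    /** Keeps state on the heap, loosely analogous to a memory state backend. */
    static class ToyMemoryBackend implements ToyStateBackend {
        private final Map<String, Long> state = new HashMap<>();

        @Override public void put(String key, long value) { state.put(key, value); }
        @Override public long get(String key) { return state.getOrDefault(key, 0L); }
        @Override public Map<String, Long> snapshot() { return new HashMap<>(state); }
        @Override public void restore(Map<String, Long> snapshot) {
            state.clear();
            state.putAll(snapshot);
        }
    }

    /** A per-key counter whose state lives in whichever backend it is given. */
    static class ToyCounter {
        private final ToyStateBackend backend;
        private Map<String, Long> lastCheckpoint = new HashMap<>();

        ToyCounter(ToyStateBackend backend) { this.backend = backend; }

        void process(String key) { backend.put(key, backend.get(key) + 1); }
        void checkpoint() { lastCheckpoint = backend.snapshot(); }  // take a snapshot
        void recover() { backend.restore(lastCheckpoint); }         // roll back to it
        long count(String key) { return backend.get(key); }
    }

    public static void main(String[] args) {
        ToyCounter counter = new ToyCounter(new ToyMemoryBackend());
        counter.process("a");
        counter.process("a");
        counter.process("b");
        counter.checkpoint();   // snapshot holds a=2, b=1
        counter.process("a");   // this update is not covered by the snapshot
        counter.recover();      // simulate a failure followed by a restore
        System.out.println("a=" + counter.count("a") + ", b=" + counter.count("b"));
    }
}
```

Swapping ToyMemoryBackend for another implementation of the interface would change where the state is kept without touching the checkpoint and recover logic, which is the same separation of concerns described above.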