Dear Chen Qin: I am liuxinchun, and email is liuxinc...@huawei.com ( the email address in the "Copy To" is wrong). I have leave a message in FLINK-4266 using name SyinChwun Leo. We meet the similar problem in the applications. I hope we can develop this feature together. The following is my opinion:
(1) The organization form of current sliding window(SlidingProcessingTimeWindow and SlidingEventTimeWindow) have a drawback: When using ListState, a element may be kept in multiple windows (size / slide). It's time consuming and waste storage when checkpointing. Opinion: I think this is a optimal point. Elements can be organized according to the key and split(maybe also can called as pane). When triggering cleanup, only the oldest split(pane) can be cleanup. (2) Incremental backup strategy. In original idea, we plan to only backup the new coming element, and that means a whole window may span several checkpoints, and we have develop this idea in our private SPS. But in Flink, the window may not keep raw data(for example, ReducingState and FoldingState). The idea of Chen Qin maybe a candidate strategy. We can keep in touch and exchange our respective strategy. -----邮件原件----- 发件人: Chen Qin [mailto:c...@uber.com] 发送时间: 2017年1月17日 13:30 收件人: dev@flink.apache.org 抄送: iuxinc...@huawei.com; Aljoscha Krettek; shijinkui 主题: States split over to external storage Hi there, I would like to discuss split over local states to external storage. The use case is NOT another external state backend like HDFS, rather just to expand beyond what local disk/ memory can hold when large key space exceeds what task managers could handle. Realizing FLINK-4266 might be hard to tacking all-in-one, I would live give a shot to split-over first. An intuitive approach would be treat HeapStatebackend as LRU cache and split over to external key/value storage when threshold triggered. To make this happen, we need minor refactor to runtime and adding set/get logic. One nice thing of keeping HDFS to store snapshots would be avoid versioning conflicts. Once checkpoint restore happens, partial write data will be overwritten with previously checkpointed value. Comments? -- -Chen Qin