Stephan Ewen created FLINK-2808:
-----------------------------------

             Summary: Rework / Extend the StatehandleProvider
                 Key: FLINK-2808
                 URL: https://issues.apache.org/jira/browse/FLINK-2808
             Project: Flink
          Issue Type: Improvement
          Components: Streaming
    Affects Versions: 0.10
            Reporter: Stephan Ewen
            Assignee: Stephan Ewen
             Fix For: 0.10

I would like to make some changes (mostly additions) to the {{StateHandleProvider}}, ideally for the upcoming release, since it is effectively part of the public API.

The rationale is to handle, in a clean and extensible way, the creation of key/value state backed by various implementations (FS, distributed KV store, local KV store with FS backup, ...) and various ways of checkpointing (full dump, append, incremental keys, ...).

Concretely, the changes would be:

  1. There should be a default {{StateHandleProvider}} set on the execution environment. Functions can later specify a different {{StateHandleProvider}} (plus, optionally, a {{Checkpointer}}) when grabbing the {{StreamOperatorState}} from the runtime context.

  2. The {{StreamOperatorState}} is created from the {{StateHandleProvider}}. That way, a key/value store state backend can create a {{StreamOperatorState}} that updates data in the KV store directly on every access, if that is desired (and filters accesses by timestamp so that only committed data is visible).

  3. The {{StateHandleProvider}} should have methods to get an output stream that writes directly to the state checkpoint (and returns a {{StateHandle}} upon closing). That way we can convert and dump large state into the checkpoint without first creating a full copy in memory.

Lastly, I would like to change some names:

  - {{StateHandleProvider}} to either {{StateBackend}}, {{StateStore}}, or {{StateProvider}} (simpler name).
  - {{StreamOperatorState}} to either {{State}} or {{KVState}}.
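
To make item 3 more concrete, here is a rough sketch of what a "stream that returns a handle on close" could look like. This is only an illustration, not the actual Flink interface: the names {{CheckpointOutputStream}}, {{closeAndGetHandle()}}, {{createCheckpointOutputStream()}}, {{openInputStream()}} and {{MemoryStateBackend}} are made up for the example, and the in-memory backend merely stands in for a file system or KV store.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.LinkedHashMap;
import java.util.Map;

public class StateBackendSketch {

    /** Hypothetical handle: refers to checkpointed bytes and can reopen them. */
    interface StateHandle {
        InputStream openInputStream() throws IOException;
    }

    /** Hypothetical stream that hands back a StateHandle when it is closed. */
    abstract static class CheckpointOutputStream extends OutputStream {
        abstract StateHandle closeAndGetHandle() throws IOException;
    }

    /** Hypothetical backend interface (one of the proposed new names). */
    interface StateBackend {
        CheckpointOutputStream createCheckpointOutputStream() throws IOException;
    }

    /**
     * Assumption: a toy backend that keeps the checkpoint bytes on the heap.
     * A file system backend would write to a file here instead.
     */
    static class MemoryStateBackend implements StateBackend {
        @Override
        public CheckpointOutputStream createCheckpointOutputStream() {
            final ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            return new CheckpointOutputStream() {
                @Override
                public void write(int b) {
                    buffer.write(b);
                }

                @Override
                public void write(byte[] b, int off, int len) {
                    buffer.write(b, off, len);
                }

                @Override
                StateHandle closeAndGetHandle() {
                    final byte[] data = buffer.toByteArray();
                    return () -> new ByteArrayInputStream(data);
                }
            };
        }
    }

    /** Writes the entries one at a time to the checkpoint stream. */
    static StateHandle snapshot(Map<String, Long> state, StateBackend backend) throws IOException {
        CheckpointOutputStream out = backend.createCheckpointOutputStream();
        DataOutputStream data = new DataOutputStream(out);
        data.writeInt(state.size());
        for (Map.Entry<String, Long> entry : state.entrySet()) {
            data.writeUTF(entry.getKey());
            data.writeLong(entry.getValue());
        }
        data.flush();
        return out.closeAndGetHandle();
    }

    /** Reads the entries back from the handle, in the order they were written. */
    static Map<String, Long> restore(StateHandle handle) throws IOException {
        Map<String, Long> state = new LinkedHashMap<>();
        try (DataInputStream in = new DataInputStream(handle.openInputStream())) {
            int size = in.readInt();
            for (int i = 0; i < size; i++) {
                String key = in.readUTF();
                long value = in.readLong();
                state.put(key, value);
            }
        }
        return state;
    }

    public static void main(String[] args) throws IOException {
        Map<String, Long> counts = new LinkedHashMap<>();
        counts.put("clicks", 17L);
        counts.put("views", 230L);

        StateHandle handle = snapshot(counts, new MemoryStateBackend());
        System.out.println("restored equals original: " + restore(handle).equals(counts));
    }
}
```

The point of the shape is that {{snapshot}} never builds a serialized copy of the whole state itself. It pushes entries into whatever stream the backend provides, so a backend that writes to a file or remote store could keep memory use bounded.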