Stephan Ewen created FLINK-2808:
-----------------------------------
Summary: Rework / Extend the StatehandleProvider
Key: FLINK-2808
URL: https://issues.apache.org/jira/browse/FLINK-2808
Project: Flink
Issue Type: Improvement
Components: Streaming
Affects Versions: 0.10
Reporter: Stephan Ewen
Assignee: Stephan Ewen
Fix For: 0.10
I would like to make some changes (mostly additions) to the
{{StateHandleProvider}}. Ideally for the upcoming release, as it is somewhat
part of the public API.
The rationale behind this is to handle, in a clean and extensible way, the
creation of key/value state backed by various implementations (FS, distributed
KV store, local KV store with FS backup, ...) and various checkpointing
strategies (full dump, append, incremental keys, ...).
The changes would concretely be:
1. There should be a default {{StateHandleProvider}} set on the execution
environment. Functions can later specify the {{StateHandleProvider}} when
grabbing the {{StreamOperatorState}} from the runtime context (plus optionally
a {{Checkpointer}})
2. The {{StreamOperatorState}} is created from the {{StateHandleProvider}}.
That way, a KeyValueStore state backend can create a {{StreamOperatorState}}
that directly updates data in the KV store on every access, if that is desired
(and filter accesses by timestamps to only show committed data)
3. The {{StateHandleProvider}} should have methods to get an output stream that
writes to the state checkpoint directly (and returns a {{StateHandle}} upon
closing). That way we can convert and dump large state into the checkpoint
without first creating a full copy in memory.
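To make points 2 and 3 more concrete, here is a rough sketch of what the reworked interface could look like. Everything in it is an assumption for illustration only: the names {{StateBackend}}, {{KvState}}, {{CheckpointStateOutputStream}}, {{closeAndGetHandle()}} and {{FsStateBackend}} are hypothetical, the {{StateHandle}} is heavily simplified, and a local temp file stands in for a real checkpoint directory. It is not the existing API and not a final design.

```java
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.Map;

public class StateBackendSketch {

    /** Simplified handle: a pointer from which the checkpointed bytes can be read back. */
    interface StateHandle {
        byte[] getState() throws IOException;
    }

    /** Stream that writes into the checkpoint and yields a handle when closed (point 3). */
    abstract static class CheckpointStateOutputStream extends OutputStream {
        abstract StateHandle closeAndGetHandle() throws IOException;
    }

    /** Key/value state whose implementation is chosen by the backend (point 2). */
    interface KvState<K, V> {
        V get(K key);
        void put(K key, V value);
    }

    /** Hypothetical replacement for the StateHandleProvider. */
    interface StateBackend {
        <K, V> KvState<K, V> createKvState(String name);
        CheckpointStateOutputStream createCheckpointStateOutputStream(long checkpointId)
                throws IOException;
    }

    /** Toy backend: state lives on the heap, checkpoints are streamed to a temp file. */
    static class FsStateBackend implements StateBackend {

        @Override
        public <K, V> KvState<K, V> createKvState(String name) {
            final Map<K, V> map = new HashMap<>();
            return new KvState<K, V>() {
                @Override public V get(K key) { return map.get(key); }
                @Override public void put(K key, V value) { map.put(key, value); }
            };
        }

        @Override
        public CheckpointStateOutputStream createCheckpointStateOutputStream(long checkpointId)
                throws IOException {
            final Path file = Files.createTempFile("chk-" + checkpointId + "-", ".state");
            final OutputStream out = new BufferedOutputStream(Files.newOutputStream(file));
            return new CheckpointStateOutputStream() {
                @Override public void write(int b) throws IOException { out.write(b); }
                @Override public void write(byte[] b, int off, int len) throws IOException {
                    out.write(b, off, len);
                }
                @Override public void flush() throws IOException { out.flush(); }
                @Override StateHandle closeAndGetHandle() throws IOException {
                    out.close();
                    // The handle only remembers the file; bytes are read on demand.
                    return () -> Files.readAllBytes(file);
                }
            };
        }
    }

    public static void main(String[] args) throws IOException {
        StateBackend backend = new FsStateBackend();
        KvState<String, Long> counts = backend.createKvState("counts");
        counts.put("a", 1L);

        // Write state directly into the checkpoint stream, then obtain the handle.
        CheckpointStateOutputStream out = backend.createCheckpointStateOutputStream(1L);
        out.write(("a=" + counts.get("a")).getBytes(StandardCharsets.UTF_8));
        StateHandle handle = out.closeAndGetHandle();

        System.out.println(new String(handle.getState(), StandardCharsets.UTF_8));
    }
}
```

The point of returning the handle from the close call is that the writer never has to materialize the serialized state as one object; a KV-store-backed implementation could implement the same two methods very differently.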
Lastly, I would like to change some names:
- {{StateHandleProvider}} to either {{StateBackend}}, {{StateStore}}, or
{{StateProvider}} (simpler name).
- {{StreamOperatorState}} to either {{State}} or {{KVState}}.