Hi,
key refers to the key extracted by your KeySelector. Right now, for every named state (i.e. the name in the StateDescriptor) there is an isolated RocksDB instance.

Cheers,
Aljoscha

On Sat, 16 Apr 2016 at 15:43 Igor Berman <igor.ber...@gmail.com> wrote:

> thanks a lot for the info, seems not too complex
> I'll try to write a simple tool to read this state.
>
> Aljoscha, does the key reflect the unique id of the operator in some way? Or is the key
> just a "name" that is passed to ValueStateDescriptor?
>
> thanks in advance
>
> On 15 April 2016 at 15:10, Stephan Ewen <se...@apache.org> wrote:
>
>> One thing to add is that you can always trigger a persistent checkpoint
>> via the "savepoints" feature:
>> https://ci.apache.org/projects/flink/flink-docs-release-1.0/apis/streaming/savepoints.html
>>
>> On Fri, Apr 15, 2016 at 10:24 AM, Aljoscha Krettek <aljos...@apache.org>
>> wrote:
>>
>>> Hi,
>>> for RocksDB we simply use a TypeSerializer to serialize the key and
>>> value to a byte[] array and store that in RocksDB. For a ListState, we
>>> serialize the individual elements using a TypeSerializer and store them in
>>> a comma-separated list in RocksDB. The snapshots of RocksDB that we write
>>> to HDFS are regular backups of a RocksDB database, as described here:
>>> https://github.com/facebook/rocksdb/wiki/How-to-backup-RocksDB%3F. It
>>> should be possible to read them from HDFS and restore them to a RocksDB
>>> database as described in the linked documentation.
>>>
>>> tl;dr As long as you know the type of values stored in the state you
>>> should be able to read them from RocksDB and deserialize the values using
>>> TypeSerializer.
>>>
>>> One more bit of information: Internally the state is keyed by (key,
>>> namespace) -> value where namespace can be an arbitrary type that has a
>>> TypeSerializer. We use this to store window state that is local to both the key
>>> and the current window. For state that you store in a user-defined function
>>> the namespace will always be null and that will be serialized by a
>>> VoidSerializer that simply always writes a "0" byte.
>>>
>>> Cheers,
>>> Aljoscha
>>>
>>> On Fri, 15 Apr 2016 at 00:18 igor.berman <igor.ber...@gmail.com> wrote:
>>>
>>>> Hi,
>>>> we are evaluating Flink for a new solution and several people raised the
>>>> concern of coupling too much to Flink -
>>>> 1. we understand that if we want to get full fault tolerance and best
>>>> performance we'll need to use Flink managed state (probably the RocksDB
>>>> backend due to the volume of state)
>>>> 2. but then if we later find that Flink doesn't answer our needs (for
>>>> any reason) - we'll need to extract this state in some way (since it's the
>>>> only source of consistent state)
>>>> In general I'd like to be able to take a snapshot of the backend and try to
>>>> read it...do you think it will be a trivial task?
>>>> say if I'm holding list state per partitioned key, would it be easy to
>>>> take the RocksDB file and open it?
>>>>
>>>> any thoughts regarding how I can convince people in our team?
>>>>
>>>> thanks in advance!
>>>>
>>>> --
>>>> View this message in context:
>>>> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Accessing-StateBackend-snapshots-outside-of-Flink-tp6116.html
>>>> Sent from the Apache Flink User Mailing List archive. mailing list
>>>> archive at Nabble.com.
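
For anyone writing such a reader tool, here is a rough sketch of the layout described in this thread: a composite (key, namespace) entry whose null namespace is a single "0" byte, and a list value whose serialized elements are separated by a delimiter byte. Everything in it is an assumption for illustration. The class `StateLayoutSketch` and its methods are hypothetical, `java.io.DataOutputStream` merely stands in for Flink's TypeSerializer, and the exact byte layout of a given Flink release may well differ, so check the state backend source of your version before relying on it.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative only: plain java.io stand-ins for Flink's TypeSerializer,
// not the actual on-disk format of any Flink release.
final class StateLayoutSketch {

    // Assumed delimiter between list elements (the "comma-separated list").
    static final byte DELIMITER = (byte) ',';

    // Composite key: serialized key followed by the namespace bytes.
    // A null namespace is written as a single 0 byte, as described for user state.
    static byte[] compositeKey(String key) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bytes);
        out.writeUTF(key);  // stand-in for the key's TypeSerializer
        out.writeByte(0);   // stand-in for the VoidSerializer namespace
        out.flush();
        return bytes.toByteArray();
    }

    // Serialize each element and put one delimiter byte between consecutive elements.
    static byte[] encodeList(List<Long> values) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bytes);
        boolean first = true;
        for (long v : values) {
            if (!first) {
                out.writeByte(DELIMITER);
            }
            out.writeLong(v);
            first = false;
        }
        out.flush();
        return bytes.toByteArray();
    }

    // Decode sequentially: read one element, then skip exactly one delimiter byte.
    // Splitting the raw bytes on ',' would be wrong, because a serialized
    // element may itself contain the byte 0x2C (the long 44 does).
    static List<Long> decodeList(byte[] raw) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(raw));
        List<Long> values = new ArrayList<>();
        while (in.available() > 0) {
            values.add(in.readLong());
            if (in.available() > 0) {
                in.readByte(); // consume the delimiter
            }
        }
        return values;
    }

    public static void main(String[] args) throws IOException {
        byte[] key = compositeKey("user-42");
        byte[] value = encodeList(Arrays.asList(7L, 44L, 1024L));
        System.out.println("key bytes: " + key.length + ", values: " + decodeList(value));
    }
}
```

In a real tool the byte arrays would come from iterating over a RocksDB instance restored from the backup, and the read/write calls would be replaced by the TypeSerializer for the actual key and element types. The point of decoding element by element is that the delimiter cannot be told apart from payload bytes by value alone.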