One thing to add is that you can always trigger a persistent checkpoint via the "savepoints" feature: https://ci.apache.org/projects/flink/flink-docs-release-1.0/apis/streaming/savepoints.html
On Fri, Apr 15, 2016 at 10:24 AM, Aljoscha Krettek <aljos...@apache.org> wrote:

> Hi,
> for RocksDB we simply use a TypeSerializer to serialize the key and value
> to a byte[] array and store that in RocksDB. For a ListState, we serialize
> the individual elements using a TypeSerializer and store them in a
> comma-separated list in RocksDB. The snapshots of RocksDB that we write to
> HDFS are regular backups of a RocksDB database, as described here:
> https://github.com/facebook/rocksdb/wiki/How-to-backup-RocksDB%3F. It
> should be possible to read them from HDFS and restore them to a RocksDB
> database as described in the linked documentation.
>
> tl;dr As long as you know the type of the values stored in the state, you
> should be able to read them from RocksDB and deserialize the values using
> a TypeSerializer.
>
> One more bit of information: internally the state is keyed by (key,
> namespace) -> value, where namespace can be an arbitrary type that has a
> TypeSerializer. We use this to store window state that is local to both
> the key and the current window. For state that you store in a user-defined
> function the namespace will always be null, and that will be serialized by
> a VoidSerializer that simply always writes a "0" byte.
>
> Cheers,
> Aljoscha
>
> On Fri, 15 Apr 2016 at 00:18 igor.berman <igor.ber...@gmail.com> wrote:
>
>> Hi,
>> we are evaluating Flink for a new solution, and several people raised
>> concerns about coupling too tightly to Flink:
>> 1. We understand that if we want full fault tolerance and the best
>> performance, we'll need to use Flink managed state (probably the RocksDB
>> backend, due to the volume of state).
>> 2. But if we later find that Flink doesn't meet our needs (for any
>> reason), we'll need to extract this state in some way (since it's the
>> only source of consistent state).
>> In general I'd like to be able to take a snapshot of the backend and try
>> to read it... do you think it will be a trivial task?
>> Say I'm holding list state per partitioned key: would it be easy to take
>> the RocksDB file and open it?
>>
>> Any thoughts on how I can convince people in our team?
>>
>> Thanks in advance!
>>
>> --
>> View this message in context:
>> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Accessing-StateBackend-snapshots-outside-of-Flink-tp6116.html
>> Sent from the Apache Flink User Mailing List archive at Nabble.com.
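
To make the layout described in the quoted reply a bit more concrete, here is a rough, Flink-free sketch in Python of a (key, namespace) -> value store holding serialized bytes. Everything in it is an assumption for illustration: the helper names, the length-prefixed string encoding, the key-then-namespace byte order, and the plain dict standing in for RocksDB are hypothetical and are not Flink's actual on-disk format. Only the general ideas (byte[] keys and values, a single 0 byte for the null namespace, list elements joined by a comma byte) come from the message.

```python
import struct

# Null namespace written as a single 0 byte, as described in the quoted reply.
VOID_NAMESPACE = b"\x00"


def encode_string(s):
    """Hypothetical string encoding: 2-byte big-endian length, then UTF-8."""
    data = s.encode("utf-8")
    return struct.pack(">H", len(data)) + data


def encode_db_key(key, namespace=VOID_NAMESPACE):
    """Composite key bytes: serialized key followed by serialized namespace.

    The ordering is an assumption made for this sketch only.
    """
    return encode_string(key) + namespace


def encode_long(value):
    """Fixed-width 8-byte big-endian signed integer."""
    return struct.pack(">q", value)


def encode_list(values):
    """List state value: serialized elements joined by a comma byte."""
    return b",".join(encode_long(v) for v in values)


def decode_list(buf):
    """Read elements one by one with the element decoder, skipping one
    separator byte after each. Splitting on b"," would be unsafe because
    the comma byte (0x2C) can also occur inside a serialized element."""
    values = []
    pos = 0
    while pos < len(buf):
        (v,) = struct.unpack_from(">q", buf, pos)
        values.append(v)
        pos += 8  # consumed one element
        pos += 1  # skip the separator (absent after the last element)
    return values


# A plain dict stands in for the restored RocksDB database.
db = {}
db[encode_db_key("user-1")] = encode_list([1, 44, 3])
restored = decode_list(db[encode_db_key("user-1")])
```

The point of the sketch is the one made in the thread: reading the state back outside Flink requires knowing the types, because the elements have to be decoded with the matching serializer rather than by scanning for delimiters.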