Hi,
for RocksDB we simply use a TypeSerializer to serialize the key and value
to a byte[] array and store that in RocksDB. For a ListState, we serialize
the individual elements using a TypeSerializer and store them in a
comma-separated list in RocksDB. The snapshots of RocksDB that we write to
HDFS are regular backups of a RocksDB database, as described here:
https://github.com/facebook/rocksdb/wiki/How-to-backup-RocksDB%3F. It
should be possible to read them from HDFS and restore them to a RocksDB
database as described in the linked documentation.
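
As a rough illustration of the list layout described above, here is a minimal sketch that uses only the JDK. DataOutputStream.writeLong stands in for a TypeSerializer, and the class and method names (ListStateSketch, encode, decode) are hypothetical, not Flink API. The point it tries to show: because the serialized bytes of an element may themselves contain the comma byte, a reader should deserialize one element at a time and skip the single delimiter byte after it, rather than split the value on commas.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ListStateSketch {
    // Delimiter between serialized elements (assumed to be a single ',' byte).
    static final byte DELIMITER = ',';

    // Serialize each element and join the results with the delimiter byte.
    // writeLong is only a stand-in for the real element serializer.
    static byte[] encode(List<Long> elements) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            boolean first = true;
            for (long element : elements) {
                if (!first) {
                    out.writeByte(DELIMITER);
                }
                out.writeLong(element);
                first = false;
            }
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Read elements back one by one, skipping one delimiter byte between them.
    static List<Long> decode(byte[] stored) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(stored));
            List<Long> result = new ArrayList<>();
            while (in.available() > 0) {
                result.add(in.readLong());
                if (in.available() > 0) {
                    in.readByte(); // skip the delimiter
                }
            }
            return result;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // 44 is 0x2C, the same byte value as ',', so naive splitting would break here.
        List<Long> original = Arrays.asList(1L, 44L, 3L);
        System.out.println(decode(encode(original))); // prints [1, 44, 3]
    }
}
```

With real state you would replace writeLong/readLong with the TypeSerializer for your element type; the read loop stays the same under the assumptions above.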

tl;dr As long as you know the type of values stored in the state you should
be able to read them from RocksDB and deserialize the values using
TypeSerializer.

One more bit of information: Internally the state is keyed by (key,
namespace) -> value where namespace can be an arbitrary type that has a
TypeSerializer. We use this to store window state that is local to both the key
and the current window. For state that you store in a user-defined function
the namespace will always be null and that will be serialized by a
VoidSerializer that simply always writes a "0" byte.
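
To make the (key, namespace) idea a bit more concrete, here is a small JDK-only sketch of how a lookup key for user-function state could be assembled. Everything in it is an assumption for illustration: writeUTF stands in for the key's TypeSerializer, the names CompositeKeySketch and compositeKey are hypothetical, and placing the namespace bytes directly after the key bytes is a guess at the layout, which this message does not spell out. The actual bytes written by a given Flink version may contain additional separators or prefixes, so check the backend source for the version you run.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

public class CompositeKeySketch {
    // Stand-in for serializing the null namespace: a single zero byte.
    static void writeNullNamespace(DataOutputStream out) throws IOException {
        out.writeByte(0);
    }

    // Build the bytes for (key, null namespace).
    // Assumption: key bytes first, namespace bytes immediately after.
    static byte[] compositeKey(String key) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            out.writeUTF(key);       // stand-in for the key's TypeSerializer
            writeNullNamespace(out); // namespace is null for user-function state
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] k = compositeKey("user-1");
        // 2 length bytes + 6 character bytes from writeUTF, then 1 namespace byte.
        System.out.println(k.length); // prints 9
    }
}
```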

Cheers,
Aljoscha

On Fri, 15 Apr 2016 at 00:18 igor.berman <igor.ber...@gmail.com> wrote:

> Hi,
> we are evaluating Flink for a new solution, and several people raised the
> concern of coupling too tightly to Flink:
> 1. We understand that if we want full fault tolerance and the best
> performance, we'll need to use Flink managed state (probably the RocksDB
> backend due to the volume of state).
> 2. But if we later find that Flink doesn't meet our needs (for any
> reason), we'll need to extract this state in some way (since it's the only
> source of consistent state).
> In general, I'd like to be able to take a snapshot of the backend and try to
> read it. Do you think that will be a trivial task?
> Say I'm holding list state per partitioned key; would it be easy to take
> the RocksDB file and open it?
>
> Any thoughts on how I can convince people on our team?
>
> thanks in advance!
>
>
>
> --
> View this message in context:
> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Accessing-StateBackend-snapshots-outside-of-Flink-tp6116.html
> Sent from the Apache Flink User Mailing List archive at Nabble.com.
>
