Re: Accessing StateBackend snapshots outside of Flink

2016-11-03 Thread Aljoscha Krettek
Hi, there are two open issues about this: https://issues.apache.org/jira/browse/FLINK-3946 and https://issues.apache.org/jira/browse/FLINK-3089. No work has been done on this yet. You can, however, simulate TTL for state by using a TimelyFlatMapFunction and manually setting a timer for clearing out st
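
[Editor's sketch] The timer-based TTL idea described above can be illustrated roughly as follows. This is a plain Python simulation, not Flink code: the class `TtlKeyedState` and its methods `update`, `advance_time` and `get` are hypothetical names made up for illustration, and the real TimelyFlatMapFunction API is not shown. It assumes each update registers a cleanup timer at "now + TTL", and that a firing timer clears the key only if no newer update arrived in the meantime.

```python
import heapq


class TtlKeyedState:
    """Toy stand-in for keyed state plus per-key cleanup timers (not Flink API)."""

    def __init__(self, ttl_ms):
        self.ttl_ms = ttl_ms
        self.values = {}   # key -> (value, last_updated_ms)
        self.timers = []   # min-heap of (fire_at_ms, key)

    def update(self, key, value, now_ms):
        # Store the value with its update time and register a cleanup timer.
        self.values[key] = (value, now_ms)
        heapq.heappush(self.timers, (now_ms + self.ttl_ms, key))

    def advance_time(self, now_ms):
        # Fire every timer that is due. A key is cleared only if it has not
        # been updated again since the firing timer was registered.
        while self.timers and self.timers[0][0] <= now_ms:
            fire_at, key = heapq.heappop(self.timers)
            entry = self.values.get(key)
            if entry is not None and entry[1] + self.ttl_ms <= fire_at:
                del self.values[key]

    def get(self, key):
        entry = self.values.get(key)
        return None if entry is None else entry[0]
```

In this sketch an older timer that fires after a newer update is simply ignored, so a key that keeps receiving updates never expires; only keys that stay idle for the full TTL are cleared.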

Re: Accessing StateBackend snapshots outside of Flink

2016-11-02 Thread bwong247
We're currently investigating Flink, and one of the features that we'd like to have is a TTL feature to time out older values in state. I saw this thread and it sounds like the functionality was being considered. Is there any update? -- View this message in context: http://apache-flink-use

Re: Accessing StateBackend snapshots outside of Flink

2016-06-13 Thread Maximilian Michels
+1 to what Aljoscha said. We should rather fix this programmatically. On Mon, Jun 13, 2016 at 4:25 PM, Aljoscha Krettek wrote: > Hi Josh, > I think RocksDB does not allow accessing a database instance from more than > one process concurrently. Even if it were possible I would highly recommend >

Re: Accessing StateBackend snapshots outside of Flink

2016-06-13 Thread Aljoscha Krettek
Hi Josh, I think RocksDB does not allow accessing a database instance from more than one process concurrently. Even if it were possible I would highly recommend not to fiddle with Flink state internals (in RocksDB or elsewhere) from the outside. All kinds of things might be going on at any given m

Re: Accessing StateBackend snapshots outside of Flink

2016-06-13 Thread Maximilian Michels
Hi Josh, I'm not a RocksDB expert but the workaround you described should work. Just bear in mind that accessing RocksDB concurrently with a Flink job can result in an inconsistent state. Make sure to perform atomic updates and clear the RocksDB cache for the item. Cheers, Max On Mon, Jun 13, 20

Re: Accessing StateBackend snapshots outside of Flink

2016-06-13 Thread Josh
Hello, I have a follow-up question to this: since Flink doesn't support state expiration at the moment (e.g. expiring state which hasn't been updated for a certain amount of time), would it be possible to clear up old UDF states by: - store a 'last_updated' timestamp in the state value - periodical

Re: Accessing StateBackend snapshots outside of Flink

2016-04-18 Thread Aljoscha Krettek
Hi, key refers to the key extracted by your KeySelector. Right now, for every named state (i.e. the name in the StateDescriptor) there is an isolated RocksDB instance. Cheers, Aljoscha On Sat, 16 Apr 2016 at 15:43 Igor Berman wrote: > thanks a lot for the info, seems not too complex > I'll tr

Re: Accessing StateBackend snapshots outside of Flink

2016-04-16 Thread Igor Berman
Thanks a lot for the info, it seems not too complex. I'll try to write a simple tool to read this state. Aljoscha, does the key reflect the unique id of the operator in some way? Or is the key just a "name" that is passed to ValueStateDescriptor? Thanks in advance. On 15 April 2016 at 15:10, Stephan Ewen wrote: >

Re: Accessing StateBackend snapshots outside of Flink

2016-04-15 Thread Stephan Ewen
One thing to add is that you can always trigger a persistent checkpoint via the "savepoints" feature: https://ci.apache.org/projects/flink/flink-docs-release-1.0/apis/streaming/savepoints.html On Fri, Apr 15, 2016 at 10:24 AM, Aljoscha Krettek wrote: > Hi, > for RocksDB we simply use a TypeSer

Re: Accessing StateBackend snapshots outside of Flink

2016-04-15 Thread Aljoscha Krettek
Hi, for RocksDB we simply use a TypeSerializer to serialize the key and value to a byte[] array and store that in RocksDB. For a ListState, we serialize the individual elements using a TypeSerializer and store them in a comma-separated list in RocksDB. The snapshots of RocksDB that we write to HDFS
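
[Editor's sketch] To make the storage layout described above a bit more concrete, here is a rough Python illustration of serializing keys and values to bytes and keeping them in a key/value store. Everything in it is an assumption made for illustration: the byte formats (length-prefixed UTF-8 strings, 8-byte big-endian longs) are not the formats produced by Flink's TypeSerializer, the two dicts merely stand in for the per-state RocksDB instances, and all function and variable names are hypothetical.

```python
import struct

SEPARATOR = b","  # assumed separator byte between list elements


def serialize_string(text):
    # Length-prefixed UTF-8 (an assumed format, not Flink's).
    data = text.encode("utf-8")
    return struct.pack(">I", len(data)) + data


def deserialize_string(buf, offset=0):
    # Returns the decoded string and the offset just past it.
    (length,) = struct.unpack_from(">I", buf, offset)
    start = offset + 4
    return buf[start:start + length].decode("utf-8"), start + length


def serialize_long(number):
    return struct.pack(">q", number)


def deserialize_long(buf):
    return struct.unpack(">q", buf)[0]


def serialize_list(items):
    # Each element is serialized on its own, then joined with the separator.
    return SEPARATOR.join(serialize_string(item) for item in items)


def deserialize_list(buf):
    items, offset = [], 0
    while offset < len(buf):
        item, offset = deserialize_string(buf, offset)
        items.append(item)
        offset += len(SEPARATOR)  # skip the separator after each element
    return items


# One store per named state; plain dicts stand in for RocksDB here.
value_state_db = {}
list_state_db = {}

value_state_db[serialize_string("user-42")] = serialize_long(7)
list_state_db[serialize_string("user-42")] = serialize_list(["click", "a,b"])
```

A reader built along these lines only works if it uses exactly the same serializers as the writer. The length prefix is what lets the sketch read back an element that itself contains a comma.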