On efficient checkpoints with dynamic (self-evolving) keyed state

2020-04-05 Thread Salva Alcántara
In a KeyedCoProcessFunction, I am managing a keyed state which consists of third-party library models. These models are created on reception of new data on the control stream within `processElement1`. Because the models are self-evolving, in the sense that have their own internal state, I need to m

Re: Perform processing only when watermark updates, buffer data otherwise

2020-04-05 Thread Manas Kale
Hi Timo, Thanks for the information. On Thu, Apr 2, 2020 at 9:30 PM Timo Walther wrote: > Hi Manas, > > first of all, after assigning watermarks at the source level, usually > Flink operators make sure to handle the watermarks. > > In case of a `union()`, the subsequent operator will increment i

[ANNOUNCE] Weekly Community Update 2020/14

2020-04-05 Thread Konstantin Knauf
Dear community, happy to share this week's community update with Flink on Zeppelin, full support for VIEWs in Flink SQL, Ververica Platform Community Edition and a bit more. Flink Development == * [releases] Four issues (3 Blockers, 1 Critical) left for Flink 1.10.1 at this point. [1

Storing Operator state in RocksDb during runtime - plans

2020-04-05 Thread KristoffSC
Hi, according to [1] operator state and broadcast state (which is a "special" type of operator state) are not stored in RocksDb during runtime when RocksDb is choosed as state backend. Are there any plans to change this? I'm working around a case where I will have a quite large control stream.

Re: State & Generics

2020-04-05 Thread Laurent Exsteens
Hi Aljoscha, Thank you for your answer! Out of curiosity, would writing my own serializer involve implementing a serialisation for every your I could get? On Wed, Apr 1, 2020, 13:57 Aljoscha Krettek wrote: > Hi Laurent! > > On 31.03.20 10:43, Laurent Exsteens wrote: > > Yesterday I managed to

Flink job getting killed

2020-04-05 Thread Giriraj Chauhan
Hi, We are submitting a flink(1.9.1) job for data processing. It runs fine and processes data for sometime i.e. ~30 mins and later it throws following exception and job gets killed. 2020-04-02 14:15:43,371 INFO org.apache.flink.runtime.taskmanager.Task - Sink: Unnamed (2/4) (45

Re: Using multithreaded library within ProcessFunction with callbacks relying on the out parameter

2020-04-05 Thread Salva Alcántara
Hi again Piotr, I have further considered the mailbox executor approach and I think it will not be enough for my purposes. Here is why: - My state consists of models created with a third party library - These models have their own state, which means that when I forward events in `ProcessElement1`

Re: Conversion of Table (Blink/batch) to DataStream

2020-04-05 Thread Maciek Próchniak
Hi Jark, thanks for quick answer - I strongly suspected there is a hack like that somewhere - but couldn't find it easily in the maze of old and new scala and java APIs :D For my current experiments it's ok, I'm sure in next releases everything will be cleaned up :) best, maciek On 05

Re: Using MapState clear, put methods in snapshotState within KeyedCoProcessFunction, valid or not?

2020-04-05 Thread Salva Alcántara
Hi Yun, In the end, I left the code like this ``` override def snapshotState(context: FunctionSnapshotContext): Unit = { for ((k, model) <- models) { modelsBytes.put(k, model.toBytes(v)) } } ``` I have verified with a simple test that most of the times checkpoints seem to work fine

watermark not advancing when reading kinesis historical data

2020-04-05 Thread Fanbin Bu
Hi, i've been debugging this issue for several days now and still cant get it to work. I need to read the kinesis historical data (7 days) using Flink SQL. Here is my setup: Flink version: 1.9.1 kinesis shard number: 32 Flink parallelism: 32 sql: select * from mytable (i purposely make this trivi