Re: State Storage Questions

2020-09-07 Thread Tzu-Li (Gordon) Tai
Hi! Operator state is bound to a single parallel operator instance; there is no partitioning happening here. It is typically used in Flink source and sink operators. For example, the Flink Kafka source operator's parallel instances maintain as operator state a mapping of partitions to offsets for

Re: Checkpoint metadata deleted by Flink after ZK connection issues

2020-09-07 Thread Husky Zeng
Hi Cristian, I don't know if it was designed to be like this deliberately. So I have already submitted an issue ,and wait for somebody to response. https://issues.apache.org/jira/browse/FLINK-19154 -- Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/

[ANNOUNCE] Weekly Community Update 2020/36

2020-09-07 Thread Konstantin Knauf
Dear community, happy to share another community update for the past week. This time with the upcoming release of Flink 1.11.2, a proposal for more efficient aggregation for batch processing with the DataStream API, and the comeback of two FLIPs that have been abandoned for a bit. Flink Developme

Should the StreamingFileSink mark the files "finished" when all bounded input sources are depleted?

2020-09-07 Thread Teunissen, F.G.J. (Fred)
Hi All, My flink-job is using bounded input sources and writes the results to a StreamingFileSink. When it has processed all the input the job is finished and closes. But the output files are still named “-0-0..inprogress.”. I expected them to be named ““-0-0.”. Did I forget some setting or so

Re: FLINK DATASTREAM Processing Question

2020-09-07 Thread Vijayendra Yadav
Thank You Dawid. Sent from my iPhone > On Sep 7, 2020, at 9:03 AM, Dawid Wysakowicz wrote: >

Re: Should the StreamingFileSink mark the files "finished" when all bounded input sources are depleted?

2020-09-07 Thread Ken Krugler
Hi Fred, I think this is the current behavior (though it would be helpful to know which version of Flink you’re using). From an email conversation with Kostas in January of this year: > Hi Ken, Jingsong and Li, > > Sorry for the late reply. > > As Jingsong pointed out, upon calling close() th

Re: FLINK DATASTREAM Processing Question

2020-09-07 Thread Dawid Wysakowicz
Hi, You can see the execution plan via StreamExecutionEnvironment#getExecutionPlan(). You can visualize it in[1]. You can also submit your job and check the execution plan in Web UI. As for the question which option is preferred it is very subjective. As long as in the option b) both maps are cha

Re: Unit Test for KeyedProcessFunction with out-of-core state

2020-09-07 Thread Dawid Wysakowicz
Hi Alexey, There is no mock for RocksDB. Moreover I am not sure what would be the use case for one. If you want to test specifically against RocksDB then you can use it in the test harness Gordon mentioned. On 04/09/2020 16:31, Alexey Trenikhun wrote: > Hi Gordon, > We already use [1]. Unfortunat

Flink alert after database lookUp

2020-09-07 Thread s_penakalap...@yahoo.com
Hi All, I am new to Flink, request your help!!! My scenario : 1> we receive Json messages at a very high frequency like 10,000 messages / second2> we need to raise an Alert for a particular user if there is any breach in threshold value against each attribute in Json.3> These threshold values ar

Re: How to access state in TimestampAssigner in Flink 1.11?

2020-09-07 Thread Theo Diefenthal
Hi Aljoscha, We have a ProcessFunction which does some processing per kafka partition. It basically buffers the incoming data over 1 minute and throws out some events from the stream if within the minute another related event arrived. In order to buffer the data and store the events over 1 min

Re: Checkpoint metadata deleted by Flink after ZK connection issues

2020-09-07 Thread Cristian
That's an excellent question. I can't explain that. All I know is this: - the job was upgraded and resumed from a savepoint - After hours of working fine, it failed (like it shows in the logs) - the Metadata was cleaned up, again as shown in the logs - because I run this in Kubernetes, the conta

Flink stateful functions : compensating callback to invoked functions on a timeout

2020-09-07 Thread Mazen_Ezzeddine
Hi all, I am implementing a use case in Flink stateful functions. My specification highlights that starting from a stateful function f a business workflow (in other words a group of stateful functions f1, f2, … fn are called either sequentially or in parallel or both ). Stateful function f waits f

Re: How to access state in TimestampAssigner in Flink 1.11?

2020-09-07 Thread Aljoscha Krettek
Hi, sorry for the inconvenience! I'm sure we can find a solution together. Why do you need to keep state in the Watermark Assigner? The Kafka source will by itself maintain the watermark per partition, so just specifying a WatermarkStrategy will already correctly compute the watermark per par

Re: Checkpoint metadata deleted by Flink after ZK connection issues

2020-09-07 Thread Husky Zeng
I means that checkpoints are usually dropped after the job was terminated by the user (except if explicitly configured as retained Checkpoints). You could use "ExternalizedCheckpointCleanup.RETAIN_ON_CANCELLATION" to save your checkpoint when te cames to failure. When your zookeeper lost connect