Interaction of watermarks and windows

2020-06-21 Thread Sergii Mikhtoniuk
Greetings, When playing around with the following simple event-time stream aggregation: SELECT TUMBLE_START(event_time, INTERVAL '1' DAY) as event_time, ... FROM input GROUP BY TUMBLE(event_time, INTERVAL '1' DAY), symbol ...to my surprise I found out that the t

Unaligned Checkpoint and Exactly Once

2020-06-21 Thread Lu Weizheng
Hi there, The new feature in Flink 1.11 will provide us the Unaligned Checkpoint which means a operator subtask does not need to wait all the Checkpoint barrier and will not block some channels. As the Checkpoint barrier is the key mechanism for Exactly Once guarantee, I am not sure Unaligned C

State backend considerations

2020-06-21 Thread Nick Bendtner
Hi guys, I have a few questions on state backends. Is there a guideline on how big the state has to be where it makes sense to use RocksDB rather than FsStatebackend ? Is there an analysis on latency for a full checkpoint for FsSateBackend based on increase in state size ? Best, Nick.

Why side-outputs are only supported by Process functions?

2020-06-21 Thread ivneet kaur
Hi folks, I want to split my stream for some invalid message handling, and need help understanding a few things. Question 1: Why is *split *operator deprecated? Question 2: Why side-outputs are only supported for ProcessFunction, KeyedProcessFunction etc. The doc on side-outputs says: "*You can us

Re: [DISCUSS] Upgrade HBase connector to 2.2.x

2020-06-21 Thread OpenInx
Hi According to my observation in the hbase community, there are still lots of hbase users running their production cluster with version 1.x (1.4x or 1.5.x). so I'd like to suggest that supporting both hbase1.x & hbase2.x connector. Thanks. On Sat, Jun 20, 2020 at 2:41 PM Ming Li wrote: > +1 t

Re: Interaction of watermarks and windows

2020-06-21 Thread Benchao Li
Hi Sergii, The current watermark strategy is correct. The window's output is drived by watermark. Before when the window is triggered, the watermark which triggers it will be emitted after the result of the window has been fully emitted. Hence, the watermark won't outpace the right margin of the

Re: Interaction of watermarks and windows

2020-06-21 Thread Jark Wu
Hi Sergii, Window operator won't affect/adjust the output watermark, it just propagated as-is which is said in the document. I think the mistake here is you are using the wrong event-time of the window, actually, you should use TUMBLE_ROWTIME(...) as event_time [1]. The event-time of the window sh

Re: Unaligned Checkpoint and Exactly Once

2020-06-21 Thread Zhijiang
Hi Weizheng, The unaligned checkpoint (UC) only supports exactly-once mode in Flink 1.11 except savepoint mode. The savepoint is probably used in job rescaling scenario and we plan to support it in future release version. Of course UC can satisfy exactly-once semantic as promised. Regarding th

Re: adding s3 object metadata while using StreamFileSink

2020-06-21 Thread Yun Gao
Hi Dhurandar, With my understand I think what you need is to get notified when a file is written successfully (committed) on the S3 FileSystem. However, currently there is no public API for the listener and there an issue tracking it [1]. With the current version, one possible method co

回复: Unaligned Checkpoint and Exactly Once

2020-06-21 Thread Lu Weizheng
Thank you Zhijiang! The second question about config is just because I find a method in InputProcessorUtil. I guess AT_LEAST_ONCE mode is a simpler way to handle checkpont barrier? private static CheckpointBarrierHandler createCheckpointBarrierHandler( StreamConfig config, InputGat

Re: Unaligned Checkpoint and Exactly Once

2020-06-21 Thread Zhijiang
From implementation or logic complication perspective, the AT_LEAST_ONCE is somehow simpler compared with EXACTLY_ONCE w/o unaligned, since it can always process data without blocking any channels. -- From:Lu Weizheng Send Time:2

Re: pyflink的table api 能否根据 filed的值进行判断插入到不同的sink表中

2020-06-21 Thread jincheng sun
您好,jack: Table API 不用 if/else 直接用类似逻辑即可: val t1 = table.filter('x > 2).groupBy(..) val t2 = table.filter('x <= 2).groupBy(..) t1.insert_into("sink1) t2.insert_into("sink2") Best, Jincheng jack 于2020年6月19日周五 上午10:35写道: > > 测试使用如下结构: > table= t_env.from_path("source") > > if table.filter("