Re: Order of events in Broadcast State

2021-12-06 Thread Alexey Trenikhun
Thank you David From: David Anderson Sent: Monday, December 6, 2021 1:36:20 AM To: Alexey Trenikhun Cc: Flink User Mail List Subject: Re: Order of events in Broadcast State Event ordering in Flink is only maintained between pairs of events that take exactly the

Re: Table DataStream Conversion Lost Watermark

2021-12-06 Thread Yunfeng Zhou
Hi Timo, Thanks for this information. Since it is confirmed that toDataStream is functioning correctly and that I can avoid this problem by not using fromValues in my implementation, I think I have got enough information for my current work and don't need to rediscuss fromDatastream's behavior. B

Re: Issue with Flink jobs after upgrading to Flink 1.13.1/Ververica 2.5 - java.lang.NoClassDefFoundError: org/apache/hadoop/conf/Configuration

2021-12-06 Thread Natu Lauchande
Hey Timo and Flink community, I wonder if there is a fix for this issue. The last time I rollbacked to version 12 of Flink and downgraded Ververica. I am really keen to leverage the new features on the latest versions of Ververica 2.5+ , i have tried a myriad of tricks suggested ( example : build

Re: [DISCUSS] Drop Zookeeper 3.4

2021-12-06 Thread Dongwon Kim
When should I prepare for upgrading ZK to 3.5 or newer? We're operating a Hadoop cluster w/ ZK 3.4.6 for running only Flink jobs. Just hope that the rolling update is not that painful - any advice on this? Best, Dongwon On Tue, Dec 7, 2021 at 3:22 AM Chesnay Schepler wrote: > Current users of

Re: [DISCUSS] Drop Zookeeper 3.4

2021-12-06 Thread Yang Wang
FYI: We(Alibaba) are widely using ZooKeeper 3.5.5 for all the YARN and some K8s Flink high-available applications. Best, Yang Chesnay Schepler 于2021年12月7日周二 上午2:22写道: > Current users of ZK 3.4 and below would need to upgrade their Zookeeper > installation that is used by Flink to 3.5+. > > Wh

Re: Issue with incremental checkpointing size

2021-12-06 Thread Caizhi Weng
Hi! the checkpointing size is not going beyond 300 MB Is 300MB the total size of checkpoint or the incremental size of checkpoint? If it is the latter one, Flink will only store necessary information (for example the keys and the fields that are selected) in checkpoint and it is compressed, so f

Issue with incremental checkpointing size

2021-12-06 Thread Vidya Sagar Mula
Hi, In my project, we are trying to configure the "Incremental checkpointing" with RocksDB in the backend. We are using Flink 1.11 version and RockDB with AWS : S3 backend Issue: -- In my pipeline, my window size is 5 mins and the incremental checkpointing is happening for every 2 mins. I am

Re: [DISCUSS] Deprecate Java 8 support

2021-12-06 Thread Nicolás Ferrario
Oh my bad, it must be Statefun then. I remember I needed to play around with that for _some_ build. On Wed, Dec 1, 2021 at 7:48 PM Chesnay Schepler wrote: > Flink can be built with Java 11 since 1.10. If I recall correctly we > solved the tools.jar issue, which Hadoop depends on, by excluding th

Re: [DISCUSS] Drop Zookeeper 3.4

2021-12-06 Thread Chesnay Schepler
Current users of ZK 3.4 and below would need to upgrade their Zookeeper installation that is used by Flink to 3.5+. Whether K8s users are affected depends on whether they use ZK or not. If they do, see above, otherwise they are not affected at all. On 06/12/2021 18:49, Arvid Heise wrote: Cou

Re: GenericWriteAheadSink, declined checkpoint for a finished source

2021-12-06 Thread Dawid Wysakowicz
Hi, Sorry to hear it's hard to find the option. It is part of the 1.14 release[1]. It is also documented how to enable it[2]. Happy to hear how we can improve the situation here. As for the exception. Are you seeing this exception occur repeatedly for the same task? I can imagine a situation tha

Re: [DISCUSS] Drop Zookeeper 3.4

2021-12-06 Thread Arvid Heise
Could someone please help me understand the implications of the upgrade? As far as I understood this upgrade would only affect users that have a zookeeper shared across multiple services, some of which require ZK 3.4-? A workaround for those users would be to run two ZKs with different versions, e

Re: GenericWriteAheadSink, declined checkpoint for a finished source

2021-12-06 Thread James Sandys-Lumsdaine
Hello again, We recently upgraded from Flink 1.12.3 to 1.14.0 and we were hoping it would solve our issue with checkpointing with finished data sources. We need the checkpointing to work to trigger Flink's GenericWriteAheadSink class. Firstly, the constant mentioned on FLIP-147 that enables the

Re: Table DataStream Conversion Lost Watermark

2021-12-06 Thread Timo Walther
Hi Yunfeng, it seems this is a deeper issue with the fromValues implementation. Under the hood, it still uses the deprecated InputFormat stack. And as far as I can see, there we don't emit a final MAX_WATERMARK. I will definitely forward this. But toDataStream forwards watermarks correctly.

Re: enable.auto.commit=true and checkpointing turned on

2021-12-06 Thread Vishal Santoshi
perfect. Thanks. That is what I imagined. On Mon, Dec 6, 2021 at 2:04 AM Hang Ruan wrote: > Hi, > > 1. Yes, the kafka source will use the Kafka committed offset for the group > id to start the job. > > 2. No, the auto.offset.reset >

Re: [DISCUSS] Drop Zookeeper 3.4

2021-12-06 Thread Chesnay Schepler
ping @users; any input on how this would affect you is highly appreciated. On 25/11/2021 22:39, Chesnay Schepler wrote: I included the user ML in the thread. @users Are you still using Zookeeper 3.4? If so, were you planning to upgrade Zookeeper in the near future? I'm not sure about ZK comp

Re: use of Scala versions >= 2.13 in Flink 1.15

2021-12-06 Thread Chesnay Schepler
With regards to the Java APIs, you will definitely be able to use the Java DataSet/DataStream APIs from Scala without any restrictions imposed by Flink. This is already working with the current SNAPSHOT version. As we speak we are also working to achieve the same for the Table API; we expect t

Re: Re: Re: how to run streaming process after batch process is completed?

2021-12-06 Thread Yun Gao
Hi Joern, Very thanks for sharing the cases! Could you also share a bit more on the detailed scenarios~? Best, Yun --Original Mail -- Sender:Joern Kottmann Send Date:Fri Dec 3 16:43:38 2021 Recipients:Yun Gao CC:vtygoss , Alexander Preuß , user@flink.apache

[DISCUSS] Strong read-after-write consistency of Flink FileSystems

2021-12-06 Thread David Morávek
Hi Everyone, as outlined in FLIP-194 discussion [1], for the future directions of Flink HA services, I'd like to verify my thoughts around guarantees of the distributed filesystems used with Flink. Currently some of the services (*JobGraphStore*, *CompletedCheckpointStore*) are implemented using

use of Scala versions >= 2.13 in Flink 1.15

2021-12-06 Thread guenterh.lists
Dear list, there have been some discussions and activities in the last months about a Scala free runtime which should make it possible to use newer Scala version (>= 2.13 / 3.x) on the application side. Stephan Ewen announced the implementation is on the way [1] and Martijn Vissr mentioned i

Converting DataStream of Avro SpecificRecord to Table

2021-12-06 Thread Dongwon Kim
Hi community, I'm currently converting a DataStream of Avro SpecificRecord type into Table using the following method: public static Table toTable(StreamTableEnvironment tEnv, DataStream dataStream,

Re: [DISCUSS] Change some default config values of blocking shuffle

2021-12-06 Thread Yingjie Cao
Hi Till, Thanks for your feedback. >>> How will our tests be affected by these changes? Will Flink require more resources and, thus, will it risk destabilizing our testing infrastructure? There are some tests that need to be adjusted, for example, BlockingShuffleITCase. For other tests, theoreti

Re: Order of events in Broadcast State

2021-12-06 Thread David Anderson
Event ordering in Flink is only maintained between pairs of events that take exactly the same path through the execution graph. So if you have multiple instances of A (let's call them A1 and A2), each broadcasting a partition of the total rule space, then one instance of B (B1) might receive rule1

Re: GCS/Object Storage Rate Limiting

2021-12-06 Thread David Morávek
Hi Kevin, Flink comes with two schedulers for streaming: - Default - Adaptive (opt-in) Adaptive is still in experimental phase and doesn't support local recover. You're most likely using the first one, so you should be OK. Can you elaborate on this a bit? We aren't changing the parallelism when

Re: Unable to create new native thread error

2021-12-06 Thread David Morávek
Hi Ilan, I think so, using CLI instead of REST API should solve this, as the user code execution would be pulled out to a separate JVM. If you're going to try that, it would be great to hear back whether it has solved your issue. As for 1.13.4, there is currently no on-going effort / concrete pla