Re: context.timestamp null in keyedprocess function

2022-06-14 Thread bat man
Has anyone experienced this or has any clue? On Tue, Jun 14, 2022 at 6:21 PM bat man wrote: > Hi, > > We are using flink 12.1 on AWS EMR. The job reads the event stream and > enrich stream from another topic. > We extend AssignerWithPeriodicWatermarks to assign watermarks and extract > timestamp

Re: Spike in checkpoint start delay every 15 minutes

2022-06-14 Thread Hangxiang Yu
Hi, Jai. Could you share your configuration about the checkpoint (interval, min-pause, and so on) and the checkpoint details in the Flink UI ? I guess the delay of the checkpoint may be related to the last checkpoint completion time as you could see in the CheckpointRequestDecider#chooseRequestToE

Spike in checkpoint start delay every 15 minutes

2022-06-14 Thread Jai Patel
We've noticed a spike in the start delays in our incremental checkpoints every 15 minutes. The Flink job seems to start out smooth, with checkpoints in in the 15s range and negligible start delays. Then every 3rd or 4th checkpoint has a long start delay (~2-3 minutes). Teh checkpoints in between

Re: New KafkaSource API : Change in default behavior regarding starting offset

2022-06-14 Thread Jing Ge
Hi Bastien, Thanks for asking. I didn't find any call of setStartFromGroupOffsets() within Flink in the master branch. Could you please point out the code that committed offset is used as default? W.r.t. the new KafkaSource, if OffsetsInitializer.committedOffsets() is used, an exception will be t

Re: Apache Flink - Reading data from Scylla DB

2022-06-14 Thread Jing Ge
Hi, Please be aware that SourceFunction will be deprecated soon[1]. It is recommended to build a new source connector based on the new Source API design by FLIP-27[2]. You might take the Kafka connector as the reference implementation. Best regards, Jing [1] https://lists.apache.org/thread/d6cwq

Re: NegativeArraySizeException trying to take a savepoint

2022-06-14 Thread Mike Barborak
Thank you for your replies. Upgrading is in our plans but I think Yun is saying that might not help. We are still trying to find what part of the savepoint is causing the error. We will try removing pieces of the job graph until we are able to savepoint. From: Yun Tang Date: Tuesday, June 14,

Re: Kafka Consumer commit error

2022-06-14 Thread Martijn Visser
Hi Christian, There's another similar error reported by someone else. I've linked the tickets together and asked one of the Kafka maintainers to have a look at this. Best regards, Martijn Op di 14 jun. 2022 om 17:16 schreef Christian Lorenz < christian.lor...@mapp.com>: > Hi Alexander, > > > >

Re: Kafka Consumer commit error

2022-06-14 Thread Christian Lorenz
Hi Alexander, I’ve created a Jira ticket here https://issues.apache.org/jira/browse/FLINK-28060. Unfortunately this is causing some issues to us. I hope with the attached demo project the root cause of this can also be determined, as this is reproducible in Flink 1.15.0, but not in Flink 1.14.4.

New KafkaSource API : Change in default behavior regarding starting offset

2022-06-14 Thread bastien dine
Hello everyone, Does someone know why the starting offset behaviour has changed in the new Kafka Source ? This is now from earliest (code in KafkaSourceBuilder), doc says : "If offsets initializer is not specified, OffsetsInitializer.earliest() will be used by default." from : https://nightlies.a

Re: Flink running same task on different Task Manager

2022-06-14 Thread Weihua Hu
Hi, IMO, Broadcast is a better way to do this, which can reduce the QPS of external access. If you do not want to use Broadcast, Try using RichFunction, start a thread in the open() method to refresh the data regularly. but be careful to clean up your data and threads in the close() method, otherw

Re: NegativeArraySizeException trying to take a savepoint

2022-06-14 Thread Yun Tang
Hi Mike, I think the root cause is that the size of java bytes array still exceed VM limit. The exception message is not friendly and not covered by sanity check [1] as it uses different code path [2]: The native method org.rocksdb.RocksIterator.$$YJP$$value0 would allocate the byte array direc

context.timestamp null in keyedprocess function

2022-06-14 Thread bat man
Hi, We are using flink 12.1 on AWS EMR. The job reads the event stream and enrich stream from another topic. We extend AssignerWithPeriodicWatermarks to assign watermarks and extract timestamp from the event and handle idle source partitions. AutoWatermarkInterval set to 5000L. The timestamp extr

Re: How to handle deletion of items using PyFlink SQL?

2022-06-14 Thread John Tipper
Yes, I’m interested in the best pattern to follow with SQL to allow for a downstream DB using the JDBC SQL connector to reflect the state of rows added and deleted upstream. So imagine there is a crawl event at t=C1 that happens with an associated timestamp and which finds resources A,B,C. Is i

Re: Flink operator deletes the FlinkDeplyoment after a while

2022-06-14 Thread Gyula Fóra
Hi Sigalit, This could be related to https://issues.apache.org/jira/browse/FLINK-27889 We have fixed this issue already (after the release), you could simply use the latest operator image from of `release-1.0: *ghcr.io/apache/flink-kubernetes-operator:cc8207c