Flink, Kafka -> Flink Data flow

2025-04-30 Thread George
Hi all, trying to stitch together the below. I first create a table in Flink, backed by a Hive catalog, with data sourced from Kafka, as per below. The create table worked... First, for reference, ts is a BIGINT value representing the timestamp of the event. CREATE OR REPLACE TABLE hive_catalog.…
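For reference, a minimal sketch of the kind of DDL being described, assuming a Hive catalog named hive_catalog is already registered, JSON values, and a millisecond epoch ts (database, topic, and broker names are hypothetical):

    import org.apache.flink.table.api.EnvironmentSettings;
    import org.apache.flink.table.api.TableEnvironment;

    TableEnvironment tEnv = TableEnvironment.create(EnvironmentSettings.inStreamingMode());
    // Assumes a Hive catalog named hive_catalog is already registered.
    tEnv.executeSql(
            "CREATE TABLE IF NOT EXISTS hive_catalog.db.events (" +
            "  id STRING," +
            "  ts BIGINT," +                                  // epoch millis, per the post
            "  event_time AS TO_TIMESTAMP_LTZ(ts, 3)," +      // BIGINT -> TIMESTAMP_LTZ(3)
            "  WATERMARK FOR event_time AS event_time - INTERVAL '5' SECOND" +
            ") WITH (" +
            "  'connector' = 'kafka'," +
            "  'topic' = 'events'," +
            "  'properties.bootstrap.servers' = 'broker:9092'," +
            "  'scan.startup.mode' = 'earliest-offset'," +
            "  'format' = 'json'" +
            ")");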

Re: Flink kafka source with OAuth authentication token refresh

2025-02-11 Thread Nikola Milutinovic
…Nix. From: Kamal Mittal via user. Date: Tuesday, February 11, 2025 at 12:17 PM. To: user@flink.apache.org. Subject: Flink kafka source with OAuth authentication token refresh. Hello, I am having a scenario where in the Flink Kafka source (org.apache.flink.connector.kafka.source.KafkaSource), after…

Flink kafka source with OAuth authentication token refresh

2025-02-11 Thread Kamal Mittal via user
Hello, I have a scenario where, in the Flink Kafka source (org.apache.flink.connector.kafka.source.KafkaSource), after fetching a Kafka record there is a need for OAuth authentication against a 3rd-party REST API. This API needs an authentication token and has an expiry time associated with…
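One possible shape for this, sketched under assumptions (no Flink-provided OAuth facility is implied; the endpoint and field names are hypothetical): cache the token in a rich function and refresh it shortly before expiry, so each record fetched from the Kafka source can call the 3rd-party REST API with a valid token.

    import java.time.Instant;
    import org.apache.flink.api.common.functions.RichMapFunction;

    // Cache the OAuth token and refresh it shortly before expiry, so every
    // record fetched from the Kafka source calls the REST API with a valid token.
    public class TokenRefreshingEnricher extends RichMapFunction<String, String> {
        private transient String token;
        private transient Instant expiresAt;

        @Override
        public String map(String record) throws Exception {
            if (token == null || Instant.now().isAfter(expiresAt.minusSeconds(60))) {
                refreshToken();   // re-authenticate a minute before the token lapses
            }
            return callThirdPartyApi(record, token);
        }

        private void refreshToken() {
            // Hypothetical: POST client credentials to the OAuth token endpoint,
            // parse access_token and expires_in from the response.
            this.token = "...";
            this.expiresAt = Instant.now().plusSeconds(3600);
        }

        private String callThirdPartyApi(String record, String bearerToken) {
            // Hypothetical REST call carrying "Authorization: Bearer <token>".
            return record;
        }
    }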

Re: Flink 1.20 and Flink Kafka Connector 3.2.0-1.19

2024-10-03 Thread Dominik.Buenzli
Data, Analytics & AI Engineer III. From: patricia lee. Date: Thursday, 3 October 2024 at 15:27. To: user@flink.apache.org. Subject: Flink 1.20 and Flink Kafka Connector 3.2.0-1.19. Hi, I have upgraded our project to Flink 1.20 and JDK 17. But I noticed ther…

Flink 1.20 and Flink Kafka Connector 3.2.0-1.19

2024-10-03 Thread patricia lee
Hi, I have upgraded our project to Flink 1.20 and JDK 17, but I noticed there is no Kafka connector for Flink 1.20. I currently use the existing versions, but there is an intermittent Kafka-related NoClassDefFoundError. Where can I get the Kafka connector for Flink 1.20? Thanks

Re: flink kafka sink batch mode delivery guarantees limitations

2024-08-23 Thread Nicolas Paris
Thanks, missed that warning! All right.

Re: flink kafka sink batch mode delivery guarantees limitations

2024-08-23 Thread Ahmed Hamdy
Hi Nicolas, could you elaborate on what you think is missing? I can see there is a warning that the EXACTLY_ONCE sink wouldn't operate: "It is important to remember that because there are no checkpoints, certain features such as CheckpointListener…"

flink kafka sink batch mode delivery guarantees limitations

2024-08-23 Thread Nicolas Paris
Hi, from my tests, the Kafka sink in exactly-once mode under the batch runtime will never commit the transaction, thus failing to honour the semantic. This is likely by design, since records are acked/committed during a checkpoint, which never happens in batch mode. Am I missing something, or should the documentation warn about this?
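The warning referenced in the replies above follows from EXACTLY_ONCE committing Kafka transactions only on checkpoints, which batch jobs never take. A minimal sketch of the usual fallback for batch pipelines (broker address and topic are hypothetical):

    import org.apache.flink.api.common.serialization.SimpleStringSchema;
    import org.apache.flink.connector.base.DeliveryGuarantee;
    import org.apache.flink.connector.kafka.sink.KafkaRecordSerializationSchema;
    import org.apache.flink.connector.kafka.sink.KafkaSink;

    // Transactions are committed on checkpoints, so EXACTLY_ONCE cannot make
    // progress in batch mode; AT_LEAST_ONCE only needs the producer to flush,
    // which batch execution can do.
    KafkaSink<String> sink = KafkaSink.<String>builder()
            .setBootstrapServers("broker:9092")   // hypothetical
            .setRecordSerializer(KafkaRecordSerializationSchema.builder()
                    .setTopic("output-topic")
                    .setValueSerializationSchema(new SimpleStringSchema())
                    .build())
            .setDeliveryGuarantee(DeliveryGuarantee.AT_LEAST_ONCE)
            .build();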

Re: Re: Re: Flink kafka connector for v 1.19.0

2024-05-17 Thread Niklas Wilcke
…Best Regards, Ahmed Hamdy. On Fri, 10 May 2024 at 18:14, Aniket Sule wrote: Hello, on the Flink downloads page, the latest stable…

Re: [EXTERNAL]Re: Flink kafka connector for v 1.19.0

2024-05-16 Thread Hang Ruan
…Aniket Sule wrote: Hello, on the Flink downloads page, the latest stable version is Flink 1.19.0. However, the Flink Kafka connector is v3.1.0, which is compatible with 1.18.x. Is there a timeline for when the Kafka connector fo…

Re: [EXTERNAL]Re: Flink kafka connector for v 1.19.0

2024-05-16 Thread Niklas Wilcke
…wrote: Hello, on the Flink downloads page, the latest stable version is Flink 1.19.0. However, the Flink Kafka connector is v3.1.0, which is compatible with 1.18.x. Is there a timeline for when the Kafka connector for v1.19 will be re…

Re: Flink kafka connector for v 1.19.0

2024-05-10 Thread Ahmed Hamdy
…at 18:14, Aniket Sule wrote: Hello, on the Flink downloads page, the latest stable version is Flink 1.19.0. However, the Flink Kafka connector is v3.1.0, which is compatible with 1.18.x. Is there a timeline for when the Kafka connector for v1.19 will be released?…

Flink kafka connector for v 1.19.0

2024-05-10 Thread Aniket Sule
Hello, on the Flink downloads page, the latest stable version is Flink 1.19.0. However, the Flink Kafka connector is v3.1.0, which is compatible with 1.18.x. Is there a timeline for when the Kafka connector for v1.19 will be released? Is it possible to use the v3.1.0 connector with Flink v1.19…

RE: RE: Flink Kafka Sink + Schema Registry + Message Headers

2024-02-01 Thread Kirti Dhar Upadhyay K via user
Thanks Jiabao and Yaroslav for your quick responses. Regards, Kirti Dhar. From: Yaroslav Tkachenko. Sent: 01 February 2024 21:42. Cc: user@flink.apache.org. Subject: Re: RE: Flink Kafka Sink + Schema Registry + Message Headers. The schema registry support is provided in…

Re: RE: Flink Kafka Sink + Schema Registry + Message Headers

2024-02-01 Thread Yaroslav Tkachenko
…correct Flink version? Also, any help on question 1 regarding Schema Registry? Regards, Kirti Dhar. -----Original Message----- From: Jiabao Sun. Sent: 01…

Re: RE: Flink Kafka Sink + Schema Registry + Message Headers

2024-02-01 Thread Yaroslav Tkachenko
…Also, any help on question 1 regarding Schema Registry? Regards, Kirti Dhar. -----Original Message----- From: Jiabao Sun. Sent: 01 February 2024 13:29. To: user@flink.apache.org. Subject: RE: Fli…

RE: RE: Flink Kafka Sink + Schema Registry + Message Headers

2024-02-01 Thread Jiabao Sun
…-----Original Message----- From: Jiabao Sun. Sent: 01 February 2024 13:29. To: user@flink.apache.org. Subject: RE: Flink Kafka Sink + Schema Registry + Message Headers. Hi Kirti, Kafka Sink supports sending messages with headers. You should implement a HeaderPr…

RE: Flink Kafka Sink + Schema Registry + Message Headers

2024-02-01 Thread Kirti Dhar Upadhyay K via user
From: Jiabao Sun. Sent: 01 February 2024 13:29. To: user@flink.apache.org. Subject: RE: Flink Kafka Sink + Schema Registry + Message Headers. Hi Kirti, Kafka Sink supports sending messages with headers. You should implement a HeaderProvider to extract headers from the input element. KafkaSink…

RE: Flink Kafka Sink + Schema Registry + Message Headers

2024-01-31 Thread Jiabao Sun
…return null; } }).build()).build(); Best, Jiabao. On 2024/02/01 07:46:38, Kirti Dhar Upadhyay K via user wrote: Hi Mates, I have below queries regarding Flink Kafka Sink.…
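A fuller sketch of the HeaderProvider approach quoted above, assuming the setHeaderProvider hook available in recent flink-connector-kafka releases (check the exact HeaderProvider signature for your connector version; the header name and value here are hypothetical):

    import java.nio.charset.StandardCharsets;
    import org.apache.flink.api.common.serialization.SimpleStringSchema;
    import org.apache.flink.connector.kafka.sink.KafkaRecordSerializationSchema;
    import org.apache.flink.connector.kafka.sink.KafkaSink;
    import org.apache.kafka.common.header.internals.RecordHeaders;

    KafkaSink<String> sink = KafkaSink.<String>builder()
            .setBootstrapServers("broker:9092")   // hypothetical
            .setRecordSerializer(KafkaRecordSerializationSchema.builder()
                    .setTopic("output-topic")
                    // Derive per-record Kafka headers from the element being written:
                    .setHeaderProvider(element -> new RecordHeaders()
                            .add("source", "flink".getBytes(StandardCharsets.UTF_8)))
                    .setValueSerializationSchema(new SimpleStringSchema())
                    .build())
            .build();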

Flink Kafka Sink + Schema Registry + Message Headers

2024-01-31 Thread Kirti Dhar Upadhyay K via user
Hi Mates, I have the below queries regarding Flink Kafka Sink. 1. Does Kafka Sink support a schema registry? If yes, is there any documentation on configuring it? 2. Does Kafka Sink support sending messages (ProducerRecord) with headers? Regards, Kirti Dhar

Re: [ANNOUNCE] Apache Flink Kafka Connectors 3.0.2 released

2023-12-04 Thread Martijn Visser
…fka/commit/6c3d3d06689336f2fd37bfa5a3b17a5377f07887 On Sat, Dec 2, 2023 at 1:57 AM Tzu-Li (Gordon) Tai wrote: The Apache Flink community is very happy to announce the release of Apache Flink Kafka Connectors 3.0.2. This release is compatible with the Apache Flink 1.17.x and 1.18.x release series.…

[ANNOUNCE] Apache Flink Kafka Connectors 3.0.2 released

2023-12-01 Thread Tzu-Li (Gordon) Tai
The Apache Flink community is very happy to announce the release of Apache Flink Kafka Connectors 3.0.2. This release is compatible with the Apache Flink 1.17.x and 1.18.x release series. Apache Flink® is an open-source stream processing framework for distributed, high-performing, always

[ANNOUNCE] Apache Flink Kafka Connectors 3.0.1 released

2023-10-31 Thread Tzu-Li (Gordon) Tai
The Apache Flink community is very happy to announce the release of Apache Flink Kafka Connectors 3.0.1. This release is compatible with the Apache Flink 1.17.x and 1.18.x release series. Apache Flink® is an open-source stream processing framework for distributed, high-performing, always

Re: Flink Kafka offset commit issues

2023-10-01 Thread elakiya udhayanan
Hi Hangxiang, thanks for providing me the steps to check if the checkpointing is getting triggered on failure recovery. I will follow them and respond back in case of any issues. Thanks, Elakiya. On Sat, Sep 30, 2023 at 2:34 PM Hangxiang Yu wrote: Hi, Elakiya. I think you could check:…

Re: Flink Kafka offset commit issues

2023-09-30 Thread Hangxiang Yu
Hi, Elakiya. I think you could check: 1. The TaskManager log, to figure out whether the job is restoring from an existing checkpoint and the restoring checkpoint path. 2. Or you could check the checkpoint ID when you restart your job (if not restoring from a checkpoint, it starts from…

Re: Flink Kafka offset commit issues

2023-09-28 Thread elakiya udhayanan
Hi Feng, thanks for your response. 1. We have configured checkpointing to upload to an s3 location, and we see metadata files getting created there. But we are unsure whether the job is getting triggered from that checkpoint in case of failure. Is there a possible way to test this? Also…

Re: Flink Kafka offset commit issues

2023-09-28 Thread Feng Jin
Hi Elakiya, 1. Can you confirm whether the checkpoint for the task has been triggered normally? 2. Also, if you stop the job, you need to use "STOP WITH SAVEPOINT" and specify the path to the savepoint when starting the Flink job for recovery. This is necessary to continue consuming from the historical…
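For point 1 above, a minimal sketch of a job configured so that a restart can actually find retained checkpoints (the bucket paths are hypothetical); the stop/resume commands for point 2 are noted in the comments:

    import org.apache.flink.streaming.api.environment.CheckpointConfig;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

    StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
    env.enableCheckpointing(60_000);   // checkpoint every 60s
    env.getCheckpointConfig().setCheckpointStorage("s3://bucket/checkpoints");   // hypothetical
    // Keep completed checkpoints after cancellation so a restart can use them:
    env.getCheckpointConfig().setExternalizedCheckpointCleanup(
            CheckpointConfig.ExternalizedCheckpointCleanup.RETAIN_ON_CANCELLATION);
    // For planned restarts, stop with a savepoint and resume from it, e.g.:
    //   flink stop --savepointPath s3://bucket/savepoints <jobId>
    //   flink run -s s3://bucket/savepoints/savepoint-xxxx app.jar   (paths hypothetical)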

Flink Kafka offset commit issues

2023-09-28 Thread elakiya udhayanan
Hi team, I have a Kafka topic named employee which uses confluent avro schema and will emit the payload as below: { "id": "emp_123456", "employee": { "id": "123456", "name": "sampleName" } } I am using the upsert-kafka connector to consume the events from the above Kafka topic as below using the F
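A hedged sketch of the kind of upsert-kafka DDL described above, assuming a Confluent schema registry for the value format (broker, registry URL, and key format are assumptions; the avro-confluent option name varies slightly across Flink versions, with older releases using value.avro-confluent.schema-registry.url):

    import org.apache.flink.table.api.EnvironmentSettings;
    import org.apache.flink.table.api.TableEnvironment;

    TableEnvironment tEnv = TableEnvironment.create(EnvironmentSettings.inStreamingMode());
    tEnv.executeSql(
            "CREATE TABLE employee_src (" +
            "  id STRING," +
            "  employee ROW<id STRING, name STRING>," +
            "  PRIMARY KEY (id) NOT ENFORCED" +   // upsert-kafka requires a primary key
            ") WITH (" +
            "  'connector' = 'upsert-kafka'," +
            "  'topic' = 'employee'," +
            "  'properties.bootstrap.servers' = 'broker:9092'," +
            "  'key.format' = 'raw'," +
            "  'value.format' = 'avro-confluent'," +
            "  'value.avro-confluent.url' = 'http://schema-registry:8081'" +
            ")");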

Re: Flink Kafka source getting marked as Idle

2023-06-17 Thread Anirban Dutta Gupta
…existing thread for my question. Actually we are also facing the issue of the Flink Kafka source stopping consuming messages completely. It only started consuming messages after we re-submitted the job. But this happened only once and now it is not getting reproduced at all.

Re: Flink Kafka source getting marked as Idle

2023-06-17 Thread Tzu-Li (Gordon) Tai
…reports a new watermark (e.g. when new data is produced). Best, Gordon. On Fri, Jun 16, 2023 at 7:10 AM Anirban Dutta Gupta <anir...@indicussoftware.com> wrote: Hello All, sorry to be replying to an existing thread for my question. Actually we are also facing the issue…

Flink Kafka source getting marked as Idle

2023-06-16 Thread Anirban Dutta Gupta
Hello All, Sorry to be replying to an existing thread for my question. Actually we are also facing the issue of the Flink Kafka source stopping consuming messages completely. It only started consuming messages after we re-submitted the Job. But this happened only once and now it is not

Re: Flink Kafka Source rebalancing - 1.14.2

2023-05-09 Thread Madan D via user
Hello Shammon/Team, we are using the same Flink version, 1.14.2; the only change is that earlier we were using the Flink Kafka consumer and now we are moving to Kafka Source. I don't see any difference in the job planner, but I see the Kafka source is introducing more latency while performing Flink Kafka…

Re: Flink Kafka Source rebalancing - 1.14.2

2023-05-09 Thread Shammon FY
Hi Madan, could you give the old and new versions of Flink and provide the job plan? I think it will help the community find the root cause. Best, Shammon FY. On Wed, May 10, 2023 at 2:04 AM Madan D via user wrote: Hello Team, we have been using Flink Kafka consumer and recentl…

Flink Kafka Source rebalancing - 1.14.2

2023-05-09 Thread Madan D via user
Hello Team, we have been using the Flink Kafka consumer and recently we have been moving to the Flink Kafka source to get more advanced features, but we have been observing more rebalances right after data is consumed and moved to the next operator than with the Flink Kafka consumer. Can you please let us know what…

Re: Certificate renewal for Flink Kafka connectors

2023-04-17 Thread Martijn Visser
Hi Prateek, you will need to stop and restart your jobs with the new connector configuration. Best regards, Martijn. On Thu, Apr 13, 2023 at 10:10 AM Prateek Kohli wrote: Hi, I am using Flink Kafka connectors to communicate with Kafka broker over mutual TLS. Is t…

Certificate renewal for Flink Kafka connectors

2023-04-13 Thread Prateek Kohli
Hi, I am using the Flink Kafka connectors to communicate with a Kafka broker over mutual TLS. Is there any way or recommendation to handle certificate renewal for these Kafka clients? I am monitoring the pem files and recreating the keystore/truststore (jks) on renewal, but how can I reload these to…
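For context, a sketch of where these stores are wired in: the Kafka client loads the keystore/truststore once at startup, which is why a renewed certificate only takes effect after a job restart, as the reply above says (paths and passwords are hypothetical):

    import org.apache.flink.api.common.serialization.SimpleStringSchema;
    import org.apache.flink.connector.kafka.source.KafkaSource;

    KafkaSource<String> source = KafkaSource.<String>builder()
            .setBootstrapServers("broker:9093")   // hypothetical TLS listener
            .setTopics("events")
            .setGroupId("mtls-consumer")
            .setValueOnlyDeserializer(new SimpleStringSchema())
            // Standard Kafka client SSL settings, read once at client startup;
            // a renewed JKS on disk is not re-read until the job restarts.
            .setProperty("security.protocol", "SSL")
            .setProperty("ssl.keystore.location", "/etc/certs/keystore.jks")
            .setProperty("ssl.keystore.password", "changeit")
            .setProperty("ssl.truststore.location", "/etc/certs/truststore.jks")
            .setProperty("ssl.truststore.password", "changeit")
            .build();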

Re: Utilizing Kafka headers in Flink Kafka connector

2022-10-13 Thread Shengkai Fang
Hi. You can use the SQL API to parse or write the header in the Kafka record [1] if you are using Flink SQL. Best, Shengkai. [1] https://nightlies.apache.org/flink/flink-docs-master/docs/connectors/table/kafka/#available-metadata. Yaroslav Tkachenko wrote on Thu, Oct 13, 2022 at 02:21: Hi, you can implemen…
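A minimal sketch of the metadata approach from [1]: the Kafka SQL connector exposes a readable headers metadata column of type MAP<STRING, BYTES> (topic, broker, and header key here are hypothetical):

    import org.apache.flink.table.api.EnvironmentSettings;
    import org.apache.flink.table.api.TableEnvironment;

    TableEnvironment tEnv = TableEnvironment.create(EnvironmentSettings.inStreamingMode());
    tEnv.executeSql(
            "CREATE TABLE events (" +
            "  payload STRING," +
            "  headers MAP<STRING, BYTES> METADATA" +   // header map exposed per [1]
            ") WITH (" +
            "  'connector' = 'kafka'," +
            "  'topic' = 'events'," +
            "  'properties.bootstrap.servers' = 'broker:9092'," +
            "  'properties.group.id' = 'header-demo'," +
            "  'scan.startup.mode' = 'earliest-offset'," +
            "  'format' = 'raw'" +
            ")");
    // Filter on a header value:
    tEnv.executeSql("SELECT payload FROM events WHERE headers['source'] IS NOT NULL");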

Re: Utilizing Kafka headers in Flink Kafka connector

2022-10-12 Thread Yaroslav Tkachenko
Hi, you can implement a custom KafkaRecordDeserializationSchema (example: https://docs.immerok.cloud/docs/cookbook/reading-apache-kafka-headers-with-apache-flink/#the-custom-deserializer) and just avoid emitting the record if the header value matches what you need. On Wed, Oct 12, 2022 at 11:04 AM…

Utilizing Kafka headers in Flink Kafka connector

2022-10-12 Thread Great Info
I have some Flink applications that read streams from Kafka. Now the producer-side code has introduced some additional information in Kafka headers while producing records, and I need to change my consumer-side logic to process a record only if the header contains a specific value; if the header valu…
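A sketch of the custom-deserializer approach suggested above: records whose header does not match are simply never emitted, so they are filtered out at the source (the header name and match value are hypothetical):

    import java.io.IOException;
    import java.nio.charset.StandardCharsets;
    import org.apache.flink.api.common.typeinfo.TypeInformation;
    import org.apache.flink.connector.kafka.source.reader.deserializer.KafkaRecordDeserializationSchema;
    import org.apache.flink.util.Collector;
    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.common.header.Header;

    public class HeaderFilteringDeserializer
            implements KafkaRecordDeserializationSchema<String> {

        @Override
        public void deserialize(ConsumerRecord<byte[], byte[]> record, Collector<String> out)
                throws IOException {
            // Only emit records whose "event-type" header equals "order":
            Header h = record.headers().lastHeader("event-type");
            if (h != null && "order".equals(new String(h.value(), StandardCharsets.UTF_8))) {
                out.collect(new String(record.value(), StandardCharsets.UTF_8));
            }
        }

        @Override
        public TypeInformation<String> getProducedType() {
            return TypeInformation.of(String.class);
        }
    }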

Flink Kafka Consumer performance issue

2022-10-02 Thread Xin Ma
Hi, (flink version 1.14.2, kafka version 2.6.1) I have a flink job consuming kafka and simply sinking the data into s3. The kafka consumer is sometimes delayed on a few partitions. The partitions are evenly registered by flink subtasks. I found there was a correlation between kafka consumer fet

Re: Flink kafka producer using the same transactional id twice?

2022-09-05 Thread Alexander Fedulov
…I then noticed this message showing up twice and thought "this does not look right": That's fine, this is how the sink works (see the comment here: KafkaWriter.java#L294-L301…

Flink kafka producer using the same transactional id twice?

2022-09-05 Thread Sebastian Struss
Hi all, I am quite new to flink and kafka, so I might mix something up here. The situation is that we have a flink application (1.14.5 with scala 2.12) running for a few hours to days, and suddenly it stops working and can't publish to kafka anymore. I then noticed this message showing up twice…

Re: (Possible) bug in flink-kafka-connector (metrics rewriting)

2022-08-02 Thread zhanghao.chen
…essing. issues.apache.org. Best, Zhanghao Chen. From: Valentina Predtechenskaya. Sent: Wednesday, August 3, 2022 1:32. To: user@flink.apache.org. Subject: (Possible) bug in flink-kafka-connector (metrics rewriting). Hello! I would like to report a bug with m…

(Possible) bug in flink-kafka-connector (metrics rewriting)

2022-08-02 Thread Valentina Predtechenskaya
…/apache/flink/blob/master/flink-connectors/flink-connector-kafka/src/main/java/org/apache/flink/connector/kafka/sink/KafkaWriter.java#L359-L360. I have debugged these libraries a lot and I am sure of that behavior. If, for example, you patch flink-kafka-connector with a condition not to initialize metr…

Re: Weird Flink Kafka source watermark behavior

2022-04-13 Thread Jin Yi
"…issue happens even if I use an idle watermark." You would expect to see glitches with watermarking when you enable idleness. Idleness sort of trades watermark correctness for reduced latency…

Re: Weird Flink Kafka source watermark behavior

2022-04-13 Thread Qingsheng Ren
…sort of trades watermark correctness for reduced latency when processing timers (much simplified). With idleness enabled you have no guarantees whatsoever as to the quality of watermarks (which might be ok in some cases). BTW…

Re: Weird Flink Kafka source watermark behavior

2022-04-13 Thread Qingsheng Ren
…enabling idleness would break everything. Oversight put aside, things should work the way you implemented it. One thing I could imagine to be a cause is • that over time the k…

Re: Flink Kafka Issue with EXACTLY_ONCE semantics

2022-04-11 Thread Frank Dekervel
Hello Praneeth, that looks correct then. In that case, maybe somebody else can chime in. Are you aware of this post on SO: https://stackoverflow.com/questions/45047876/apache-kafka-exactly-once-implementation-not-sending-messages? Frank. On Sat, Apr 9, 2022 at 7:42 PM Praneeth Ramesh wrote:…

Re: Flink Kafka Issue with EXACTLY_ONCE semantics

2022-04-09 Thread Praneeth Ramesh
Hi Frank, thanks for the response. I see that the min.isr value is 2 and the replication factor is 4 in my case. Do you see any issues with these values? Thank you in advance. On Fri, Apr 8, 2022 at 11:28 PM Frank Dekervel wrote: Hello, check if your topic replication factor is not below mi…

Re: Flink Kafka Issue with EXACTLY_ONCE semantics

2022-04-08 Thread Frank Dekervel
Hello, check if your topic replication factor is not below the min.isr setting of Kafka. I had the same problem and that was it for me. Frank. On Sat, 9 Apr 2022 at 04:01, Praneeth Ramesh wrote: Hi All, I have a job which reads from kafka and applies some transactions and writes the data back t…

Re: Weird Flink Kafka source watermark behavior

2022-04-08 Thread Martijn Visser
"…The same issue happens even if I use an idle watermark." You would expect to see glitches with watermarking when you enable idleness. Idleness sort o…

Re: Weird Flink Kafka source watermark behavior

2022-04-08 Thread Jin Yi
…latency when processing timers (much simplified). With idleness enabled you have no guarantees whatsoever as to the quality of watermarks (which might be ok in some cases). BTW we dominantly use a mix of fast and slow sources (that only update once a da…

Re: Weird Flink Kafka source watermark behavior

2022-04-08 Thread Jin Yi
…sources (that only update once a day) with hand-pimped watermarking and late event processing, and enabling idleness would break everything. Oversight put aside, things should work the way you implemented it. One thing I c…

Re: Weird Flink Kafka source watermark behavior

2022-04-08 Thread Qingsheng Ren
…consumer subtasks, which would probably stress correct recalculation of watermarks. Hence #partitions == #subtasks might reduce the problem. • Can you enable logging of partition-consumer assignment, to see if that is the cause of the problem? • Also involuntary r…

Re: Flink kafka consumer disconnection, application processing stays behind

2022-03-25 Thread Qingsheng Ren
Hi Isidoros, your watermark strategy looks fine to me. I'm not quite sure if it is related. Best regards, Qingsheng. On Mar 24, 2022, at 21:11, Isidoros Ioannou wrote: Hi Qingsheng, thank you a lot for your response. The message I see from the consumer before the log exception I…

Re: Flink kafka consumer disconnection, application processing stays behind

2022-03-24 Thread Isidoros Ioannou
Hi Qingsheng, thank you a lot for your response. The message I see from the consumer before the log exception I provided previously is this: "locationInformation": "org.apache.kafka.clients.NetworkClient.handleTimedOutRequests(NetworkClient.java:778)", "logger": "org.apache.kafka.clients.Networ…

Re: Flink kafka consumer disconnection, application processing stays behind

2022-03-24 Thread Qingsheng Ren
Hi Isidoros, I'm not sure how the timeout and the high backpressure are related, but I think we can try to resolve the request timeout issue first. You can take a look at the request log on the Kafka broker and see if the request was received by the broker, and how long it takes for b…

Flink kafka consumer disconnection, application processing stays behind

2022-03-23 Thread Isidoros Ioannou
Hi, we are running Flink 1.13.2 on Kinesis Analytics. Our source is a Kafka topic with one partition so far, and we are using the FlinkKafkaConsumer (kafka-connector-1.13.2). Sometimes we get errors from the consumer like the below: "locationInformation":"org.apache.kafka.clients.FetchS…

Re: Weird Flink Kafka source watermark behavior

2022-03-18 Thread Dongwon Kim
…involuntary restarts of the job can cause havoc as this resets watermarking. I'll be off next week, unable to take part in the active discussion… Sincere greetings, Thias. From: Dan Hill…

RE: Weird Flink Kafka source watermark behavior

2022-03-18 Thread Schwalbe Matthias
Oops, mistyped your name, Dan. From: Schwalbe Matthias. Sent: Friday, 18 March 2022 09:02. To: 'Dan Hill'; Dongwon Kim. Cc: user. Subject: RE: Weird Flink Kafka source watermark behavior. Hi San, Dongwon, I share the opinion that when per-partition watermarking is enabled, you shoul…

RE: Weird Flink Kafka source watermark behavior

2022-03-18 Thread Schwalbe Matthias
…ongwon Kim. Cc: user. Subject: Re: Weird Flink Kafka source watermark behavior. I'll try forcing # source tasks = # partitions tomorrow. Thank you, Dongwon, for all of your help! On Fri, Mar 18, 2022 at 12:20 AM Dongwon Kim…

Re: Weird Flink Kafka source watermark behavior

2022-03-18 Thread Dan Hill
I'll try forcing # source tasks = # partitions tomorrow. Thank you, Dongwon, for all of your help! On Fri, Mar 18, 2022 at 12:20 AM Dongwon Kim wrote: I believe your job with per-partition watermarking should be working okay even in a backfill scenario. BTW, is the problem still observe…

Re: Weird Flink Kafka source watermark behavior

2022-03-18 Thread Dongwon Kim
I believe your job with per-partition watermarking should be working okay even in a backfill scenario. BTW, is the problem still observed even with # source tasks = # partitions? For committers: is there a way to confirm that per-partition watermarking is used in the TM log? On Fri, Mar 18, 2022 at 4:…

Re: Weird Flink Kafka source watermark behavior

2022-03-18 Thread Dan Hill
I hit this using event processing and no idleness detection. The same issue happens if I enable idleness. My code matches the code example for per-partition watermarking

Re: Weird Flink Kafka source watermark behavior

2022-03-18 Thread Dongwon Kim
Hi Dan, I'm quite confused, as you already use per-partition watermarking. What I meant in the reply is: - If you don't use per-partition watermarking, # tasks < # partitions can cause the problem for backfill jobs. - If you don't use per-partition watermarking, # tasks = # partitions is going to b…

Re: Weird Flink Kafka source watermark behavior

2022-03-17 Thread Dan Hill
Thanks Dongwon! Wow. Yes, I'm using per-partition watermarking [1]. Yes, my # source tasks < # kafka partitions. This should be called out in the docs, or the bug should be fixed. On Thu, Mar 17, 2022 at 10:54 PM Dongwon Kim wrote: Hi Dan, do you use the per-partition watermarking expla…

Re: Weird Flink Kafka source watermark behavior

2022-03-17 Thread Dongwon Kim
Hi Dan, do you use the per-partition watermarking explained in [1]? I've also experienced a similar problem when running backfill jobs, specifically when # source tasks < # kafka partitions. - When # source tasks = # kafka partitions, the backfill job works as expected. - When # source tasks < # ka…

Re: Weird Flink Kafka source watermark behavior

2022-03-17 Thread Dan Hill
I'm following the example from this section: https://nightlies.apache.org/flink/flink-docs-master/docs/dev/datastream/event-time/generating_watermarks/#watermark-strategies-and-the-kafka-connector. On Thu, Mar 17, 2022 at 10:26 PM Dan Hill wrote: Other points: I'm using the kafka timestamp a…
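For reference, a minimal sketch in the spirit of that docs section, using the newer KafkaSource (with the legacy FlinkKafkaConsumer the equivalent is assignTimestampsAndWatermarks on the consumer itself); broker, topic, and group id are hypothetical:

    import java.time.Duration;
    import org.apache.flink.api.common.eventtime.WatermarkStrategy;
    import org.apache.flink.api.common.serialization.SimpleStringSchema;
    import org.apache.flink.connector.kafka.source.KafkaSource;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

    StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
    KafkaSource<String> source = KafkaSource.<String>builder()
            .setBootstrapServers("broker:9092")   // hypothetical
            .setTopics("events")
            .setGroupId("backfill-job")
            .setValueOnlyDeserializer(new SimpleStringSchema())
            .build();

    // The strategy passed to fromSource() is evaluated per split, i.e. per
    // Kafka partition; the Kafka record timestamp is used as event time
    // unless a timestamp assigner is configured.
    env.fromSource(
            source,
            WatermarkStrategy
                    .<String>forBoundedOutOfOrderness(Duration.ofSeconds(20))
                    .withIdleness(Duration.ofMinutes(1)),   // optional; see caveats in this thread
            "kafka-source");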

Re: Weird Flink Kafka source watermark behavior

2022-03-17 Thread Dan Hill
Other points: - I'm using the kafka timestamp as event time. - The same issue happens even if I use an idle watermark. On Thu, Mar 17, 2022 at 10:17 PM Dan Hill wrote: There are 12 Kafka partitions (to keep the structure similar to other low-traffic environments). On Thu, Mar 17, 2022 at…

Re: Weird Flink Kafka source watermark behavior

2022-03-17 Thread Dan Hill
There are 12 Kafka partitions (to keep the structure similar to other low-traffic environments). On Thu, Mar 17, 2022 at 10:13 PM Dan Hill wrote: Hi. I'm running a backfill from a kafka topic with very few records spread across a few days. I'm seeing a case where the records coming from…

Weird Flink Kafka source watermark behavior

2022-03-17 Thread Dan Hill
Hi. I'm running a backfill from a kafka topic with very few records spread across a few days. I'm seeing a case where the records coming from a kafka source have a watermark that's more recent (by hours) than the event time. I haven't seen this before when running this. This violates what I'd as

Re: Exception by flink kafka

2021-09-20 Thread Nicolaus Weidner
Hi Ragini, on Fri, Sep 17, 2021 at 1:40 PM Ragini Manjaiah wrote: "Later I started encountering org.apache.kafka.common.errors.TimeoutException: Failed to update metadata after 6 ms." This message can have several causes: there may be network issues, your Kafka configuration might be br…

Exception by flink kafka

2021-09-17 Thread Ragini Manjaiah
Hi, in what scenarios do we hit java.lang.OutOfMemoryError: Java heap space while publishing to kafka? I hit this exception, and as a resolution added the property .setProperty("security.protocol","SSL"); in the flink application. Later I started encountering org.apache.kafka.common.errors.Timeou…

Re: Flink Kafka Partition Discovery Never Forgets Deleted Partitions

2021-09-15 Thread Seth Wiesman
I just want to chime in that if you really do need to drop a partition, Flink already supports a solution. If you manually stop the job with a savepoint and restart it with a new UID on the source operator, along with passing the --allowNonRestoredState flag to the client, the source will disregar
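A sketch of that recipe (shown with the newer KafkaSource; the same uid trick applies to a FlinkKafkaConsumer source; broker, pattern, and paths are hypothetical):

    import java.util.regex.Pattern;
    import org.apache.flink.api.common.eventtime.WatermarkStrategy;
    import org.apache.flink.api.common.serialization.SimpleStringSchema;
    import org.apache.flink.connector.kafka.source.KafkaSource;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

    StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
    KafkaSource<String> source = KafkaSource.<String>builder()
            .setBootstrapServers("broker:9092")              // hypothetical
            .setTopicPattern(Pattern.compile("tenant-.*"))
            .setGroupId("pattern-consumer")
            .setValueOnlyDeserializer(new SimpleStringSchema())
            .build();

    // A fresh uid makes the source's old savepoint state "non-restored", so
    // restarting with --allowNonRestoredState drops the deleted partitions:
    //   flink stop --savepointPath s3://bucket/savepoints <jobId>   (hypothetical path)
    //   flink run -s <savepoint> --allowNonRestoredState app.jar
    env.fromSource(source, WatermarkStrategy.noWatermarks(), "kafka")
            .uid("kafka-source-v2");   // was e.g. "kafka-source-v1"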

Re: Flink Kafka Partition Discovery Never Forgets Deleted Partitions

2021-09-15 Thread David Morávek
I'll try to be more direct with the answer, as you already have the context on what the issue is. When this happens we basically have these options: 1) We can throw an exception (with good wording, so the user knows what's going on) and fail the job. This forces the user to take immediate action and fi…

Re: Flink Kafka Partition Discovery Never Forgets Deleted Partitions

2021-09-14 Thread Constantinos Papadopoulos
Thanks David. What you are saying makes sense. But I keep hearing I shouldn't delete the topic externally, and I keep asking why Flink doesn't forget about the topic IF it has in fact been deleted externally (for whatever reason). I think I will drop this now. On Tue, Sep 14, 2021 at 5:50 PM Dav…

Re: Flink Kafka Partition Discovery Never Forgets Deleted Partitions

2021-09-14 Thread David Morávek
We are basically describing the same thing with Fabian, just a different wording. The problem is that if you delete the topic externally, you're making an assumption that downstream processor (Flink in this case) has already consumed and RELIABLY processed all of the data from that topic (which ma

Re: Flink Kafka Partition Discovery Never Forgets Deleted Partitions

2021-09-14 Thread Constantinos Papadopoulos
Hi all, Thank you for the replies, they are much appreciated. I'm sure I'm missing something obvious here, so bear with me... Fabian, regarding: "Flink will try to recover from the previous checkpoint which is invalid by now because the partition is not available anymore." The above would happ

Re: Flink Kafka Partition Discovery Never Forgets Deleted Partitions

2021-09-14 Thread Jan Lukavský
On 9/14/21 3:57 PM, David Morávek wrote: Hi Jan, notion of completeness is just one part of the problem. The second part is that once you remove the Kafka topic, you are no longer able to replay the data in case of failure. So you basically need the following workflow to ensure correctness: 1…

Re: Flink Kafka Partition Discovery Never Forgets Deleted Partitions

2021-09-14 Thread David Morávek
Hi Jan, notion of completeness is just one part of the problem. The second part is that once you remove the Kafka topic, you are no longer able to replay the data in case of failure. So you basically need the following workflow to ensure correctness: 1) Wait until there are no more elements in the…

Re: Flink Kafka Partition Discovery Never Forgets Deleted Partitions

2021-09-14 Thread Jan Lukavský
Hi, just out of curiosity, would this problem be solvable by the ability to remove partitions that declare they do not contain more data (watermark reaching the end of the global window)? There is probably another problem in that a topic can be recreated after being deleted, which could result in w…

Re: Flink Kafka Partition Discovery Never Forgets Deleted Partitions

2021-09-14 Thread Fabian Paul
Hi Constantinos, I agree with David that it is not easily possible to remove a partition while a Flink job is running. Imagine the following scenario: Your Flink job initially works on 2 partitions belonging to two different topics and you have checkpointing enabled to guarantee exactly-once de

Re: Flink Kafka Partition Discovery Never Forgets Deleted Partitions

2021-09-14 Thread Constantinos Papadopoulos
Thank you for your answer David, which is a confirmation of what we see in the Flink code. A few thoughts below: "as this may easily lead to a data loss" Removing a topic/partition can indeed lead to data loss if not done carefully. However, *after* the topic has been deleted, I believe it woul

Re: Flink Kafka Partition Discovery Never Forgets Deleted Partitions

2021-09-14 Thread David Morávek
Hi Constantinos, the partition discovery doesn't support topic/partition removal, as this may easily lead to data loss (partition removal is not even supported by Kafka, for the same reason). Dynamically adding and removing partitions as part of business logic is just not how Kafka is designed…

Flink Kafka Partition Discovery Never Forgets Deleted Partitions

2021-09-14 Thread Constantinos Papadopoulos
We are on Flink 1.12.1, we initialize our FlinkKafkaConsumer with a topic name *pattern*, and we have partition discovery enabled. When our product scales up, it adds new topics. When it scales down, it removes topics. The problem is that the FlinkKafkaConsumer never seems to forget partitions th

Re: Topic assignment across Flink Kafka Consumer

2021-08-04 Thread Prasanna kumar
Robert, we are checking using the metric flink_taskmanager_job_task_operator_KafkaConsumer_assigned_partitions{jobname="SPECIFICJOBNAME"}. This metric gives the number of partitions assigned to each task (kafka consumer operator). Prasanna. On Wed, Aug 4, 2021 at 8:59 PM Robert Metzger wrote:…

Re: Topic assignment across Flink Kafka Consumer

2021-08-04 Thread Robert Metzger
Hi Prasanna, how are you checking the assignment of Kafka partitions to the consumers? The FlinkKafkaConsumer doesn't have a rebalance() method; this is a generic concept of the DataStream API. Is it possible that you are somehow partitioning your data in your Flink job, and this is causing the d…

Re: Topic assignment across Flink Kafka Consumer

2021-08-04 Thread Prasanna kumar
Robert, when we apply a rebalance method to the kafka consumer, it is assigning partitions of various topics evenly. But my only concern is that the rebalance method might have a performance impact. Thanks, Prasanna. On Wed, Aug 4, 2021 at 5:55 PM Prasanna kumar wrote: Robert, Flink ve…

Re: Topic assignment across Flink Kafka Consumer

2021-08-04 Thread Prasanna kumar
Robert, Flink version 1.12.2, Flink Kafka connector version 2.12. The partitions are assigned equally if we are reading from a single topic. Our use case is to read from multiple topics [matched via a regex pattern]; we use 6 topics and 1 partition per topic for this job. In this case, a few of the k…

Re: Topic assignment across Flink Kafka Consumer

2021-07-20 Thread Robert Metzger
Hi Prasanna, which Flink version and Kafka connector are you using (the "KafkaSource" or "FlinkKafkaConsumer")? The partition assignment for the FlinkKafkaConsumer is defined here: https://github.com/apache/flink/blob/master/flink-connectors/flink-connector-kafka/src/main/java/org/apache/flink/st…

Topic assignment across Flink Kafka Consumer

2021-07-19 Thread Prasanna kumar
Hi, we have a Flink job reading from multiple Kafka topics based on a regex pattern. What we have found is that the topics are not shared between the kafka consumers in an even manner. For example, if there are 8 topics and 4 kafka consumer operators: 1 consumer is assigned 6 topics, 2 consume…
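For reference, a sketch of the rebalance() approach discussed in the replies above; note that it redistributes records downstream rather than changing which subtask reads which topic/partition (broker, pattern, and group id are hypothetical):

    import java.util.Properties;
    import java.util.regex.Pattern;
    import org.apache.flink.api.common.serialization.SimpleStringSchema;
    import org.apache.flink.streaming.api.datastream.DataStream;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
    import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer;

    StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
    Properties props = new Properties();
    props.setProperty("bootstrap.servers", "broker:9092");   // hypothetical
    props.setProperty("group.id", "topic-pattern-job");

    DataStream<String> stream = env
            .addSource(new FlinkKafkaConsumer<>(
                    Pattern.compile("events-.*"), new SimpleStringSchema(), props))
            // rebalance() round-robins records to downstream subtasks, smoothing
            // skew from uneven topic-to-consumer assignment at the cost of a
            // network shuffle; partition-to-subtask reading is unchanged.
            .rebalance();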

Re: Apache Flink Kafka Connector not found Error

2021-07-19 Thread Taimoor Bhatti
…2021 5:23 PM. To: Taimoor Bhatti; user@flink.apache.org. Subject: Re: Apache Flink Kafka Connector not found Error. Hi Taimoor, it seems sometimes IntelliJ does not work well with the index; perhaps you could choose mvn -> reimport project from the context menu. If it still does not work, perhaps you mi…

Re: Apache Flink Kafka Connector not found Error

2021-07-19 Thread Yun Gao
…Yun. From: Taimoor Bhatti. Send Time: 2021 Jul. 19 (Mon.) 23:03. To: user@flink.apache.org; Yun Gao. Subject: Re: Apache Flink Kafka Connector not found Error. Hello Yun, many thanks for the reply... For some reason I'm not able t…

Re: Apache Flink Kafka Connector not found Error

2021-07-19 Thread Taimoor Bhatti
…import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer. "mvn clean compile" works however... (Thanks...) Do you know why IntelliJ doesn't see the import? Best, Taimoor. From: Yun Gao. Sent: Monday, July 19, 2021 3:25 PM. To: Taimoor Bhatti; user@flink.apache.org. Subject: Re: Apache F…

Re: Apache Flink Kafka Connector not found Error

2021-07-19 Thread Yun Gao
…org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer. Best, Yun. From: Taimoor Bhatti. Send Time: 2021 Jul. 19 (Mon.) 18:43. To: user@flink.apache.org. Subject: Apache Flink Kafka Connector not found Error. I'm having some trouble with…

Apache Flink Kafka Connector not found Error

2021-07-19 Thread Taimoor Bhatti
I'm having some trouble using the Flink DataStream API with the Kafka Connector. There don't seem to be great resources on the internet which explain the issue I'm having. My project is here: https://github.com/sysarcher/flink-scala-tests. I want to... I'm unable to use FlinkKafkaConsumer…

Re: How to register customized serializer for flink kafka format type

2021-07-13 Thread Piotr Nowojski
Hi, It's mentioned in the docs [1], but unfortunately this is not very well documented in 1.10. In short you have to provide a custom implementation of a `DeserializationSchemaFactory`. Please look at the built-in factories for examples of how it can be done. In newer versions it's both easier an

How to register customized serializer for flink kafka format type

2021-07-08 Thread Chenzhiyuan(HR)
I create a table as below, and the data is from kafka. I want to deserialize the json message to a POJO object. But the message format is not avro or simple json, so I need to know how to register a customized serializer and use it for the 'format.type' property. By the way, my flink version is 1.10.0.

Re: Flink + Kafka Dynamic Table

2021-07-07 Thread Arvid Heise
…is table E. When record a1 in table A changes to a1’, corresponding results in tables D and E should also be changed. I am considering using a flink + kafka dynamic table to solve this. But there’s a potential problem: records in the medical field must not be lost.…

Flink + Kafka Dynamic Table

2021-07-05 Thread vtygoss
…table E. When record a1 in table A changes to a1’, corresponding results in tables D and E should also be changed. I am considering using a flink + kafka dynamic table to solve this. But there’s a potential problem: - records in the medical field must not be lost. - kafka events are append-only, and kafka…
