Re: Re: KafkaSource consumer group

2023-03-31 Thread Andrew Otto
Hi, FWIW, I asked a similar question here: https://lists.apache.org/thread/1f01zo1lqcmhvosptpjlm6k3mgx0sv1m :) On Fri, Mar 31, 2023 at 3:57 AM Roberts, Ben (Senior Developer) via user < user@flink.apache.org> wrote: > Hi Gordon, > > Thanks for the reply! > I think that makes sense. > > The rea

RE: Re: KafkaSource consumer group

2023-03-31 Thread Roberts, Ben (Senior Developer) via user
Hi Gordon, Thanks for the reply! I think that makes sense. The reason for investigating is that generally we run our production workloads across 2 kubernetes clusters (each in a different cloud region) for availability reasons. So for instance requests to web apps are load balanced between ser

Re: KafkaSource consumer group

2023-03-30 Thread Tzu-Li (Gordon) Tai
Sorry, I meant to say "Hi Ben" :-) On Thu, Mar 30, 2023 at 9:52 AM Tzu-Li (Gordon) Tai wrote: > Hi Robert, > > This is a design choice. Flink's KafkaSource doesn't rely on consumer > groups for assigning partitions / rebalancing / offset tracking. It > manually assigns whatever partitions are in

Re: KafkaSource consumer group

2023-03-30 Thread Tzu-Li (Gordon) Tai
Hi Robert, This is a design choice. Flink's KafkaSource doesn't rely on consumer groups for assigning partitions / rebalancing / offset tracking. It manually assigns whatever partitions are in the specified topic across its consumer instances, and rebalances only when the Flink job / KafkaSink is

Re: KafkaSource and Event Time in Message Payload

2022-12-15 Thread Niklas Wilcke
Hi Martijn, thanks for referencing both related FLIPs and providing a recommendation. That was already helpful. I need to find some time to further investigate this topic. So far I agree that it might be the most reasonable approach to just use the Kafka timestamp to carry the event time. Than

Re: KafkaSource and Event Time in Message Payload

2022-12-13 Thread Martijn Visser
Hi Niklas, On your confirmations: a1) The default behaviour is documented at https://nightlies.apache.org/flink/flink-docs-master/docs/connectors/datastream/kafka/#event-time-and-watermarks - Flink uses the timestamp embedded in the Kafka ConsumerRecord. a2) I'm not 100% sure: since Flink 1.15, t

Re: KafkaSource, consumer assignment, and Kafka ‘stretch’ clusters for multi-DC streaming apps

2022-10-12 Thread Mason Chen
Hi Andrew and Martijn, Thanks for looping me in, this is an interesting discussion! I'm trying to solve a higher level problem about Kafka topic routing/assignment with FLIP-246. The main idea is that there can exist an external service that can provide the coordination between Kafka and Flink to

Re: KafkaSource, consumer assignment, and Kafka ‘stretch’ clusters for multi-DC streaming apps

2022-10-05 Thread Martijn Visser
Hi Andrew, While definitely no expert on this topic, my first thought was if this idea could be solved with the idea that was proposed in FLIP-246 https://cwiki.apache.org/confluence/display/FLINK/FLIP-246%3A+Multi+Cluster+Kafka+Source I'm also looping in Mason Chen who was the initiator of that

Re: KafkaSource, consumer assignment, and Kafka ‘stretch’ clusters for multi-DC streaming apps

2022-10-05 Thread Andrew Otto
(Ah, note that I am considering very simple streaming apps here, e.g. event enrichment apps. No windowing or complex state. The only state is the Kafka offsets, which I suppose would also have to be managed from Kafka, not from Flink state.) On Wed, Oct 5, 2022 at 9:54 AM Andrew Otto wrote:

Re: KafkaSource vs FlinkKafkaConsumer010

2022-02-01 Thread David Anderson
Before Kafka introduced their universal client, Flink had version-specific connectors, e.g., for versions 0.8, 0.9, 0.10, and 0.11. Those were eventually removed in favor of FlinkKafkaConsumer, which is/was backward compatible back to Kafka version 0.10. FlinkKafkaConsumer itself was deprecated in

Re: KafkaSource vs FlinkKafkaConsumer010

2022-02-01 Thread Francesco Guardiani
I think the FlinkKakfaConsumer010 you're talking about is the old source api. You should use only KafkaSource now, as they use the new source infrastructure. On Tue, Feb 1, 2022 at 9:02 AM HG wrote: > Hello Francesco > Perhaps I copied the wrong link of 1.2. > But there is also > https://nightli

Re: KafkaSource vs FlinkKafkaConsumer010

2022-02-01 Thread HG
Hello Francesco Perhaps I copied the wrong link of 1.2. But there is also https://nightlies.apache.org/flink/flink-docs-release-1.4/api/java/org/apache/flink/streaming/connectors/kafka/FlinkKafkaConsumer010.html It seems there are 2 ways to use Kafka KafkaSource source = KafkaSource.builder()

Re: KafkaSource vs FlinkKafkaConsumer010

2022-01-31 Thread Francesco Guardiani
The latter link you posted refers to a very old flink release. You shold use the first link, which refers to latest release FG On Tue, Feb 1, 2022 at 8:20 AM HG wrote: > Hello all > > I am confused. > What is the difference between KafkaSource as defined in : > https://nightlies.apache.org/flin

Re: KafkaSource builder and checkpointing with parallelism > kafka partitions

2021-09-16 Thread Lars Skjærven
Thanks for the feedback. > May I ask why you have less partitions than the parallelism? I would be happy to learn more about your use-case to better understand the > motivation. The use case is that topic A, contains just a few messages with product metadata that rarely gets updated, while topic

Re: KafkaSource builder and checkpointing with parallelism > kafka partitions

2021-09-16 Thread Fabian Paul
Hi all, The problem you are seeing Lars is somewhat intended behaviour, unfortunately. With the batch/stream unification every Kafka partition is treated as kind of workload assignment. If one subtask receives a signal that there is no workload anymore it goes into the FINISHED state. As alread

Re: KafkaSource builder and checkpointing with parallelism > kafka partitions

2021-09-15 Thread David Morávek
If we are shutting down any sources of unbounded jobs that run on Flink versions without FLIP-147 (available in 1.14) [1], that Matthias has mentioned, than it's IMO a bug, because it effectively breaks checkpointing. Fabian, can you please verify whether this is an intended behavior? In the meant

Re: KafkaSource builder and checkpointing with parallelism > kafka partitions

2021-09-15 Thread Lars Skjærven
Got it. So the workaround for now (1.13.2) is to fall back to FlinkKafkaConsumer if I read you correctly. Thanks L On Wed, Sep 15, 2021 at 2:58 PM Matthias Pohl wrote: > Hi Lars, > I guess you are looking > for execution.checkpointing.checkpoints-after-tasks-finish.enabled [1]. > This configurat

Re: KafkaSource builder and checkpointing with parallelism > kafka partitions

2021-09-15 Thread Matthias Pohl
Hi Lars, I guess you are looking for execution.checkpointing.checkpoints-after-tasks-finish.enabled [1]. This configuration parameter is going to be introduced in the upcoming Flink 1.14 release. Best, Matthias [1] https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/config/#execu

Re: KafkaSource

2021-05-28 Thread Matthias Pohl
Hi Alexey, the two implementations are not compatible. You can find information on how to work around this in the Kafka Connector docs [1]. Best, Matthias [1] https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/connectors/datastream/kafka/#upgrading-to-the-latest-connector-version

Re: KafkaSource metrics

2021-05-26 Thread Alexey Trenikhun
Found https://issues.apache.org/jira/browse/FLINK-22766 From: Alexey Trenikhun Sent: Tuesday, May 25, 2021 3:25 PM To: Ardhani Narasimha ; 陳樺威 ; Flink User Mail List Subject: Re: KafkaSource metrics Looks like when KafkaSource is used instead of

Re: KafkaSource metrics

2021-05-25 Thread Alexey Trenikhun
Looks like when KafkaSource is used instead of FlinkKafkaConsumer, metrics listed below are not available. Bug? Work in progress? Thanks, Alexey From: Ardhani Narasimha Sent: Monday, May 24, 2021 9:08 AM To: 陳樺威 Cc: user Subject: Re: KafkaSource metrics Use

Re: KafkaSource metrics

2021-05-25 Thread Qingsheng Ren
Hi Oscar, Thanks for raising this problem! Currently metrics of KafkaConsumer are not registered in Flink as in FlinkKafkaConsumer. A ticket has been created on JIRA, and hopefully we can fix it in next release. https://issues.apache.org/jira/browse/FLINK-22766 -- Best Regards, Qingsheng Ren

Re: KafkaSource metrics

2021-05-24 Thread 陳樺威
Hi Ardhani, Thanks for your kindly reply. Our team use your provided metrics before, but the metrics disappear after migrate to new KafkaSource. We initialize KafkaSource in following code. ``` val consumer: KafkaSource[T] = KafkaSource.builder() .setProperties(properties) .setTopics(topic)

Re: KafkaSource metrics

2021-05-24 Thread Ardhani Narasimha
Use below respectively flink_taskmanager_job_task_operator_KafkaConsumer_bytes_consumed_rate - Consumer rate flink_taskmanager_job_task_operator_KafkaConsumer_records_lag_max - Consumer lag flink_taskmanager_job_task_operator_KafkaConsumer_commit_latency_max - commit latency unsure if reactive mo

Re: KafkaSource Problem

2021-03-10 Thread Till Rohrmann
Great to hear. Yes, if you can help fix this issue that would be great. Cheers, Till On Tue, Mar 9, 2021 at 3:41 PM Bobby Richard wrote: > Great thanks, I was able to work around the issue by implementing my own > KafkaRecordDeserializer. I will take a stab at a PR to fix the bug, should > be a

Re: KafkaSource Problem

2021-03-09 Thread Bobby Richard
Great thanks, I was able to work around the issue by implementing my own KafkaRecordDeserializer. I will take a stab at a PR to fix the bug, should be an easy fix. On Tue, Mar 9, 2021 at 9:26 AM Till Rohrmann wrote: > Hi Bobby, > > This is most likely a bug in Flink. Thanks a lot for reporting t

Re: KafkaSource Problem

2021-03-09 Thread Till Rohrmann
Hi Bobby, This is most likely a bug in Flink. Thanks a lot for reporting the issue and analyzing it. I have created an issue for tracking it [1]. cc Becket. [1] https://issues.apache.org/jira/browse/FLINK-21691 Cheers, Till On Mon, Mar 8, 2021 at 3:35 PM Bobby Richard wrote: > I'm receiving