Re: [DISCUSS] Release Flink 1.17.1

2023-05-16 Thread Kevin Lam
Hi! Thanks for doing this release. I'm looking forward to some of the bug fixes, is there a date set for the release of 1.17.1? On Mon, May 15, 2023 at 6:10 AM Lijie Wang wrote: > +1 for the release. > > Best, > Lijie > > Jing Ge 于2023年5月15日周一 17:07写道: > > > +1 for releasing 1.17.1 > > > > Best

Zero-Downtime Deployments with Flink Operator

2023-05-24 Thread Kevin Lam
Hi, Is there any interest or ongoing work around supporting zero-downtime deployments with Flink using the Flink Operator? I saw that https://issues.apache.org/jira/browse/FLINK-24257 existed, but it looks a little stale. I'm interested in learning more about the current state of things. There i

Re: Zero-Downtime Deployments with Flink Operator

2023-05-24 Thread Kevin Lam
ourse I > can be convinced otherwise if there is a general requirement / interest in > the community. In any case we should have confidence that this will > actually provide production value to many use-cases and it would require a > FLIP for sure. > > Cheers, > Gyula > > &

Flink Operator - Supporting Recovery from Snapshot

2023-02-07 Thread Kevin Lam
Hello, I was reading the Flink Kubernetes Operator documentation and noticed that if you want to redeploy a Flink job from a specific snapshot, you must follow these manual recovery steps. Are there plans to streamline this process? Deploying from a specific snapshot is a relatively common operati

Re: Flink Operator - Supporting Recovery from Snapshot

2023-02-10 Thread Kevin Lam
hen the operator > loses track of the latest checkpoint, mostly due to "incorrect" error > handling on the Flink side that also deletes the HA metadata. I think we > should strive to improve and eliminate most of these cases (as we have > already done for many of these problems)

Re: Flink Operator - Supporting Recovery from Snapshot

2023-02-10 Thread Kevin Lam
t; > This approach is indeed irreversible, but so far it's been working well. > > On Fri, Feb 10, 2023 at 8:17 AM Kevin Lam > wrote: > > > Thanks for the response Gyula! Those caveats make sense, and I see, > there's > > a bit of a complexity to consider if

Impact of redacting UPDATE_BEFORE fields?

2024-02-07 Thread Kevin Lam
Hi there! I have a question about Changelog Stream Processing with Flink SQL and the Flink Table API. I would like to better understand how UPDATE_BEFORE fields are used by Flink. Our team uses Debezium to extract Change Data Capture events from MySQL databases. We currently redact the `before` f

Re: Impact of redacting UPDATE_BEFORE fields?

2024-02-08 Thread Kevin Lam
ectly support > retractions (in case of updates and deletes). > > Great talk on this topic: > https://www.youtube.com/watch?v=iRlLaY-P6iE&ab_channel=PlainSchwarz (the > middle part is the most relevant). > > > On Wed, Feb 7, 2024 at 12:13 PM Kevin Lam > wrote: > >

[DISCUSS] FLINK-34440 Support Debezium Protobuf Confluent Format

2024-02-21 Thread Kevin Lam
I would love to get some feedback from the community on this JIRA issue: https://issues.apache.org/jira/projects/FLINK/issues/FLINK-34440 I am looking into creating a PR and would appreciate some review on the approach. In terms of design I think we can mirror the `debezium-avro-confluent` and `a

Re: [DISCUSS] FLINK-34440 Support Debezium Protobuf Confluent Format

2024-02-27 Thread Kevin Lam
ke sense to have the Confluent schema lookup in common > > code, which is part of the SchemaCoder readSchema logic. > > * I assume the ProtobufSchemaCoder readSchema would return a Protobuf > > Schema object. > > > > > > > > I also wondered whet

Re: [DISCUSS] FLINK-34440 Support Debezium Protobuf Confluent Format

2024-02-29 Thread Kevin Lam
o that you can start reviewing and maybe also contributing to the > PR. > I hope this timeline works for you! > > Let's also decide if we need a FLIP once the code is public. > We will look into the field ids. > > > On Tue, Feb 27, 2024 at 8:56 PM Kevin Lam > w

Flink Kubernetes Operator Failing Over FlinkDeployments to a New Cluster

2024-03-06 Thread Kevin Lam
Hi there, We use the Flink Kubernetes Operator, and I am investigating how we can easily support failing over a FlinkDeployment from one Kubernetes Cluster to another in the case of an outage that requires us to migrate a large number of FlinkDeployments from one K8s cluster to another. I underst

Re: Flink Kubernetes Operator Failing Over FlinkDeployments to a New Cluster

2024-03-06 Thread Kevin Lam
Another thought could be modifying the operator to have a behaviour where upon first deploy, it optionally (flag/param enabled) finds the most recent snapshot and uses that as the initialSavepointPath to restore and run the Flink job. On Wed, Mar 6, 2024 at 2:07 PM Kevin Lam wrote: > Hi th

Re: [DISCUSS] FLINK-34440 Support Debezium Protobuf Confluent Format

2024-03-12 Thread Kevin Lam
s/protobuf/AbstractKafkaProtobufSerializer.java#L157 > > > (|Deserializer). > Please let me know if this makes sense / or in case you have any other > feedback. > > Thanks > Anupam > > On Thu, Feb 29, 2024 at 8:54 PM Kevin Lam > wrote: > > > Hey Robert, >

Re: Flink Kubernetes Operator Failing Over FlinkDeployments to a New Cluster

2024-03-13 Thread Kevin Lam
e. > > Probably something worth trying :) > > -Max > > > > On Wed, Mar 6, 2024 at 9:09 PM Kevin Lam > wrote: > > > > Another thought could be modifying the operator to have a behaviour where > > upon first deploy, it optionally (flag/param enabled) finds

Re: Flink Kubernetes Operator Failing Over FlinkDeployments to a New Cluster

2024-03-14 Thread Kevin Lam
e of course many many others. > > There may come a time when the Flink community decides to take on such a > scope but it feels a bit too much at this point to try to standardize this. > > Cheers, > Gyula > > On Wed, Mar 13, 2024 at 9:18 PM Kevin Lam > wrote: > >

Efficient Use of Disk Resources (Only Attach Disks to Stateful Task Managers)

2024-03-14 Thread Kevin Lam
Hi all, I was wondering if anyone has any ideas or advice when it comes to being efficient about the use of disks with the RocksDB StateBackend. In general not all operators in a Flink Job will be stateful and require persistent disks to use with RocksDB, and so any stateless operators running on

Re: Flink Kubernetes Operator Failing Over FlinkDeployments to a New Cluster

2024-03-21 Thread Kevin Lam
No worries, thanks for the reply Gyula. Ah yes, I see how those points you raised make the feature tricky to implement. Could this be considered for a FLIP (or two) in the future? On Wed, Mar 20, 2024 at 2:21 PM Gyula Fóra wrote: > Sorry for the late reply Kevin. > > I think what you are sugges

Re: Flink Kubernetes Operator Failing Over FlinkDeployments to a New Cluster

2024-03-22 Thread Kevin Lam
Thanks for sharing this work Gyula! That's great to see the FLIP covers some of the limitations already. I will follow the FLIP and associated JIRA ticket. Hi Matthias Pohl. I'd be interested to learn if there has been any progress on the FLIP-360 or associated JIRA issue FLINK-31709. On Fri, Mar

Re: [DISCUSS] FLINK-34440 Support Debezium Protobuf Confluent Format

2024-03-26 Thread Kevin Lam
f FlinkToProtoSchemaConverter we might > end up overwriting the field-Ids. > If we are able to locate a prior schema, the approach you outlined makes a > lot of sense. > Let me explore this a bit further and get back(in terms of feasibility). > > Thanks again! > - Anupam > >

Best Practices? Fault Isolation for Processing Large Number of Same-Shaped Input Kafka Topics in a Big Flink Job

2024-05-08 Thread Kevin Lam
Hi everyone, I'm currently prototyping on a project where we need to process a large number of Kafka input topics (say, a couple of hundred), all of which share the same DataType/Schema. Our objective is to run the same Flink SQL on all of the input topics, but I am concerned about doing this in

Poor Load Balancing across TaskManagers for Multiple Kafka Sources

2024-06-05 Thread Kevin Lam
Hey all, I'm seeing an issue with poor load balancing across TaskManagers for Kafka Sources using the Flink SQL API and wondering if FLIP-370 will help with it, or if not, interested in any ideas the community has to mitigate the issue. The Kafka SplitEnumerator uses the following logic to assign

Re: Poor Load Balancing across TaskManagers for Multiple Kafka Sources

2024-06-05 Thread Kevin Lam
cc. panyuep...@apache.org as related to FLIP-370 On Wed, Jun 5, 2024 at 2:32 PM Kevin Lam wrote: > Hey all, > > I'm seeing an issue with poor load balancing across TaskManagers for Kafka > Sources using the Flink SQL API and wondering if FLIP-370 will help with > it, or i

Re: Poor Load Balancing across TaskManagers for Multiple Kafka Sources

2024-06-05 Thread Kevin Lam
I've also just found https://issues.apache.org/jira/browse/FLINK-31762 which tracks the Kafka specific issue. On Wed, Jun 5, 2024 at 3:05 PM Kevin Lam wrote: > cc. panyuep...@apache.org as related to FLIP-370 > > On Wed, Jun 5, 2024 at 2:32 PM Kevin Lam wrote: > >> Hey a

Re: Poor Load Balancing across TaskManagers for Multiple Kafka Sources

2024-06-06 Thread Kevin Lam
here. > > Best, > Zhanghao Chen > ________ > From: Kevin Lam > Sent: Thursday, June 6, 2024 2:32 > To: dev@flink.apache.org > Subject: Poor Load Balancing across TaskManagers for Multiple Kafka Sources > > Hey all, > > I'm seeing an issue with poor load bala

RE: [VOTE] FLIP-265 Deprecate and remove Scala API support

2022-10-19 Thread Kevin Lam
Non-binding -1. Aligned with the objections raised by Yaroslav in https://lists.apache.org/thread/d3borhdzj496nnggohq42fyb6zkwob3h and agree with his recommendation to support Java 17 as a prerequisite. We use the Scala APIs extensively and we would like to have an alternative to Scala so that we

Potential Kafka Connector FLIP: Large Message Handling

2024-07-05 Thread Kevin Lam
Hi all, Writing to see if the community would be open to exploring a FLIP for the Kafka Table Connectors. The FLIP would allow for storing Kafka Messages beyond a Kafka cluster's message limit (1 MB by default) out of band in cloud object storage or another backend. During serialization the messa

Re: Potential Kafka Connector FLIP: Large Message Handling

2024-07-08 Thread Kevin Lam
same time. > > > > It's not totally clear how Conduktor solved these issues, but IMO they > are > > worth keeping in mind. For Kroxylicious we decided these problems meant > it > > wasn't practical for us to implement this, but I'd be curious to know if >

Re: Potential Kafka Connector FLIP: Large Message Handling

2024-07-08 Thread Kevin Lam
/apache/flink/connector/kafka/source/reader/deserializer/KafkaRecordDeserializationSchema.java#L107 > [3] > > https://github.com/bakdata/kafka-large-message-serde/blob/09eae933afaf8a1970b1b1bebcdffe934c368cb9/large-message-serde/src/main/java/com/bakdata/kafka/LargeMessageDeserializ

Re: [DISCUSS] FLIP-XXX Apicurio-avro format

2024-07-08 Thread Kevin Lam
Hi David, Any updates on the Kafka Message Header support? I am also interested in supporting headers with the Flink SQL Formats: https://lists.apache.org/thread/spl88o63sjm2dv4l5no0ym632d2yt2o6 On Fri, Jun 14, 2024 at 6:10 AM David Radley wrote: > Hi everyone, > I have talked with Chesnay and

Re: Potential Kafka Connector FLIP: Large Message Handling

2024-07-10 Thread Kevin Lam
23c216/flink-connector-kafka/src/main/java/org/apache/flink/connector/kafka/source/KafkaSourceBuilder.java#L464> to make it overridable. I'm planning to start a separate thread to propose making `value.serializer` overridable. On Mon, Jul 8, 2024 at 11:18 AM Kevin Lam wrote: > Hi Fa

Re: Potential Kafka Connector FLIP: Large Message Handling

2024-07-10 Thread Kevin Lam
the external > storage lives and authentication. Limitations around stack and heap sizes > would be worth considering. > > Am I understanding your intent correctly? > Kind regards, David. > > > From: Kevin Lam > Date: Wednesday, 10 July 2024 at 14:35 > T

Re: [DISCUSS] FLINK-34440 Support Debezium Protobuf Confluent Format

2024-07-19 Thread Kevin Lam
t; In general, I think it would be good to be more explicit about the > > > schemas > > > > used ( > > > > > > > > > > > > > > https://docs.confluent.io/platform/curren/schema-registry/schema_registry_onprem_tutorial.html#auto-schema-r

Re: Potential Kafka Connector FLIP: Large Message Handling

2024-07-19 Thread Kevin Lam
ts then taken from external source and > put into the Kafka body or the other way round for serialization. This is > different to my Apicurio work that is to handle format specific headers. > > I assume you would want to add configuration to define where the external > storage l

[FLINK-22748] Update for "Allow dynamic target topic selection in SQL Kafka sinks"

2024-07-23 Thread Kevin Lam
Hi there, I've spent some time picking up the existing work for FLINK-22748 , and created a PR, building on Nicholas Jiang's work: https://github.com/apache/flink-connector-kafka/pull/109 I saw the issue is marked stale, so I'm starting this mai

[Flink SQL Format] How to Process Metadata Columns in KafkaDynamicSink compatible EncodingFormat

2024-08-07 Thread Kevin Lam
Hi there, I'm looking to add some functionality to a custom Format that reads the `topic` metadata column in the context of serialization using the Flink SQL Kafka Connector. I'm c

[jira] [Created] (FLINK-34440) Support Debezium Protobuf Confluent Format

2024-02-13 Thread Kevin Lam (Jira)
Kevin Lam created FLINK-34440: - Summary: Support Debezium Protobuf Confluent Format Key: FLINK-34440 URL: https://issues.apache.org/jira/browse/FLINK-34440 Project: Flink Issue Type: New Feature

[jira] [Created] (FLINK-35808) Let ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG be overridable by user in KafkaSourceBuilder

2024-07-10 Thread Kevin Lam (Jira)
Kevin Lam created FLINK-35808: - Summary: Let ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG be overridable by user in KafkaSourceBuilder Key: FLINK-35808 URL: https://issues.apache.org/jira/browse/FLINK-35808

[jira] [Created] (FLINK-36017) Support Passing Metadata to Formats in KafkaDynamicSink

2024-08-08 Thread Kevin Lam (Jira)
Kevin Lam created FLINK-36017: - Summary: Support Passing Metadata to Formats in KafkaDynamicSink Key: FLINK-36017 URL: https://issues.apache.org/jira/browse/FLINK-36017 Project: Flink Issue Type

[jira] [Created] (FLINK-37327) Debezium Avro Format: Add FormatOption to Optionally Skip emitting UPDATE_BEFORE Rows

2025-02-14 Thread Kevin Lam (Jira)
Kevin Lam created FLINK-37327: - Summary: Debezium Avro Format: Add FormatOption to Optionally Skip emitting UPDATE_BEFORE Rows Key: FLINK-37327 URL: https://issues.apache.org/jira/browse/FLINK-37327