Hi Gordon,
Thanks for your feedback as always.
Why not just Map?
I think it makes sense to relocate the bootstrapServer field in the Kafka
properties (ClusterMetadata abstraction), since that is intuitive based
on how it is defined in the Kafka clients library. It also makes the
uniqueness of a
Hi Mason,
Thanks for addressing my comments. I agree that option 3 seems more
reasonable.
> Reorganize the metadata in a Map in
`KafkaStream` where the String is the proposed
`KafkaClusterIdentifier.name` field.
Why not just Map?
Regarding naming, I like DynamicKafkaSource as that's what I imme
>
> My main worry for doing this as a later iteration is that this would
> probably be a breaking change for the public interface. If that can be
> avoided and planned ahead, I'm fine with moving forward with how it is
> right now.
Make sense. Considering the public interfaces, I think we still w
Hi Mason,
Thanks for the iterations on the FLIP, I think this is in a very good shape
now.
Small correction for the MultiClusterKafkaSourceEnumerator section: "This
reader is responsible for discovering and assigning splits from 1+ cluster"
Regarding the user facing name of the connector: I agre
> Regarding (2), definitely. This is something we planned to add later on
but
so far keeping things common has been working well.
My main worry for doing this as a later iteration is that this would
probably be a breaking change for the public interface. If that can be
avoided and planned ahead, I
Hi Gordon,
Thanks for taking a look!
Regarding (1), there is a need from the readers to send this event at
startup because the reader state may reflect outdated metadata. Thus, the
reader should not start without fresh metadata. With fresh metadata, the
reader can filter splits from state--this f
Hi Mason,
Thanks for updating the FLIP. In principle, I believe this would be a
useful addition. Some comments so far:
1. In this sequence diagram [1], why is there a need for a
GetMetadataUpdateEvent from the MultiClusterSourceReader going to the
MultiClusterSourceEnumerator? Shouldn't the enume
Hi Jing,
Thanks for the prompt feedback! I had some confusion with how to resize
images in confluence--anyways, I have made the font bigger, added white
background, and also made the diagrams themselves bigger.
Regarding the exactly once semantics, that's definitely good to point out
in the doc.
Hi Mason,
It is a very practical feature that many users are keen to use. Thanks to
the previous discussion, the FLIP now looks informative. Thanks for your
proposal. One small suggestion is that the attached images are quite small
to read if we don't click and enlarge them. Besides that, It is di
Hi all,
I'm working on FLIP-246 again, for the Multi Cluster Kafka Source
contribution. The document has been updated with some more context about
how it can solve the Kafka topic removal scenario and a sequence diagram to
illustrate how the components interact.
Looking forward to any feedback!
Hi Ryan,
Thanks for the additional context! Yes, the offset initializer would need
to take a cluster as a parameter and the MultiClusterKafkaSourceSplit can
be exposed in an initializer.
Best,
Mason
On Thu, Oct 6, 2022 at 11:00 AM Ryan van Huuksloot <
ryan.vanhuuksl...@shopify.com> wrote:
> Hi
Hi Mason,
Thanks for the clarification! In regards to the addition to the
OffsetInitializer of this API - this would be an awesome addition and I
think this entire FLIP would be a great addition to the Flink.
To provide more context as to why we need particular offsets, we use Hybrid
Source to cu
Hi Ryan,
Just copying your message over to the email chain.
Hi Mason,
> First off, thanks for putting this FLIP together! Sorry for the delay.
> Full disclosure Mason and I chatted a little bit at Flink Forward 2022 but
> I have tried to capture the questions I had for him then.
> I'll start the
Hi Mason,
First off, thanks for putting this FLIP together! Sorry for the delay. Full
disclosure Mason and I chatted a little bit at Flink Forward 2022 but I
have tried to capture the questions I had for him then.
I'll start the conversation with a few questions:
1. The concept of streamIds is n
Hi Max,
Thanks for taking a look!
> I'm wondering whether we can share some of the code of the existing
> KafkaSource.
>
That is the intention--let me call it out more explicitly.
Regarding your questions:
1. Indeed, the KafkaMetadataService has the describe stream method to get a
particular
Hey Mason,
I just had a look at the FLIP. If I understand correctly, you are
proposing a very sophisticated way to read from multiple Kafka clusters
/ topics.
I'm wondering whether we can share some of the code of the existing
KafkaSource. I suppose you don't want to modify KafkaSource itsel
5. At startup, GetMetadataUpdateEvent is also used to allow the
MultiClusterKafkaSourceReader to get the latest metadata from the
enumerator to filter out invalid splits This is how the reader can solve
"removing" splits/topics in the startup case.
Sorry for the late response, really appreciate yo
Hi Qingsheng,
Thanks for the feedback--these are great points to raise.
1. This is something I missed that is now added. More generally, it can
locate multiple topics in multiple clusters (1 topic on 1 cluster is the
simplest case).
2. The KafkaMetadataService doesn't interact with the KafkaAdmi
Hi Mason,
Thank you for starting this FLIP!
From my first glance this FLIP looks like a collection of many new interfaces,
but I can’t stitch them together. It’ll be great to have some brief
descriptions about how the source works internally. Here are some questions in
my mind and please cor
Hi all,
We would like to start a discussion thread on FLIP-246: Multi Cluster Kafka
Source [1] where we propose to provide a source connector for dynamically
reading from Kafka multiple clusters, which will not require Flink job
restart. This can greatly improve the Kafka migration experience for
20 matches
Mail list logo