Thanks Becket,

Sorry for delayed response. That’s what I thought as well. I built a hacky 
custom source today directly using Kafka client which was able to join consumer 
group etc. which works as I expected but not sure about production readiness 
for something like that :)

The need arises because of (1) Business continuity needs (2) Some of the 
pipelines we are building are close to network edge and need to run on nodes 
where we are not allowed to create cluster (yea - let’s not get into that can 
of security related worms :)). We will get there at some point but for now we 
are trying to support business continuity on those edge nodes by not actually 
forming a cluster but using “walled garden” individual Flink server. I fully 
understand this is not ideal. And all of this started because some of the work 
we were doing with Logstash needed to be migrated out as Logstash wasn’t able 
to keep up with data rates unless we put some ridiculous number of servers. In 
essence, we have pre-approved constraints to connect to Kafka and southbound 
interfaces using Logstash, which we need to replace for some datasets as they 
are massive for Logstash to keep up with. 

Hope that explains a bit where our head is at.

Thanks, Ashish 

> On Aug 29, 2019, at 11:40 AM, Becket Qin <becket....@gmail.com> wrote:
> 
> Hi Ashish,
> 
> You are right. Flink does not use Kafka based group management. So if you 
> have two clusters consuming the same topic, they will not divide the 
> partitions. The cross cluster HA is not quite possible at this point. It 
> would be good to know the reason you want to have such HA and see if Flink 
> meets you requirement in another way.
> 
> Thanks,
> 
> Jiangjie (Becket) Qin
> 
> On Thu, Aug 29, 2019 at 9:19 PM ashish pok <ashish...@yahoo.com 
> <mailto:ashish...@yahoo.com>> wrote:
> Looks like Flink is using “assign” partitions instead of “subscribe” which 
> will not allow participating in a group if I read the code correctly. 
> 
> Has anyone solved this type of problem in past of active-active HA across 2 
> clusters using Kafka? 
> 
> 
> - Ashish
> On Wednesday, August 28, 2019, 6:52 PM, ashish pok <ashish...@yahoo.com 
> <mailto:ashish...@yahoo.com>> wrote:
> 
> All,
> 
> I was wondering what the expected default behavior is when same app is 
> deployed in 2 separate clusters but with same group Id. In theory idea was to 
> create active-active across separate clusters but it seems like both apps are 
> getting all the data from Kafka. 
> 
> Anyone else has tried something similar or have an insight on expected 
> behavior? I was expecting to see partial data on both apps and to get all 
> data in one app if other was turned off.
> 
> Thanks in advance,
> 
> - Ashish

Reply via email to