Re: PyFlink kafka producer topic override

2021-06-24 Thread Curt Buechter
Thanks. I will reconsider my architecture. On Thu, Jun 24, 2021 at 1:37 AM Arvid Heise wrote: > …

Re: PyFlink kafka producer topic override

2021-06-24 Thread Arvid Heise
Hi Curt, Upon rechecking the code, you actually don't set the topic through KafkaContextAware but just directly on the KafkaRecord returned by the KafkaSerializationSchema. Sorry for the confusion. Arvid On Thu, Jun 24, 2021 at 8:36 AM Arvid Heise wrote: > …

Re: PyFlink kafka producer topic override

2021-06-23 Thread Arvid Heise
Hi, getTargetTopic can really be used as Curt needs it, so it would be good to add it to PyFlink as well. However, I'm a bit skeptical that Kafka can really handle that model well. It's usually encouraged to use fewer, larger topics, and you'd rather use partitions here instead of topics. Ther…
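The routing Arvid describes — each record carrying its own target topic, derived from the data — can be sketched in plain Python. This is not the actual Flink or PyFlink API (the thread's point is that PyFlink does not yet expose it); the `serialize` function, the `tenant-<id>` topic naming, and the record shape are all illustrative assumptions:

```python
# Pure-Python sketch of per-record topic routing, modeled loosely on a
# Kafka serialization schema whose output names its own target topic.
# All identifiers here are hypothetical, not part of any Flink API.
import json

def serialize(record: dict) -> tuple[str, bytes, bytes]:
    """Return (topic, key, value) for one source record.

    The topic is derived from the record's tenant id, mirroring the
    'one topic per tenant' model discussed in the thread.
    """
    topic = f"tenant-{record['tenant_id']}"          # dynamic target topic
    key = str(record["tenant_id"]).encode("utf-8")   # key by tenant
    value = json.dumps(record).encode("utf-8")
    return topic, key, value

topic, key, value = serialize({"tenant_id": 42, "event": "login"})
```

Note that Arvid's caveat still applies: deriving a partition (rather than a whole topic) from the tenant id would follow the more common "fewer, larger topics" advice.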

Re: PyFlink kafka producer topic override

2021-06-23 Thread Dian Fu
OK, got it. Then it seems that splitting streams is also not quite suitable for your requirements, as you would still need to iterate over each of the side outputs corresponding to each topic. Regarding getTargetTopic [1], per my understanding, it's not designed to dynamically assign the topic f…
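Dian's objection to side outputs can be illustrated with a plain-Python sketch (not the PyFlink side-output API): bucketing records into one output per target topic still leaves you with one bucket, and hence one sink to wire up, per tenant. The `route_by_topic` function and topic naming are hypothetical:

```python
from collections import defaultdict

def route_by_topic(records: list[dict]) -> dict[str, list[dict]]:
    """Group records into per-topic buckets, mimicking one side output
    per target topic. With 2,000 tenants this yields up to 2,000
    buckets, each of which would still need its own downstream sink -
    the drawback pointed out in the thread."""
    buckets: dict[str, list[dict]] = defaultdict(list)
    for rec in records:
        buckets[f"tenant-{rec['tenant_id']}"].append(rec)
    return dict(buckets)
```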

Re: PyFlink kafka producer topic override

2021-06-23 Thread Curt Buechter
Hi Dian, Thanks for the reply. I don't think a filter function makes sense here. I have 2,000 tenants in the source database, and I want all records for a single tenant in a tenant-specific topic. So, with a filter function, if I understand it correctly, I would need 2,000 different filters, which…

Re: PyFlink kafka producer topic override

2021-06-23 Thread Dian Fu
You are right that split is still not supported. Does it make sense for you to split the stream using a filter function? There is some overhead compared to the built-in stream.split, as you need to provide a filter function for each sub-stream, and so a record will be evaluated multiple times.
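The overhead Dian mentions — one filter per sub-stream, so each record is tested once per filter — can be made concrete with a small pure-Python model (not the PyFlink `filter` API; `split_with_filters` is a hypothetical name). With N records and T tenants, the predicate runs N × T times:

```python
def split_with_filters(records: list[dict], tenant_ids: list[int]):
    """Emulate one stream.filter() per sub-stream: every record is
    tested once per filter, so predicate evaluations total
    len(records) * len(tenant_ids). Returns the per-tenant sub-streams
    and the evaluation count to make the overhead visible."""
    evaluations = 0
    substreams: dict[int, list[dict]] = {}
    for tid in tenant_ids:
        selected = []
        for rec in records:
            evaluations += 1              # one predicate call per record per filter
            if rec["tenant_id"] == tid:   # the hypothetical filter predicate
                selected.append(rec)
        substreams[tid] = selected
    return substreams, evaluations
```

With 2,000 tenants, as in Curt's case, every record would be evaluated 2,000 times, which is why the thread moves toward per-record topic assignment instead.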