Hi Dian,

Thanks for the reply. I don't think a filter function makes sense here. I have 2,000 tenants in the source database, and I want all records for a single tenant in a tenant-specific topic. So, with a filter function, if I understand it correctly, I would need 2,000 different filters, which isn't very practical.
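To make the overhead concrete, here is a minimal sketch of what the filter-per-tenant approach would look like in PyFlink (the stand-in source, topic names, and broker address are made up for illustration):

from pyflink.common.serialization import SimpleStringSchema
from pyflink.common.typeinfo import Types
from pyflink.datastream import StreamExecutionEnvironment
from pyflink.datastream.connectors import FlinkKafkaProducer

env = StreamExecutionEnvironment.get_execution_environment()

# Stand-in for the CDC source stream: rows of (tenant_id, first_name, last_name).
stream = env.from_collection(
    [('tenant1', 'Jane', 'Doe'), ('tenant2', 'John', 'Smith')],
    type_info=Types.TUPLE([Types.STRING(), Types.STRING(), Types.STRING()]))

producer_config = {'bootstrap.servers': 'localhost:9092'}  # assumption

# One filter + one sink per tenant. With 2,000 tenants this loop adds
# 2,000 filter/sink pairs to the job graph, and every record is
# evaluated by every filter.
for tenant in ('tenant1', 'tenant2'):  # ... through tenant2000
    (stream
     .filter(lambda row, t=tenant: row[0] == t)
     .map(lambda row: '%s,%s' % (row[1], row[2]),
          output_type=Types.STRING())
     .add_sink(FlinkKafkaProducer(
         topic='%s.sink_topic' % tenant,
         serialization_schema=SimpleStringSchema(),
         producer_config=producer_config)))

env.execute('filter_per_tenant')

Even written as a loop, that is one filter/sink pair per tenant in the job graph, and every record passes through every filter, which is why I'd prefer a single sink that picks the target topic per record.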
An example:

source_topic (tenant_id, first_name, last_name)

destination:

tenant1.sink_topic (first_name, last_name)
tenant2.sink_topic (first_name, last_name)
...
tenant2000.sink_topic (first_name, last_name)

On 2021/06/24 03:18:36, Dian Fu <d...@gmail.com> wrote:
> You are right that split is still not supported. Does it make sense for you to split the stream using a filter function? There is some overhead compared to the built-in stream.split, as you need to provide a filter function for each sub-stream, and so a record will be evaluated multiple times.
>
> On June 24, 2021, at 3:08 AM, Curt Buechter <tr...@gmail.com> wrote:
> >
> > Hi,
> > New PyFlink user here. Loving it so far. The first major problem I've run into is that I cannot create a Kafka producer with dynamic topics. I see that this has been available for quite some time in Java with keyed serialization, using the getTargetTopic method. Another way to do this in Java may be with stream.split() and adding a different sink for the split streams. Stream splitting is also not available in PyFlink.
> > Am I missing anything? Has anyone implemented this before in PyFlink, or know of a way to make it happen?
> > The use case here is that I'm using a CDC Debezium connector to populate Kafka topics from a multi-tenant database, and I'm trying to use PyFlink to split the records into a different topic for each tenant.
> >
> > Thanks
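P.S. For anyone finding this thread later: the closest workaround I've come up with is to bypass the Flink Kafka sink and publish from inside a rich function using a plain Kafka client such as kafka-python. This is only a sketch under that assumption (the topic pattern and broker address are hypothetical), and it gives up Flink's checkpointed delivery guarantees for the sink:

from kafka import KafkaProducer  # pip install kafka-python (not part of PyFlink)
from pyflink.datastream.functions import MapFunction

class RouteToTenantTopic(MapFunction):
    # Sends each (tenant_id, first_name, last_name) row to a
    # '<tenant_id>.sink_topic' topic using its own Kafka client,
    # so the topic can vary per record.
    def __init__(self):
        self.producer = None

    def open(self, runtime_context):
        self.producer = KafkaProducer(bootstrap_servers='localhost:9092')

    def map(self, row):
        tenant_id, first_name, last_name = row
        self.producer.send(
            '%s.sink_topic' % tenant_id,
            value=('%s,%s' % (first_name, last_name)).encode('utf-8'))
        return row

    def close(self):
        if self.producer is not None:
            self.producer.flush()
            self.producer.close()

# Usage: stream.map(RouteToTenantTopic(), output_type=Types.TUPLE(
#     [Types.STRING(), Types.STRING(), Types.STRING()]))

I'd still much rather have a getTargetTopic-style hook in the PyFlink Kafka producer itself, so I remain interested in whether that's possible or planned.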