Having had a glass of water, the following option came up. More advanced sink integrations are likely to be a general concern, so it would be better to have a smoother path from the clean abstraction to the advanced case. A more general proposal would be to alter the Sink interface so that an optional key-value map can be passed along with each message. This optional map would allow the sink to alter its behavior based on the hints it contains.
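To make that concrete, a minimal sketch of what such a hint-carrying sink interface could look like. This is purely hypothetical: neither the interface nor the hint keys exist in Flink, and all names here are made up for illustration.

    import java.util.Map;

    // Hypothetical sketch only; not an existing Flink interface.
    public interface HintedSinkFunction<T> {

        // Invoked once per record. The hints map is optional; a null or
        // empty map means "use the sink's configured defaults".
        void invoke(T value, Map<String, String> hints) throws Exception;
    }

    // A Kafka-flavoured implementation could then interpret well-known
    // hint keys, e.g. "kafka.topic" to route a record to another topic,
    // or "kafka.partition.key" as the key to hash for partitioning.
    // (Both key names are assumptions, not an agreed contract.)

The nice property is that sinks that ignore the hints keep today's simple behavior, while advanced sinks can opt in without a separate interface.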
On Wed, Dec 7, 2016 at 10:55 AM, Sanne de Roever <sanne.de.roe...@gmail.com> wrote:

> A first sketch.
>
> Central to this functionality is Kafka's ProducerRecord. ProducerRecord
> was introduced with Kafka 0.8. This means that the functionality could be
> introduced for all Flink-Kafka connectors; as per
> https://ci.apache.org/projects/flink/flink-docs-release-1.0/apis/streaming/connectors/kafka.html
>
> ProducerRecord does two things:
>
> - It allows a Kafka producer to send messages to different topics in
>   Kafka; this can be very helpful for message routing (I can give a more
>   formal example later).
> - It also allows setting a key that determines the partition of the
>   message; introducing this would give Flink a more generic interface to
>   Kafka, which is a good thing.
> - A partition can be identified either by an integer or by a key String
>   that will be hashed.
>
> The next step would be to determine the impact on the interface of a
> Sink. Currently a Kafka sink has one topic, for example:
>
>   .addSink(new FlinkKafkaProducer09[String](outputTopic, new
>     SimpleStringSchema(), producerProps))
>
> In the new scenario one would like to pass not only the message to be
> sent, but also a topic string and a partition id or key (tuple-ish?).
> The next suggestion is just to start the thinking a bit; a shot in the
> dark. A somewhat blunt approach would be to map all messages to a valid
> ProducerRecord, then pass this ProducerRecord to the Sink, and the rest
> is history. No attempt at abstraction is made; the reasoning is as
> follows.
>
> Evaluating this, I see the following. The current KafkaSink abstracts
> the Kafka functionality away on the Flink side. This is a good thing,
> and it will work for most cases. Providing a tighter integration with
> Kafka will probably break down the abstraction. This seems to point in
> the direction of creating an advanced Kafka sink. Such a sink gives more
> control but less abstraction; it is for advanced applications. Any
> abstraction attempt will only reduce transparency as far as I can see,
> and the contract would not be likely to work for other queuing
> providers. (A rough sketch of such an advanced sink follows at the end
> of this thread.)
>
> On Wed, Dec 7, 2016 at 10:27 AM, Sanne de Roever
> <sanne.de.roe...@gmail.com> wrote:
>
>> Good questions, I will follow up piecewise to address them. Could a
>> Wiki section be an idea, before I spread the information across several
>> posts?
>>
>> On Tue, Dec 6, 2016 at 4:50 PM, Stephan Ewen <se...@apache.org> wrote:
>>
>>> You are right, it does not exist, and it would be a nice addition.
>>>
>>> Can you sketch some details on how to do that?
>>>
>>> - Will it be a new type of producer? If yes, can as much as possible
>>>   of the code be shared between the current and the new producer?
>>> - Will it only be part of the Flink Kafka 0.10 producer?
>>>
>>> Thanks,
>>> Stephan
>>>
>>> On Tue, Dec 6, 2016 at 2:23 PM, Sanne de Roever
>>> <sanne.de.roe...@gmail.com> wrote:
>>>
>>>> Hi,
>>>>
>>>> Kafka producer clients for 0.10 allow the following syntax:
>>>>
>>>>   producer.send(new ProducerRecord<String, String>("my-topic",
>>>>     Integer.toString(i), Integer.toString(i)));
>>>>
>>>> The gist is that one producer can send messages to different topics;
>>>> this is useful for event routing, among other things, and it makes
>>>> creating generic endpoints easier. If I am right, Flink currently
>>>> does not support this; would this be a useful addition?
>>>>
>>>> Cheers,
>>>>
>>>> Sanne
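Returning to the blunt approach sketched above: a minimal illustration of what an advanced, ProducerRecord-based sink could look like. This is a sketch under assumptions, not a proposed implementation; ProducerRecordSink and RecordMapper are made-up names, and only ProducerRecord and KafkaProducer are real Kafka 0.10 client classes.

    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerRecord;

    import java.util.Properties;

    // Hypothetical advanced sink: the user maps each stream element to a
    // full ProducerRecord, so topic, key and partition are all under the
    // user's control instead of being fixed at sink construction time.
    public class ProducerRecordSink<IN> {

        // User-supplied mapping from a stream element to a ProducerRecord.
        public interface RecordMapper<IN> {
            ProducerRecord<byte[], byte[]> map(IN element);
        }

        private final RecordMapper<IN> mapper;
        private final KafkaProducer<byte[], byte[]> producer;

        // producerProps is assumed to configure byte[] serializers
        // (key.serializer / value.serializer) as usual for the 0.10 client.
        public ProducerRecordSink(RecordMapper<IN> mapper, Properties producerProps) {
            this.mapper = mapper;
            this.producer = new KafkaProducer<>(producerProps);
        }

        // Called once per element: build the record and hand it to Kafka.
        // The record itself carries the target topic and partitioning key.
        public void invoke(IN element) {
            producer.send(mapper.map(element));
        }
    }

    // Example mapper, assuming a hypothetical Event type: route each
    // event by its own topic field and key it by a user id, e.g.
    //   event -> new ProducerRecord<>(event.topic(), event.userId(), event.payload())

The trade-off is exactly the one described above: full control over routing and partitioning, at the price of exposing a Kafka-specific type in the sink's contract.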