Hello kafka community,
Sorry, besides the 3 kafka streams that just merge topics into another
common topic, there must be consumers from those merged topics, that keep
local state, and that, once both t1 and t2 values for a particular t1 key
exist, emit the pair, and will keep emitting pairs multiple times for each
t1 key as long as it has new values in either t1 or t2, with the condition
that both t1 and t2 records are recorded in the local state for a
particular t1 key.

So these consumers would do the join itself with the help of a local
database.

I know it sounds a lot like RocksDb from kafka-streams but this is what I
can understand how it would work and prevent unsafe concurrency (e.g. race
conditions).

So each consumer consumes from a single topic with 2 event/message types
and as soon as it has a message of type t1 to which a local state t2 pair
exists (or the other way around), it emits (sends) a message to its output
(join) topic.

Can this system, of 3 merges (kafka streams) + 3 joins (consumers), which
takes care for things to be single threaded at the right shards/partitions,
and not to lose pairs or combinations due to incorrect concurrency, can
this be done in a simpler way?

Thanks,
Nicolae

On Mon, 13 Jul 2020 at 20:30, Dumitru-Nicolae Marasoui <
nicolae.maras...@kaluza.com> wrote:

>
> Hello kafka community,
> I imagine the following behavior that I could code in 3 kafka-streams
> pipelines and wondering if it can be done in fewer kafka streams with the
> same guarantees:
>
> I have 3 compacted topics, t1, t2 and t3, where t2 is the link (many-many)
> between t1 & t3.
> The same about t4-t6.
>
> I need to replicate t1-3 in t4-6 with a slightly different domain and move
> a column (avro attribute) from t1 to t6 to adapt the domains.
>
> In a multi-kafka-streams-pipelines paradigm I could do the following:
>
> I would create 2 or 3 intermediary topics:
> - a topic keyed in t1 key which gets events copies from both t1 & t2, both
> message types keyed in t1 key (kafka-streams-1 which merges 2 streams based
> on t1 & t2 with slight re-keying)
> - a topic keyed in t3 key which gets events copies from both t3 & t2, both
> message types keyed in t3 key (kafka-streams-2 which merges 2 streams based
> on t3 & t2 with slight re-keying)
>
> Until now I have created 2 joins, and I know joins exist in KafkaStreams,
> but these joins I can understand with my mind why they would be
> linearizable, since the messages on topics t1 & t2 will be sent to a common
> pipe with a total order between messages keyed in any particular t1 key, so
> the state processing would be linearizable at the level of each t1 entity
> (including t1 messages & t2 messages).
>
> Now the output topics are having the same key: t2 key (which is t1+t3 keys
> combined).
>
> Now I can join these topics, again sending them both to a 3rd topic with
> this join via merge strategy which i understand will not lose combinations
> because it cannot allow concurrency, so that all the messages keyed in a
> specific t2 key (a specific t1+t3 key combination) are read in order and
> applied single thread fashion, linearizable (kafka-streams-3).
>
> So from these 2+1 pipelines I can have an output back to t6 where I
> rewrite t6 records with records that contain a new value that is taken from
> t1.
>
> Does this make sense?
> Would you think that it is doable in less pipelines?
> Would using joins instead of these merges allow any such guarantees of
> single threaded processing across topics? I think not?
> Thank you,
>
> --
>
> Dumitru-Nicolae Marasoui
>
> Software Engineer
>
>
>
> w kaluza.com <https://www.kaluza.com/>
>
> LinkedIn <https://www.linkedin.com/company/kaluza> | Twitter
> <https://twitter.com/Kaluza_tech>
>
> Kaluza Ltd. registered in England and Wales No. 08785057
>
> VAT No. 100119879
>
> Help save paper - do you need to print this email?
>


-- 

Dumitru-Nicolae Marasoui

Software Engineer



w kaluza.com <https://www.kaluza.com/>

LinkedIn <https://www.linkedin.com/company/kaluza> | Twitter
<https://twitter.com/Kaluza_tech>

Kaluza Ltd. registered in England and Wales No. 08785057

VAT No. 100119879

Help save paper - do you need to print this email?

Reply via email to