[ 
https://issues.apache.org/jira/browse/KAFKA-7777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16796661#comment-16796661
 ] 

Paul Whalen commented on KAFKA-7777:
------------------------------------

So I just went back through all the relevant interfaces, and I think I stand 
corrected.  Fundamentally, the API I was imagining is just:

# Some thing that you can send a key/value to and it will write the appropriate 
records to an appropriately named changelog topic.
# Supplying a callback to restore from a topic when a state store is 
initialized (I know that exists, though I will admit that one of my colleagues 
spent a morning trying to accomplish it and failed to find an online example 
or to get anything working)

I see {{StoreChangeLogger}} as the solution to 1.  Although it is not public, 
it is small and easy to replicate, and I now see that {{ProcessorContext}} 
implements {{RecordCollector.Supplier}}, which provides the all-important 
"hook in" so we can get EOS by writing through the same producer.  And we can 
choose an appropriate topic name, of course, from the public 
{{ProcessorStateManager.storeChangelogTopic()}}.
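For what it's worth, the naming convention that {{ProcessorStateManager.storeChangelogTopic()}} encodes is, to my understanding, just {{<application.id>-<store name>-changelog}}. A minimal, dependency-free sketch of that helper (the class here is mine for illustration, not Kafka's):

```java
// Sketch of the changelog-topic naming convention used by Kafka Streams.
// The real helper is ProcessorStateManager.storeChangelogTopic(); this
// stand-in class (illustrative, not the actual implementation) just
// reproduces the documented "<application.id>-<store name>-changelog"
// pattern.
public class ChangelogTopics {
    public static String storeChangelogTopic(String applicationId, String storeName) {
        return applicationId + "-" + storeName + "-changelog";
    }

    public static void main(String[] args) {
        // e.g. application id "word-count", store name "counts-store"
        System.out.println(storeChangelogTopic("word-count", "counts-store"));
        // -> word-count-counts-store-changelog
    }
}
```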

And I'm sure 2 is perfectly solvable given the right understanding.
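To sketch what 2 looks like in practice: at {{init()}} time the store registers a {{StateRestoreCallback}} via {{ProcessorContext.register(store, callback)}}, and during restoration Kafka Streams replays changelog records through that callback. The following self-contained sketch uses a stand-in interface mirroring {{StateRestoreCallback}}'s single {{restore(byte[], byte[])}} method so it runs without a Kafka dependency; treat it as an illustration of the callback's contract, not the real API:

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch of a restore callback replaying changelog records
// into a local map. The real interface is
// org.apache.kafka.streams.processor.StateRestoreCallback; this stand-in
// mirrors its shape so the example is runnable on its own.
public class RestoreSketch {
    interface StateRestoreCallback {
        void restore(byte[] key, byte[] value);
    }

    static class SimpleStore {
        final Map<String, String> local = new HashMap<>();

        // In Kafka Streams this would be passed to
        // ProcessorContext.register(store, callback) during init().
        StateRestoreCallback restoreCallback() {
            return (key, value) -> {
                if (value == null) {
                    local.remove(new String(key));            // tombstone: delete
                } else {
                    local.put(new String(key), new String(value));
                }
            };
        }
    }

    public static void main(String[] args) {
        SimpleStore store = new SimpleStore();
        StateRestoreCallback cb = store.restoreCallback();
        cb.restore("k1".getBytes(), "v1".getBytes());
        cb.restore("k1".getBytes(), null); // tombstone removes k1
        cb.restore("k2".getBytes(), "v2".getBytes());
        System.out.println(store.local); // {k2=v2}
    }
}
```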

Thanks for your help!  The best kind of new feature is the kind that existed 
all along!

> Decouple topic serdes from materialized serdes
> ----------------------------------------------
>
>                 Key: KAFKA-7777
>                 URL: https://issues.apache.org/jira/browse/KAFKA-7777
>             Project: Kafka
>          Issue Type: Wish
>          Components: streams
>            Reporter: Maarten
>            Priority: Minor
>              Labels: needs-kip
>
> It would be valuable to us to have the encoding format in a Kafka topic 
> decoupled from the encoding format used to cache the data locally in a Kafka 
> Streams app. 
> We would like to use the `range()` function in the interactive queries API to 
> look up a series of results, but can't with our encoding scheme due to our 
> keys being variable length.
> We use protobuf, but based on what I've read Avro, Flatbuffers and Cap'n 
> proto have similar problems.
> Currently we use the following code to work around this problem:
> {code}
> builder
>     .stream("input-topic", Consumed.with(inputKeySerde, inputValueSerde))
>     .to("intermediate-topic",
>         Produced.with(intermediateKeySerde, intermediateValueSerde));
>
> t1 = builder
>     .table("intermediate-topic",
>         Consumed.with(intermediateKeySerde, intermediateValueSerde),
>         t1Materialized);
> {code}
> With the encoding formats decoupled, the code above could be reduced to a 
> single step, not requiring an intermediate topic.
> Based on feedback on my [SO 
> question|https://stackoverflow.com/questions/53913571/is-there-a-way-to-separate-kafka-topic-serdes-from-materialized-serdes]
>  a change that introduces this would impact state restoration when using an 
> input topic for recovery.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
