[ https://issues.apache.org/jira/browse/KAFKA-8037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17166602#comment-17166602 ]
Guozhang Wang commented on KAFKA-8037: -------------------------------------- Per the interaction: I'm actually suggesting that the source-topic reuse would NOT be part of the topology optimization moving forward and are ONLY be controlled by the per-store APIs, and hence it would not be related to the `topology.optimization` any more. Per the serde: the current rule is, if `Consumed` is specified only, use that for both source topic and state store, and similarly if `Materialized` is specified only, use that for both source topic and state store; if both are specified, `Consumed` serde will override `Materialized` serde. So today we are guaranteed that only one serde will be used. My previous comment was not very clear, what I meant is that for IQ we just use the only serde, whatever it is, to deserialize the bytes (note they are just raw bytes from source topics directly). > KTable restore may load bad data > -------------------------------- > > Key: KAFKA-8037 > URL: https://issues.apache.org/jira/browse/KAFKA-8037 > Project: Kafka > Issue Type: Improvement > Components: streams > Reporter: Matthias J. Sax > Priority: Minor > Labels: pull-request-available > > If an input topic contains bad data, users can specify a > `deserialization.exception.handler` to drop corrupted records on read. > However, this mechanism may be by-passed on restore. Assume a > `builder.table()` call reads and drops a corrupted record. If the table state > is lost and restored from the changelog topic, the corrupted record may be > copied into the store, because on restore plain bytes are copied. > If the KTable is used in a join, an internal `store.get()` call to lookup the > record would fail with a deserialization exception if the value part cannot > be deserialized. > GlobalKTables are affected, too (cf. KAFKA-7663 that may allow a fix for > GlobalKTable case). It's unclear to me atm, how this issue could be addressed > for KTables though. > Note, that user state stores are not affected, because they always have a > dedicated changelog topic (and don't reuse an input topic) and thus the > corrupted record would not be written into the changelog. -- This message was sent by Atlassian Jira (v8.3.4#803005)