Hello Victor,

thanks for reaching out. I am not sure if what you propose could be implemented easily. The problem is that records might be partially processed when an error happens: before we can figure out the correct offset to commit, KS needs to flush, and after we hit an error, flushing cleanly might not be possible any longer.

Besides the above issue, w/o EOS (ie, Kafka transactions) there are all kinds of other scenarios that could lead to duplicate output, or to reprocessing an input record a second time. So even if we can implement what you propose (we would need to investigate in more detail whether it is possible or not), it could only reduce duplicates, not fully eliminate them. Just want to make sure you are aware of this.

So if you are really interested in contributing such a feature, and you can figure out how to do it correctly, yes, please write a KIP about it :) We can also try to help you with this investigation (it might require some POC PR...); however, not sure how much time we can find atm to support you, TBH.

However, I am wondering what kind of performance you need, and why EOS is not able to deliver it. There are many configs, so maybe there is a way to tune your app to make it work with EOS?
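
For example, just as a rough sketch of where I would start (the concrete values are made up and would need to be tuned for your workload):

import java.util.Properties;

import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.streams.StreamsConfig;

public class EosTuningSketch {
    public static Properties eosTunedConfig() {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "my-eos-app");      // placeholder
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "broker:9092");  // placeholder
        // exactly_once_v2 (KIP-447) uses one transactional producer per thread
        // and is much cheaper than the old "exactly_once" guarantee
        props.put(StreamsConfig.PROCESSING_GUARANTEE_CONFIG, StreamsConfig.EXACTLY_ONCE_V2);
        // fewer but larger transactions: trades latency for throughput
        // (the default commit interval under EOS is 100ms)
        props.put(StreamsConfig.COMMIT_INTERVAL_MS_CONFIG, 1000);
        // producer batching helps to amortize the per-transaction overhead
        props.put(StreamsConfig.producerPrefix(ProducerConfig.LINGER_MS_CONFIG), 50);
        props.put(StreamsConfig.producerPrefix(ProducerConfig.BATCH_SIZE_CONFIG), 128 * 1024);
        return props;
    }
}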


-Matthias


On 9/3/25 10:50 AM, Victor Osorio wrote:
Hello everyone,

We’re currently using Kafka Streams to process transactional data with *exactly-once semantics (EOS)*. However, for some of our workloads, we require higher throughput, which makes EOS impractical.

To ensure data integrity, we rely on UncaughtExceptionHandler and ProductionExceptionHandler to halt stream processing upon any exception. This prevents data loss but introduces a new challenge: when a thread stops due to an exception, it doesn’t commit the records that were already successfully processed. As a result, when the stream restarts, those records are reprocessed, leading to duplication.
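
For context, our current fail-fast setup is roughly the following sketch (class and method names are simplified; the real code also does logging and alerting):

import java.util.Map;
import java.util.Properties;

import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.Topology;
import org.apache.kafka.streams.errors.ProductionExceptionHandler;
import org.apache.kafka.streams.errors.StreamsUncaughtExceptionHandler.StreamThreadExceptionResponse;

public class FailFastSetup {

    // Fail on any send error so the thread halts instead of dropping the record.
    public static class FailProductionExceptionHandler implements ProductionExceptionHandler {
        @Override
        public ProductionExceptionHandlerResponse handle(ProducerRecord<byte[], byte[]> record,
                                                         Exception exception) {
            return ProductionExceptionHandlerResponse.FAIL;
        }

        @Override
        public void configure(Map<String, ?> configs) { }
    }

    public static KafkaStreams failFastStreams(Topology topology, Properties props) {
        props.put(StreamsConfig.DEFAULT_PRODUCTION_EXCEPTION_HANDLER_CLASS_CONFIG,
                  FailProductionExceptionHandler.class);
        KafkaStreams streams = new KafkaStreams(topology, props);
        // Shut the whole application down on any unhandled processing exception.
        // Records processed since the last commit are re-read after a restart,
        // which is exactly the duplication problem described above.
        streams.setUncaughtExceptionHandler(
            exception -> StreamThreadExceptionResponse.SHUTDOWN_APPLICATION);
        return streams;
    }
}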

While reviewing the discussion around KIP-1033, I noticed the suggestion to avoid exposing commit functionality in the Kafka Streams API (https://lists.apache.org/thread/k4v0737tqjdnq5vl3yp9rjr4qzqoo306). That makes sense in many contexts, but I’d like to revisit a related idea:

*Could we introduce a new shutdown mechanism, perhaps a “Graceful Shutdown” API, that commits all successfully processed records while skipping the one that caused the failure?*
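
Purely as a hypothetical sketch to make the idea concrete (none of this exists in the current API, and the names are invented):

// Given a running KafkaStreams instance "streams":
// a hypothetical new response type that commits everything that was fully
// processed, leaves the failing record's offset uncommitted, and then stops.
streams.setUncaughtExceptionHandler(
    exception -> StreamThreadExceptionResponse.COMMIT_PROCESSED_AND_SHUTDOWN_APPLICATION);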

This would allow us to maintain data integrity without sacrificing throughput or introducing duplicates. I’m curious to hear your thoughts:

  * Would this be possible to implement with current Kafka Streams APIs?
  * Would it be possible, or desirable, to add this as a Kafka Streams
    feature in a future release? If yes, we can open a KIP.

Looking forward to your insights and feedback.

Best regards,
Victor Osório


