gharris1727 opened a new pull request, #15154:
URL: https://github.com/apache/kafka/pull/15154

   The single ProcessingContext in the RetryWithToleranceOperator is subject to 
a data race when interacted with across multiple threads. Specifically, when 
the main task thread is converting records, an asynchronous error (from 
producer or errant record reporter) can intervene, and cause a non-problematic 
record to be incorrectly dropped silently.
   
   To fix this, make ProcessingContext a per-record object which is passed 
between threads (task-main-thread -> producer io thread/task-main-thread -> 
task internal thread) but never accessed concurrently. This has a number of 
benefits:
   
   1. No locking on the happy-path, where before every RWTO operation was 
synchronized (failures are still synchronized to collect the totalFailures & 
report errors)
   2. Less mutability in the ProcessingContext which is easier to reason about
   3. Easier to add parallelism to the transformation chain in the future
   
   This comes at the cost of additional memory pressure from allocating more 
ProcessingContext objects, which should be negligible compared to the 
ConnectRecords themselves, and comparable to the existing InternalSinkRecord.
   
   ### Committer Checklist (excluded from commit message)
   - [ ] Verify design and implementation 
   - [ ] Verify test coverage and CI build status
   - [ ] Verify documentation (including upgrade notes)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Reply via email to