-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/45258/
-----------------------------------------------------------

(Updated May 11, 2016, 5:51 p.m.)


Review request for samza, Jake Maes, Navina Ramesh, and Yi Pan (Data 
Infrastructure).


Bugs: SAMZA-911
    https://issues.apache.org/jira/browse/SAMZA-911


Repository: samza


Description (updated)
-------

Currently, the KafkaSystemProducer's send loop retries indefinitely when an 
exception occurs inside the retryBackOff loop. This is problematic because it 
completely stalls the Samza container (which is currently single-threaded). 
We've observed multiple jobs stalled this way by transient Kafka broker-side 
errors.

If there are repeated exceptions, then it makes sense to retry for a while, and 
then fail the container.

Long-term fix: We should focus on getting rid of the retryBackOff loop, and 
close the producer object in the callback during failure. Closing the producer 
object in the callback-handler thread will guarantee in-order delivery. 
(SAMZA-934)

1. Modified the KafkaSystemProducer to take a maxRetries parameter (currently 
   set to 30).
2. Added tests to verify retry behavior in case of RetriableExceptions.
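
For readers without the diff at hand, the bounded-retry idea can be sketched 
roughly as follows. This is a minimal illustration, not the code in the patch: 
BoundedRetry, sendWithRetries, backoffMs and TransientSendException are 
hypothetical names, and TransientSendException merely stands in for Kafka's 
RetriableException so that the sketch has no dependencies.

```scala
import scala.annotation.tailrec

// Hypothetical stand-in for Kafka's RetriableException (assumption, keeps the sketch dependency-free).
class TransientSendException(msg: String) extends RuntimeException(msg)

object BoundedRetry {
  /** Runs `send`, retrying on TransientSendException at most `maxRetries` times.
    * Once the retries are exhausted, the last failure is rethrown wrapped in a
    * RuntimeException so the caller (here, the container) can fail fast.
    * Any other exception type propagates immediately without a retry. */
  def sendWithRetries[T](maxRetries: Int, backoffMs: Long = 0L)(send: () => T): T = {
    @tailrec
    def attempt(retriesSoFar: Int): T = {
      // Capture the outcome first so the recursive call stays outside try/catch.
      val outcome: Either[TransientSendException, T] =
        try Right(send())
        catch { case e: TransientSendException => Left(e) }

      outcome match {
        case Right(value) => value
        case Left(e) if retriesSoFar >= maxRetries =>
          throw new RuntimeException(s"Send failed after $maxRetries retries", e)
        case Left(_) =>
          if (backoffMs > 0) Thread.sleep(backoffMs)
          attempt(retriesSoFar + 1)
      }
    }
    attempt(0)
  }
}
```

With maxRetries = 30, as in item 1 above, a send that keeps failing with a 
retriable error would be attempted 31 times in total (one initial attempt plus 
30 retries) before the exception surfaces.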


Diffs
-----

  
samza-kafka/src/main/scala/org/apache/samza/system/kafka/KafkaSystemProducer.scala
 9a44d46d29a1997958a9d2bbf7be0bde860fff64 
  
samza-kafka/src/test/scala/org/apache/samza/system/kafka/TestKafkaSystemProducer.scala
 39426d8cf64516ec4fdc0cb4ff60b1df3a757470 

Diff: https://reviews.apache.org/r/45258/diff/


Testing
-------

Added unit tests to verify functionality.


Thanks,

Jagadish Venkatraman
