> On April 14, 2016, 8:43 p.m., Chris Pettitt wrote: > > samza-kafka/src/main/scala/org/apache/samza/system/kafka/KafkaSystemProducer.scala, > > line 44 > > <https://reviews.apache.org/r/45258/diff/1/?file=1312769#file1312769line44> > > > > Is 30 arbitrary or is there some significance to the number?
The choice of 30 was arbitrary. The goal is to abandon retrying after some number of retries. (if we keep getting exceptions due to broker side errors). > On April 14, 2016, 8:43 p.m., Chris Pettitt wrote: > > samza-kafka/src/main/scala/org/apache/samza/system/kafka/KafkaSystemProducer.scala, > > line 71 > > <https://reviews.apache.org/r/45258/diff/1/?file=1312769#file1312769line71> > > > > This is not directly related to your commit, but: > > > > This function is not reentrant. Is that expected? For example, it would > > be possible for a the Kafka callback to set sendFailed to true but a > > subsequent send would reset this flag. > > > > > > --- > > > > Separately, there is an implicit assumption that retryBackoff is > > providing a happens-before constraint between an invocation of the > > exception handler in retryBackoff and a subsequent invocation of the loop. > > This likely always holds, but a CAS for numRetries would cover cases where > > that does not hold. Yes, that's true. 1.It's entirely possible for a Kafka callback to set sendFailed to true, and a subsequent send to reset this flag. Ideally, errors in the callbacks should shut-down the producer in the same callback thread. 2. It's also possible for a subsequent send (for msg2) to be processed even if the send for the previous message (msg1) fails. This is the race in the producer that Navina mentioned in her review of this RB. (and requested that a separate jira be created to track). SAMZA-934 tracks this. - Jagadish ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/45258/#review128972 ----------------------------------------------------------- On March 24, 2016, 1:10 a.m., Jagadish Venkatraman wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/45258/ > ----------------------------------------------------------- > > (Updated March 24, 2016, 1:10 a.m.) > > > Review request for samza, Jake Maes, Navina Ramesh, and Yi Pan (Data > Infrastructure). > > > Repository: samza > > > Description > ------- > > Currently, the KafkaSystemProducer's producer loop keeps retrying > indefinitely when there is an exception in the retryBackOff loop. If there > are repeated exceptions, then it makes sense to retry for awhile, and then > fail the container. > > Long term, we should focus on getting rid off the retryBackOff loop, and > close the producer object in the callback during failure. This will guarantee > in-order delivery. > > 1.Modified the KafkaSystemProducer to take a maxRetries. (currently, its set > to 30). > 2.Add tests to verify retry in case of RetriableExceptions. > > > Diffs > ----- > > > samza-kafka/src/main/scala/org/apache/samza/system/kafka/KafkaSystemProducer.scala > 9a44d46d29a1997958a9d2bbf7be0bde860fff64 > > samza-kafka/src/test/scala/org/apache/samza/system/kafka/TestKafkaSystemProducer.scala > 39426d8cf64516ec4fdc0cb4ff60b1df3a757470 > > Diff: https://reviews.apache.org/r/45258/diff/ > > > Testing > ------- > > Added unit tests to verify functionality. > > > Thanks, > > Jagadish Venkatraman > >