Since you enable idempotence, you should set `retries` to `Integer.MAX_VALUE` -- in newer versions, where `Integer.MAX_VALUE` is already the default, you can of course remove the config.
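The producer settings under discussion in this thread can be sketched as follows. This is only an illustration, not a definitive setup: the config keys are the ones named in the thread, but the class `ProducerConfigSketch`, the helper `findProblems`, and the `localhost:9092` address are hypothetical, and the "bounded retries" check simply encodes the advice given in this reply. It uses plain `java.util.Properties` so that it runs without the Kafka client on the classpath.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

/** Sketch only: the producer settings from this thread as plain string keys. */
final class ProducerConfigSketch {

    /** Settings suggested in the reply. The bootstrap address is a placeholder. */
    static Properties idempotentProducerProps() {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "localhost:9092"); // placeholder, not from the thread
        props.setProperty("enable.idempotence", "true");
        props.setProperty("acks", "all");
        // Unbounded retries; delivery.timeout.ms then bounds the total send time.
        props.setProperty("retries", String.valueOf(Integer.MAX_VALUE));
        // Values up to 5 keep ordering when idempotence is enabled.
        props.setProperty("max.in.flight.requests.per.connection", "4");
        return props;
    }

    /** Hypothetical helper: lists explicitly set values that conflict with the advice in this thread. */
    static List<String> findProblems(Properties props) {
        List<String> problems = new ArrayList<>();
        boolean idempotent =
                Boolean.parseBoolean(props.getProperty("enable.idempotence", "false"));
        if (!idempotent) {
            return problems; // the checks below only apply to idempotent producers
        }
        String acks = props.getProperty("acks");
        if (acks != null && !acks.equals("all") && !acks.equals("-1")) {
            problems.add("acks must be 'all' when idempotence is enabled");
        }
        String inFlight = props.getProperty("max.in.flight.requests.per.connection");
        if (inFlight != null && Integer.parseInt(inFlight) > 5) {
            problems.add("max.in.flight.requests.per.connection must be <= 5");
        }
        String retries = props.getProperty("retries");
        if (retries != null && Integer.parseInt(retries) < Integer.MAX_VALUE) {
            problems.add("retries is bounded; a long fail-over may surface as an exception");
        }
        return problems;
    }

    public static void main(String[] args) {
        // The configuration from the original question: retries = 1.
        Properties original = idempotentProducerProps();
        original.setProperty("retries", "1");
        System.out.println("original:  " + findProblems(original));
        System.out.println("suggested: " + findProblems(idempotentProducerProps()));
    }
}
```

To use these settings with a real producer, the same `Properties` object (plus `key.serializer` and `value.serializer`, omitted here) can be passed to the `KafkaProducer` constructor from the `kafka-clients` library.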
With idempotence enabled and retries unbounded, you get strict ordering guarantees, assuming that your topic is configured correctly, i.e., `min.insync.replicas=2` and `replication.factor=3` (to tolerate a single broker failure without losing data or availability).

With `retries=1`, I am not surprised that you get exceptions when a broker fail-over occurs.

> Allowing retries without setting `max.in.flight.requests.per.connection` to
> `1` will potentially change the ordering of records...

This applies only if idempotence is disabled. Hence, you can leave `max.in.flight.requests.per.connection` at `4` and still have ordering guarantees.

-Matthias

On 11/8/19 9:36 AM, M. Manna wrote:
> Hi,
>
> On Fri, 8 Nov 2019 at 17:19, Jose Manuel Vega Monroy <
> jose.mon...@williamhill.com> wrote:
>
>> Hi there,
>>
>> I have a question about message order and retries.
>>
>> After checking official documentation, and asking your feedback, we set
>> this kafka client configuration in each producer:
>>
>> retries = 1
>>
>> # note to ensure order enable.idempotence=true, which forcing to
>> acks=all and max.in.flight.requests.per.connection<=5
>>
>> enable.idempotence = true
>>
>> max.in.flight.requests.per.connection = 4
>>
>> acks = "all"
>>
>
> The documentation also says:
>
>> Allowing retries without setting max.in.flight.requests.per.connection to
>> 1 will potentially change the ordering of records because if two batches
>> are sent to a single partition, and the first fails and is retried but the
>> second succeeds, then the records in the second batch may appear first.
>> Note additionally that produce requests will be failed before the number of
>> retries has been exhausted if the timeout configured by
>> delivery.timeout.ms expires first before successful acknowledgement.
>> Users should generally prefer to leave this config unset and instead use
>> delivery.timeout.ms to control retry behavior.
>
> Are you planning to do it via delivery.timeout.ms?
>
>> However, somehow while rolling upgrade, we saw producer retrying a lot of
>> times (for example, 16 times), and finally sending fine when broker was up
>> and running back, with exceptions like this:
>>
>> Cause: org.apache.kafka.common.errors.OutOfOrderSequenceException: The
>> broker received an out of order sequence number..
>>
>> Cause: org.apache.kafka.common.errors.NotLeaderForPartitionException: This
>> server is not the leader for that topic-partition..
>>
>> Is that behaviour expected? It’s that retries configuration right trying
>> to ensure the message order, or maybe we should remove retries
>> configuration from our producers?
>>
>> As well we found this related to retries:
>>
>> The default value for the producer's retries config was changed to
>> Integer.MAX_VALUE, as we introduced delivery.timeout.ms in KIP-91, which
>> sets an upper bound on the total time between sending a record and
>> receiving acknowledgement from the broker. By default, the delivery timeout
>> is set to 2 minutes.
>>
>> Allowing retries without setting `max.in.flight.requests.per.connection` to
>> `1` will potentially change the ordering of records because if two batches
>> are sent to a single partition, and the first fails and is retried but the
>> second succeeds, then the records in the second batch may appear first. Note
>> additionally that produce requests will be failed before the number of
>> retries has been exhausted if the timeout configured by delivery.timeout.ms
>> expires first before successful acknowledgement. Users should generally
>> prefer to leave this config unset and instead use delivery.timeout.ms to
>> control retry behavior.
>>
>> Note this was faced while rolling upgrade from 2.1.1 to 2.2.1.
>>
>> Thanks
>>
>> *Jose Manuel Vega Monroy*
>> *Java Developer / Software Developer Engineer in Test*