Maybe you can reduce the *batch.size* and *linger.ms* values and see if the issue still happens. Also check your *acks* setting, since it affects how long the producer waits for acknowledgements from the brokers before a request is considered complete.
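
For reference, a minimal sketch of how those producer settings might be wired up in Java. The values below are only illustrative starting points, not tuned recommendations, and the bootstrap server and topic name are made up:

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class ProducerTuningSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "broker1:9092"); // placeholder
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());

        // Smaller batches and a shorter linger so records spend less time
        // sitting in the accumulator before being sent.
        props.put(ProducerConfig.BATCH_SIZE_CONFIG, 16384); // down from 65536
        props.put(ProducerConfig.LINGER_MS_CONFIG, 10);     // down from 100

        // acks=1 waits only for the partition leader; acks=all waits for
        // all in-sync replicas and therefore adds latency.
        props.put(ProducerConfig.ACKS_CONFIG, "1");

        props.put(ProducerConfig.REQUEST_TIMEOUT_MS_CONFIG, 60000);

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            producer.send(new ProducerRecord<>("test-topic", "key", "value"),
                (metadata, exception) -> {
                    if (exception != null) {
                        // e.g. TimeoutException: Expiring record(s)
                        exception.printStackTrace();
                    }
                });
        }
    }
}
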
On Wed, Sep 16, 2020 at 10:56 AM Navneeth Krishnan <reachnavnee...@gmail.com> wrote:

> Hi Shaohan,
>
> Thanks for your inputs. I did look at request total time and request queue
> time metrics and it looks to be normal. The broker was restarted a couple
> times and I saw some spikes during that time alone. The kafka broker logs
> don't have any errors at all. The issue started around 11:30 and it's still
> happening.
>
> [image: image.png]
> [image: image.png]
>
> Thanks
>
> On Tue, Sep 15, 2020 at 7:17 PM Shaohan Yin <shaohan....@gmail.com> wrote:
>
>> You could check if there are any performance issues on the broker.
>>
>> Broker metrics such as kafka_network_requestmetrics_totaltimems and
>> kafka_network_requestmetrics_requestqueuetimems might do you a favor.
>>
>> And your acks config would also affect the e2e latency
>>
>> On Wed, 16 Sep 2020 at 05:08, Navneeth Krishnan <reachnavnee...@gmail.com>
>> wrote:
>>
>> > Hi All,
>> >
>> > We are running kafka in production with 20 brokers and version 2.3.1. I see
>> > the below errors frequently happening and here is the producer
>> > configuration. Need some urgent help on what could be done to resolve this
>> > issue.
>> >
>> > batch.size: 65536
>> > linger.ms: 100
>> > request.timeout.ms: 60000
>> >
>> > org.apache.kafka.common.errors.TimeoutException: Expiring record(s)
>> >
>> > org.apache.kafka.common.errors.NetworkException: The server disconnected
>> > before a response was received.
>> >
>> > org.apache.kafka.common.errors.NetworkException: The server disconnected
>> > before a response was received.. Going to request metadata update now
>> >
>> > Thanks
>> >
>>