Hello Liam,
Here is the image. I hope it is accessible now.
*Regards,*
*Jigar*

On Fri, 28 Jan 2022 at 15:04, Liam Clarke-Hutchinson <lclar...@redhat.com> wrote:

> Hi Jigar,
>
> Your image attachment didn't come through again.
>
> Thanks,
>
> Liam
>
> On Fri, 28 Jan 2022, 5:35 pm Jigar Shah, <jigar.shah1...@gmail.com> wrote:
>
> > Hello again,
> > Could someone please provide feedback on these findings?
> > Thank you in advance for feedback.
> >
> > *Regards,*
> > *Jigar*
> >
> > On Mon, 17 Jan 2022 at 13:24, Jigar Shah <jigar.shah1...@gmail.com> wrote:
> >
> >> Hello again,
> >> I performed a few more tests on the producer and consumer, and I
> >> observed a pattern in which the Kafka producer creates large latency.
> >> Could you please confirm that my understanding of the producer
> >> protocol is correct?
> >>
> >> The configurations are the same as above.
> >>
> >> The producer is continuously producing messages into a Kafka topic, using
> >> the default producer partitioner, which places messages in random
> >> topic-partitions.
> >>
> >> The workflow of the protocol, according to my understanding, is:
> >> 1. First connection from the producer to a broker (1 out of 3) in the
> >>    cluster to fetch metadata.
> >> 2. If the partition to produce to is located on the same broker, then
> >>    a. Re-use the existing connection to produce messages.
> >> 3. Else, if the partition to produce to is located on one of the other
> >>    brokers, then
> >>    a. Create a new connection.
> >>    b. Fetch metadata again.
> >>    c. Produce the message using the new connection.
> >>
> >> After analysis, I assume the latency is caused at steps *3.a and 3.b* when
> >> the selected partition is on one of the other two brokers. Such peaks are
> >> observed only during the initial part of the test.
> >> [image: image.png]
> >> Thank you in advance for feedback.
> >>
> >> *Regards,*
> >> *Jigar*
> >>
> >> On Wed, 15 Dec 2021 at 10:53, Jigar Shah <jigar.shah1...@gmail.com> wrote:
> >>
> >>> Hello,
> >>> I agree about the time taken by the consumer initialization processes.
> >>> But in the test I am actually taking care of that: I wait for the
> >>> consumer to be initialized and only then start the producer, to
> >>> discount the initialization delay.
> >>> So, are there any more processes happening during the consumer's poll
> >>> for the first few messages?
> >>>
> >>> Thank you
> >>>
> >>> On Mon, 13 Dec 2021 at 18:33, Luke Chen <show...@gmail.com> wrote:
> >>>
> >>>> Hi Jigar,
> >>>>
> >>>> As Liam mentioned, those are necessary consumer initialization
> >>>> processes.
> >>>> So, I don't think you can speed it up by altering some timeout/interval
> >>>> properties.
> >>>> Is there any reason why you need to care about the initial delay?
> >>>> If, like you said, the delay won't happen later on, I think the cost
> >>>> will be amortized.
> >>>>
> >>>> Thank you.
> >>>> Luke
> >>>>
> >>>> On Mon, Dec 13, 2021 at 4:59 PM Jigar Shah <jigar.shah1...@gmail.com> wrote:
> >>>>
> >>>> > Hello,
> >>>> > Answering your first mail: indeed, I am using consumer groups via
> >>>> > group.id; I must have forgotten to add it to the properties I listed.
> >>>> > Also, thank you for the information regarding the internal processes
> >>>> > at work behind creating a KafkaConsumer.
> >>>> > I agree that these steps do add latency during initial connection
> >>>> > creation. But can it somehow be optimised (reduced) by altering some
> >>>> > timeout/interval properties? Could you please suggest those?
> >>>> >
> >>>> > Thank you
> >>>> >
> >>>> > On Mon, 13 Dec 2021 at 12:05, Liam Clarke-Hutchinson <lclar...@redhat.com> wrote:
> >>>> >
> >>>> > > I realise that's a silly question, you must be if you're using auto
> >>>> > > commit.
> >>>> > >
> >>>> > > When a consumer starts, it needs to do a few things.
> >>>> > >
> >>>> > > 1) Connect to a bootstrap server
> >>>> > >
> >>>> > > 2) Join an existing consumer group, or create a new one if it doesn't
> >>>> > > exist. This may cause a stop-the-world rebalance as partitions are
> >>>> > > reassigned within the group.
> >>>> > >
> >>>> > > 3) Acquire metadata - which brokers are the leaders for my assigned
> >>>> > > partitions? And what offsets am I consuming from?
> >>>> > >
> >>>> > > 4) Establish the long-lived connections to those brokers.
> >>>> > >
> >>>> > > 5) Send fetch requests
> >>>> > >
> >>>> > > (I might not have the order correct)
> >>>> > >
> >>>> > > So yeah, this is why you're seeing that initial delay before consuming
> >>>> > > records.
> >>>> > >
> >>>> > > Kind regards,
> >>>> > >
> >>>> > > Liam Clarke-Hutchinson
> >>>> > >
> >>>> > > On Mon, 13 Dec 2021, 7:19 pm Liam Clarke-Hutchinson, <lclar...@redhat.com> wrote:
> >>>> > >
> >>>> > > > Hi,
> >>>> > > >
> >>>> > > > I'm assuming you're using consumer groups? E.g., group.id=X
> >>>> > > >
> >>>> > > > Cheers,
> >>>> > > >
> >>>> > > > Liam
> >>>> > > >
> >>>> > > > On Mon, 13 Dec 2021, 6:30 pm Jigar Shah, <jigar.shah1...@gmail.com> wrote:
> >>>> > > >
> >>>> > > >> Hello,
> >>>> > > >> I am trying to test the latency between message production and message
> >>>> > > >> consumption using the Java Kafka-Client*(2.7.2)* library.
> >>>> > > >> The cluster configuration is 3 Kafka brokers*(2.7.2, Scala 2.13)* and 3
> >>>> > > >> Zookeeper*(3.5.9)*.
> >>>> > > >> Here is the pattern I have observed.
> >>>> > > >> Reference:
> >>>> > > >> ConsumerReadTimeStamp: Timestamp when the record is received in the Kafka consumer
> >>>> > > >> ProducerTimeStamp: Timestamp added before producer.send of the record
> >>>> > > >> RecordTimeStamp: CreateTimeStamp inside the record, obtained at the consumer
> >>>> > > >>
> >>>> > > >> [image: kafka1.png]
> >>>> > > >>
> >>>> > > >> *For 100 Messages*
> >>>> > > >>
> >>>> > > >>          ConsumerReadTimeStamp-ProducerTimeStamp(ms)  ConsumerReadTimeStamp-RecordTimeStamp(ms)
> >>>> > > >> Average  252.56                                       238.85
> >>>> > > >> Max      2723                                         2016
> >>>> > > >> Min      125                                          125
> >>>> > > >>
> >>>> > > >> On the consumer side it takes too much time for the initial few messages,
> >>>> > > >> but later on it is quite consistent.
> >>>> > > >> I have executed the same test for larger numbers of messages
> >>>> > > >> (100, 1000, 10000, etc.) and the pattern seems to be the same.
> >>>> > > >> Here are the configurations, mostly using default properties.
> >>>> > > >> Topic:
> >>>> > > >> partitions=16
> >>>> > > >> min.insync.replicas=2
> >>>> > > >> replication.factor=3
> >>>> > > >>
> >>>> > > >> Consumer:
> >>>> > > >> security.protocol=PLAINTEXT
> >>>> > > >> enable.auto.commit=true
> >>>> > > >>
> >>>> > > >> Producer:
> >>>> > > >> security.protocol=PLAINTEXT
> >>>> > > >> compression.type=gzip
> >>>> > > >> acks=all
> >>>> > > >>
> >>>> > > >> Is there any reason why there is a huge latency at the beginning, when a
> >>>> > > >> consumer is created?
> >>>> > > >> Also, could you please suggest some way to optimise the configuration to
> >>>> > > >> get more consistent results?
> >>>> > > >>
> >>>> > > >> Thank you in advance for your feedback.
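
For reference, the latency summary described in the original message (ConsumerReadTimeStamp minus ProducerTimeStamp, reported as average, max and min) can be sketched in plain Java as below. This is only a minimal sketch under stated assumptions: the class LatencyStats, its method names, the warm-up count and the sample timestamps are all hypothetical, and none of them come from the test in this thread or from the Kafka client API. It shows how skipping the first few records might separate the start-up cost from the steady-state latency.

```java
import java.util.Arrays;

/** Hypothetical helper for illustration; not part of the Kafka client. */
final class LatencyStats {

    /** Per-record latency in ms: consumer read time minus producer send time. */
    static long[] latencies(long[] producerTs, long[] consumerReadTs) {
        if (producerTs.length != consumerReadTs.length) {
            throw new IllegalArgumentException("timestamp arrays must have equal length");
        }
        long[] out = new long[producerTs.length];
        for (int i = 0; i < out.length; i++) {
            out[i] = consumerReadTs[i] - producerTs[i];
        }
        return out;
    }

    /** Returns {average, max, min} over the samples from index skip onward. */
    static double[] summarize(long[] latenciesMs, int skip) {
        if (skip < 0 || skip >= latenciesMs.length) {
            throw new IllegalArgumentException("skip must leave at least one sample");
        }
        long[] tail = Arrays.copyOfRange(latenciesMs, skip, latenciesMs.length);
        double avg = Arrays.stream(tail).average().orElse(Double.NaN);
        long max = Arrays.stream(tail).max().getAsLong();
        long min = Arrays.stream(tail).min().getAsLong();
        return new double[] {avg, max, min};
    }

    public static void main(String[] args) {
        // Synthetic timestamps in ms, made up purely for illustration.
        long[] produced = {1000, 1010, 1020, 1030, 1040};
        long[] consumed = {3000, 1500, 1150, 1160, 1165};
        long[] lat = latencies(produced, consumed);

        // Summary over every record, including the slow first ones.
        System.out.println("all records:   " + Arrays.toString(summarize(lat, 0)));
        // Summary after discarding an assumed warm-up of 2 records.
        System.out.println("after warm-up: " + Arrays.toString(summarize(lat, 2)));
    }
}
```

Reporting both summaries side by side keeps the start-up cost visible while still giving a steady-state figure; how many records to treat as warm-up is a judgment call that depends on the test.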