Hello again,

Could someone please provide feedback on these findings?

Thank you in advance for your feedback.
*Regards,*
*Jigar*

On Mon, 17 Jan 2022 at 13:24, Jigar Shah <jigar.shah1...@gmail.com> wrote:

> Hello again,
> I performed a few more tests on the producer and consumer, and I
> observed a pattern in which the Kafka producer creates large latency.
> Could you please confirm that my understanding of the producer
> protocol is correct?
>
> The configurations are the same as above.
>
> The producer continuously produces messages into a Kafka topic, using
> the default producer partitioner, which assigns messages to random
> topic-partitions.
>
> The workflow of the protocol, according to my understanding, is:
> 1. First connection from the producer to a broker (1 out of 3) in the
>    cluster to fetch metadata.
> 2. If the partition to produce to is located on the same broker, then
>    a. Re-use the existing connection to produce messages.
> 3. Else, if the partition to produce to is located on one of the other
>    brokers, then
>    a. Create a new connection.
>    b. Fetch metadata again.
>    c. Produce the message using the new connection.
>
> After analysis, I assume the latency is caused at steps *3.a and 3.b*, when
> the partition selected is on one of the other two brokers. Such peaks are
> observed during the initial part of the test only.
> [image: image.png]
> Thank you in advance for feedback.
>
> *Regards,*
> *Jigar*
>
> On Wed, 15 Dec 2021 at 10:53, Jigar Shah <jigar.shah1...@gmail.com> wrote:
>
>> Hello,
>> I agree about the time taken by the consumer initialization processes.
>> But in the test I am actually taking care of that: I wait for
>> the consumer to be initialized and only then start the producer, to
>> discount the initialization delay.
>> So, are there any more processes happening during the consumer's poll
>> for the first few messages?
>>
>> Thank you
>>
>> On Mon, 13 Dec 2021 at 18:33, Luke Chen <show...@gmail.com> wrote:
>>
>>> Hi Jigar,
>>>
>>> As Liam mentioned, those are necessary consumer initialization processes.
>>> So, I don't think you can speed it up by altering some timeout/interval
>>> properties.
>>> Is there any reason why you need to care about the initial delay?
>>> If, like you said, the delay won't happen later on, I think the cost will
>>> be amortized.
>>>
>>> Thank you.
>>> Luke
>>>
>>> On Mon, Dec 13, 2021 at 4:59 PM Jigar Shah <jigar.shah1...@gmail.com>
>>> wrote:
>>>
>>> > Hello,
>>> > Answering your first mail: indeed, I am using consumer groups with
>>> > group.id; I must have forgotten to add it to the listed properties.
>>> > Also, thank you for the information regarding the internal processes
>>> > behind creating a KafkaConsumer.
>>> > I agree that the following steps do add latency during initial
>>> > connection creation. But can it somehow be optimised (reduced) by
>>> > altering some timeout/interval properties? Could you please suggest
>>> > those?
>>> >
>>> > Thank you
>>> >
>>> > On Mon, 13 Dec 2021 at 12:05, Liam Clarke-Hutchinson
>>> > <lclar...@redhat.com> wrote:
>>> >
>>> > > I realise that's a silly question, you must be if you're using auto
>>> > > commit.
>>> > >
>>> > > When a consumer starts, it needs to do a few things.
>>> > >
>>> > > 1) Connect to a bootstrap server
>>> > >
>>> > > 2) Join an existing consumer group, or create a new one if it doesn't
>>> > > exist. This may cause a stop-the-world rebalance as partitions are
>>> > > reassigned within the group.
>>> > >
>>> > > 3) Acquire metadata - which brokers are the partition leaders for my
>>> > > assigned partitions on? And what offsets am I consuming from?
>>> > >
>>> > > 4) Establish the long-lived connections to those brokers.
>>> > >
>>> > > 5) Send fetch requests
>>> > >
>>> > > (I might not have the order correct)
>>> > >
>>> > > So yeah, this is why you're seeing that initial delay before consuming
>>> > > records.
>>> > >
>>> > > Kind regards,
>>> > >
>>> > > Liam Clarke-Hutchinson
>>> > >
>>> > > On Mon, 13 Dec 2021, 7:19 pm Liam Clarke-Hutchinson,
>>> > > <lclar...@redhat.com> wrote:
>>> > >
>>> > > > Hi,
>>> > > >
>>> > > > I'm assuming you're using consumer groups? E.g., group.id=X
>>> > > >
>>> > > > Cheers,
>>> > > >
>>> > > > Liam
>>> > > >
>>> > > > On Mon, 13 Dec 2021, 6:30 pm Jigar Shah, <jigar.shah1...@gmail.com>
>>> > > > wrote:
>>> > > >
>>> > > >> Hello,
>>> > > >> I am trying to test the latency between message production and
>>> > > >> message consumption using the Java Kafka-Client *(2.7.2)* library.
>>> > > >> The cluster configuration is 3 Kafka brokers *(2.7.2, Scala 2.13)*
>>> > > >> and 3 ZooKeeper nodes *(3.5.9)*.
>>> > > >> Here is the pattern I have observed.
>>> > > >> Reference:
>>> > > >> ConsumerReadTimeStamp: timestamp when the record is received in the
>>> > > >> Kafka consumer
>>> > > >> ProducerTimeStamp: timestamp added before producer.send of the record
>>> > > >> RecordTimeStamp: CreateTimeStamp inside the record, obtained at the
>>> > > >> consumer
>>> > > >>
>>> > > >> [image: kafka1.png]
>>> > > >>
>>> > > >> For 100 messages | ConsumerReadTimeStamp-ProducerTimeStamp (ms) | ConsumerReadTimeStamp-RecordTimeStamp (ms)
>>> > > >> Average          | 252.56                                       | 238.85
>>> > > >> Max              | 2723                                         | 2016
>>> > > >> Min              | 125                                          | 125
>>> > > >>
>>> > > >> On the consumer side, the first few messages take too much time,
>>> > > >> but later on it is quite consistent.
>>> > > >> I have executed the same test for larger numbers of messages
>>> > > >> (100, 1000, 10000, etc.) and the pattern seems to be the same.
>>> > > >> Here are the configurations, mostly using default properties.
>>> > > >> Topic:
>>> > > >> partitions=16
>>> > > >> min.insync.replicas=2
>>> > > >> replication.factor=3
>>> > > >>
>>> > > >> Consumer:
>>> > > >> security.protocol=PLAINTEXT
>>> > > >> enable.auto.commit=true
>>> > > >>
>>> > > >> Producer:
>>> > > >> security.protocol=PLAINTEXT
>>> > > >> compression.type=gzip
>>> > > >> acks=all
>>> > > >>
>>> > > >> Is there any reason why there is huge latency at the beginning, when
>>> > > >> a consumer is created?
>>> > > >> Also, could you please suggest some way to optimise the
>>> > > >> configuration to get more consistent results?
>>> > > >>
>>> > > >> Thank you in advance for your feedback.
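
For comparing the start-up peaks with the steady-state numbers, here is a minimal sketch of how the Average/Max/Min figures in the thread could be computed with and without the first few records. It assumes the two timestamps (ProducerTimeStamp and ConsumerReadTimeStamp) have already been collected per record in send order. The class name `LatencyStats`, the `warmUp` parameter, and the timestamps in `main` are made up for illustration and are not measurements from the tests above.

```java
import java.util.LongSummaryStatistics;

/** Hypothetical helper: summarises end-to-end latency, optionally skipping warm-up records. */
final class LatencyStats {

    /**
     * Builds count/min/max/average over (consumerReadTs[i] - producerTs[i]),
     * ignoring the first warmUp records. All timestamps are in milliseconds.
     */
    static LongSummaryStatistics summarize(long[] producerTs, long[] consumerReadTs, int warmUp) {
        if (producerTs.length != consumerReadTs.length) {
            throw new IllegalArgumentException("timestamp arrays must have the same length");
        }
        LongSummaryStatistics stats = new LongSummaryStatistics();
        for (int i = Math.max(0, warmUp); i < producerTs.length; i++) {
            stats.accept(consumerReadTs[i] - producerTs[i]);
        }
        return stats;
    }

    public static void main(String[] args) {
        // Illustrative values only (assumption): two slow records at start-up, then steady.
        long[] sent = {0, 10, 20, 30, 40};
        long[] read = {2000, 910, 150, 160, 165};

        System.out.println("all records:  " + summarize(sent, read, 0));
        System.out.println("after warmup: " + summarize(sent, read, 2));
    }
}
```

Reporting both summaries side by side makes it easier to see whether the large Max values come only from the first records or also occur later in the run.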