Hi Imcom Jin, Thanks for your question!
It is expected behavior that Connect's internal topics are read completely
from the beginning each time the worker starts, regardless of the
auto.offset.reset configuration [1]. This is because they are compacted
topics, and the earliest messages in the topic may still be necessary for
correctness. For example, if a worker only reads from the latest offset of
the status topic, it may not know the status of long-running stable tasks.

If you want to reduce the startup time, I suggest reducing the segment
rolling configurations [2,3] for the internal topics. This will permit Kafka
to compact away the duplicate status messages sooner, preventing them from
being read on a future startup.

This was previously reported [4] but we have not yet changed the default.

I hope this helps,
Greg

[1] https://github.com/apache/kafka/blob/c4fb1008c4856c8cf9594269c86323753e6860ce/connect/runtime/src/main/java/org/apache/kafka/connect/util/KafkaBasedLog.java#L274-L278
[2] https://kafka.apache.org/documentation/#topicconfigs_segment.bytes
[3] https://kafka.apache.org/documentation/#topicconfigs_segment.ms
[4] https://issues.apache.org/jira/browse/KAFKA-15086

On Thu, Aug 14, 2025 at 9:44 AM Imcom JIN <imcom....@nexusguard.com> wrote:

> Hi dear Kafka team,
>
> I see that no matter what properties I give to the connector, the offset
> reset config for internal topics, especially the offset storage topic, say
> my-connect-offsets, always uses "earliest", which leads to very long
> bootstrap time during restart or stuck workers.
>
> Log sample and config sample printed in the log:
>
> 2025-08-12 10:10:45,531 INFO [Consumer
> clientId=cbdhk04-data-cluster-offsets, groupId=cbdhk04-data-cluster]
> Seeking to earliest offset of partition
>
> root@cbd:/usr/local/nxg/docker/kafka-connect# docker logs
> connect-replication-8085 | grep "auto.offset.reset = earliest" -C2
> auto.commit.interval.ms = 5000
> auto.include.jmx.reporter = true
> auto.offset.reset = earliest
>
> My connect-distributed.properties contains the following config:
>
> producer.override.auto.offset.reset=latest
> consumer.override.auto.offset.reset=latest
> producer.auto.offset.reset=latest
> consumer.auto.offset.reset=latest
> auto.offset.reset=latest
> connector.client.config.override.policy=All
>
> None of the above can change the behaviour of the consumer initialized by
> Connect to consume internal topics.
>
> What's the expected behaviour? How can I improve the bootstrap time for a
> heavy Connect cluster?
> What properties should I use to change the consumer config, if that is
> possible at all?
>
> Thanks in advance
>
> --
> *Imcom Jin*
> Software Engineer Manager, SEG
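
As an illustration of the reasoning in the reply at the top of this message,
here is a minimal Python sketch of a compacted status topic. It is a toy
model and not Connect's or Kafka's actual code: the names replay, compact,
and status_log, and the record values, are hypothetical. It assumes only
that the worker rebuilds state by keeping the last value per key, and that
Kafka compacts closed segments but never the active one, which is why
rolling segments sooner shortens the replay.

```python
from typing import Dict, List, Tuple

# A record is (key, value), e.g. ("task-0", "RUNNING"). Hypothetical data.
Record = Tuple[str, str]


def replay(log: List[Record], start: int = 0) -> Dict[str, str]:
    """Rebuild the latest value per key by reading the log from `start`."""
    state: Dict[str, str] = {}
    for key, value in log[start:]:
        state[key] = value  # later records overwrite earlier ones
    return state


def compact(log: List[Record], active_start: int) -> List[Record]:
    """Deduplicate by key within the closed part of the log only.

    Records at index >= active_start model the active segment, which
    is left untouched.
    """
    closed, active = log[:active_start], log[active_start:]
    last_index = {key: i for i, (key, _) in enumerate(closed)}
    kept = [rec for i, rec in enumerate(closed) if last_index[rec[0]] == i]
    return kept + active


status_log: List[Record] = [
    ("task-0", "RUNNING"),  # long-running stable task, written once
    ("task-1", "RUNNING"),
    ("task-1", "FAILED"),
    ("task-1", "RUNNING"),
]

# Reading from the beginning recovers every task's status.
from_earliest = replay(status_log)

# Starting near the end of the log misses task-0 entirely.
from_late_offset = replay(status_log, start=2)

# Nothing rolled yet: all records sit in the active segment, so none are removed.
not_rolled = compact(status_log, active_start=0)

# Everything rolled: duplicates are compacted away, and the state is unchanged.
fully_rolled = compact(status_log, active_start=len(status_log))

print(from_earliest)
print(from_late_offset)
print(len(not_rolled), len(fully_rolled))
```

In this model the worker always has to start from the first record, but the
number of records it reads on startup shrinks once older segments have been
rolled and compacted.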