RE: Re: Re: [DISCUSS] KIP-943: Add independent "offset.storage.segment.bytes" for connect-distributed.properties

Zhijian Chen Wed, 01 Jan 2025 18:23:04 -0800

Hi all,

I encountered the same problem: starting the worker or the checkpoint task
takes a long time.

I also read the relevant discussion of KIP-943
(https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=255073470).

In fact, KIP-605
(https://cwiki.apache.org/confluence/display/KAFKA/KIP-605%3A+Expand+Connect+Worker+Internal+Topic+Settings)
already supports configuring the internal topics, so MM2 can now set the
segment bytes of the offsets topic.

In my opinion, the key problem is that there is no way to configure segment
bytes for the other internal topics used by MM2, such as the offset-syncs
topic. We can follow KIP-605 and implement this in the same way.

As for default values, I think we can provide a recommended value in the
sample MM2 configuration file instead of hard-coding it.

For existing users, I think it is still feasible to set the segment bytes of
the topic manually, rather than having the code change the user's
configuration at startup; it is best to leave this under the user's control.
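
As a minimal sketch of the manual route, the snippet below builds a
kafka-configs.sh command line that sets segment.bytes on one existing topic.
The broker address and the 50 MB value are assumptions for illustration (50 MB
is the figure mentioned later in this thread); the helper name is hypothetical,
and the snippet only prints the command rather than running it.

```python
import shlex


def alter_segment_bytes_cmd(bootstrap, topic, segment_bytes):
    """Build the kafka-configs.sh argv that sets segment.bytes on one topic."""
    if segment_bytes <= 0:
        raise ValueError("segment_bytes must be positive")
    return [
        "kafka-configs.sh",
        "--bootstrap-server", bootstrap,
        "--alter",
        "--entity-type", "topics",
        "--entity-name", topic,
        "--add-config", f"segment.bytes={segment_bytes}",
    ]


# Assumed values: a local broker and a 50 MB segment size.
cmd = alter_segment_bytes_cmd(
    "localhost:9092", "mm2-offset-syncs.A.internal", 50 * 1024 * 1024
)
print(shlex.join(cmd))
```

Printing the command first keeps the change an explicit operator decision,
which matches the point that users should control this setting themselves.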




In fact, I have already implemented this and hope to merge it into the main
branch of Kafka.

My demo logs:




1. Starting the worker: reading the offsets topic to the end took 3263 ms.




[2024-12-31 20:32:40,364] INFO Worker starting 
(org.apache.kafka.connect.runtime.Worker:233)




[2024-12-31 20:32:43,885] INFO read to end, 
assignment=[mm2-offsets.A.internal-3, mm2-offsets.A.internal-1, 
mm2-offsets.A.internal-24, mm2-offsets.A.internal-22, 
mm2-offsets.A.internal-15, mm2-offsets.A.internal-13, 
mm2-offsets.A.internal-19, mm2-offsets.A.internal-17, mm2-offsets.A.internal-7, 
mm2-offsets.A.internal-5, mm2-offsets.A.internal-11, mm2-offsets.A.internal-9, 
mm2-offsets.A.internal-0, mm2-offsets.A.internal-4, mm2-offsets.A.internal-2, 
mm2-offsets.A.internal-23, mm2-offsets.A.internal-21, 
mm2-offsets.A.internal-16, mm2-offsets.A.internal-14, 
mm2-offsets.A.internal-20, mm2-offsets.A.internal-18, mm2-offsets.A.internal-8, 
mm2-offsets.A.internal-6, mm2-offsets.A.internal-12, 
mm2-offsets.A.internal-10], count=70184, usedMs=3263 
(org.apache.kafka.connect.util.KafkaBasedLog:525)




[2024-12-31 20:32:43,885] INFO Finished reading offsets topic and starting 
KafkaOffsetBackingStore 
(org.apache.kafka.connect.storage.KafkaOffsetBackingStore:249)




[2024-12-31 20:32:43,888] INFO Worker started 
(org.apache.kafka.connect.runtime.Worker:243)




 




2. Starting the checkpoint task took 4386 ms.




[2024-12-31 20:41:02,262] INFO [MirrorCheckpointConnector|task-0] read to end, 
assignment=[mm2-offset-syncs.A.internal-0], count=311839, usedMs=4160 
(org.apache.kafka.connect.util.KafkaBasedLog:525)




[2024-12-31 20:41:02,262] INFO [MirrorCheckpointConnector|task-0] Finished 
reading KafkaBasedLog for topic mm2-offset-syncs.A.internal 
(org.apache.kafka.connect.util.KafkaBasedLog:311)

[2024-12-31 20:41:02,262] INFO [MirrorCheckpointConnector|task-0] Started 
KafkaBasedLog for topic mm2-offset-syncs.A.internal 
(org.apache.kafka.connect.util.KafkaBasedLog:313)

[2024-12-31 20:41:02,263] INFO [MirrorCheckpointConnector|task-0] starting 
checkpoint and offset sync stores took 4386 ms 
(org.apache.kafka.connect.mirror.Scheduler:99)
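
To compare runs like the two above, a small script can pull the count and
usedMs fields out of the "read to end" lines and turn them into a read rate.
This is only a sketch: it assumes the log format shown in these demo logs and
that count is the number of records read; the function name is hypothetical.

```python
import re

# Matches the "count=..., usedMs=..." fields of the "read to end" lines;
# \s* tolerates a line wrap between the two fields.
READ_TO_END = re.compile(r"count=(\d+),\s*usedMs=(\d+)")


def read_rates(log_text):
    """Return (count, used_ms, records_per_second) for each matching entry."""
    results = []
    for match in READ_TO_END.finditer(log_text):
        count, used_ms = int(match.group(1)), int(match.group(2))
        rate = count / (used_ms / 1000.0) if used_ms else float("inf")
        results.append((count, used_ms, rate))
    return results


# Field values copied from the two demo log entries above.
sample = """
assignment=[mm2-offsets.A.internal-10], count=70184, usedMs=3263
assignment=[mm2-offset-syncs.A.internal-0], count=311839, usedMs=4160
"""

rates = read_rates(sample)
for count, used_ms, rate in rates:
    print(f"{count} records in {used_ms} ms: {rate:.0f} records/s")
```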





On 2023/08/18 06:15:08 Sagar wrote:
> Hey Hudeqi,
>
> I took some time to read through the PR link as well where you and Chris
> had an informative discussion.
>
> Both there and in this discussion thread, it seems to me that the
> consensus is to narrow the scope of the KIP to reducing the default value
> of the segment.bytes config for the offsets topic. This will give future
> workers a shorter boot-up time. IMO while this might not seem like a
> high impact thing, the configs that we are talking about here are advanced
> ones which new users for Connect might not immediately look into. So, if
> they end up in a situation where there's a 23-min worker startup time, then
> it might not be an overall good experience for them.
>
> Regarding the point Greg mentioned, we will have to think about getting
> around it. The approach you suggested seems unclean to me. Since you have
> been testing with this config in your cluster and you already have a large
> offsets topic, in your experience have you noticed any discrepancies of the
> in-memory states across workers in your cluster? Would it be possible for
> you to test that? That might be a good starting point to understand how we
> want to fix this. Ideally we should have some kind of a point of view (or
> even a potential fix) on this before we go about implementing this change.
> WDYT?
>
> Thanks!
> Sagar.
>
> On Mon, Aug 14, 2023 at 6:09 PM hudeqi <16...@bjtu.edu.cn> wrote:
>
> > bump this discuss thread.
> >
> > best,
> > hudeqi
> >
> > "hudeqi" <16120...@bjtu.edu.cn> wrote:
> > > Sorry for getting back so late, Yash Mayya, Greg Harris, Sagar. I was
> > not getting email reminders and missed your replies.
> > >
> > > Thank you for your thoughts and suggestions, I learned a lot, I will
> > give my thoughts and answers in a comprehensive way:
> > > 1. The default configuration of 50MB is the online configuration I
> > actually used to solve this problem, and it works better. See the
> > description of the Jira:
> > https://issues.apache.org/jira/projects/KAFKA/issues/KAFKA-15086?filter=allopenissues
> > In fact, I think it may be better to set this value smaller, so I did not
> > reuse the default value of __consumer_offsets, but I don't know which
> > default value is best. I also set the default value of 50MB
> > online through ConfigDef#defineInternal, and if the value configured by the
> > user is greater than the default value, a warning is logged.
> > The only difference from what you said is that I overwrite the value
> > configured by the user with the default value (this point was rejected
> > by Chris Egerton: https://github.com/apache/kafka/pull/13852; in fact,
> > you all agree that we should not directly override the user-configured
> > value, and now I agree with this).
> > > 2. I think the potential bug that Greg mentioned, which may lead to
> > inconsistent state between workers, is a great point. It is true that we
> > cannot directly change the configuration of existing internal topics.
> > Perhaps a more tricky and ugly approach is to manually check that
> > the active segment sizes of all current partitions are relatively small,
> > then stop all Connect instances, change the topic configuration, and
> > finally start the instances.
> > >
> > > To sum up, I wonder whether the scope of the KIP could be reduced to
> > only setting the default value of 'segment.bytes' for the internal topics
> > and logging a warning when the user configures a bigger value. What do you
> > think? If there's a better way I'm all ears.
> > >
> > > best,
> > > hudeqi
> >
>
