Manjeet, this sounds like a problem that exists outside Connect's purview. Connect has nothing to do with resource management.
Ryanne

On Mon, Jul 29, 2019, 2:05 PM Manjeet Duhan <mdu...@operative.com> wrote:

> Hi,
>
> This is Manjeet, working at Operative Media. I have been working on
> Confluent Kafka for almost 4 years and have made many customized changes
> for Kafka Connect sink and source connectors. I have made changes in the
> Kafka code base as well for our requirements.
>
> There is one feature I added recently, after discussing it with our
> architect Praveen Manvi, which I wanted to discuss with you for larger
> community usage.
>
> Background: We are running more than 30 connectors at Operative, but
> each connector requires a different machine specification. E.g. Kafka
> Connect S3 requires more memory, and some of the in-house connectors
> require more network bandwidth (IO) and processing power (CPU). We were
> getting out-of-memory errors in a worker due to one connector. This
> affected the entire process, and we had to pause this connector.
>
> Issue: We wanted each connector to run on a specific type of machine (in
> this case we want 3 types of machines: memory, CPU and IO).
>
> Existing solution: We can start 3 clusters and have a specific type of
> machine in each cluster, but this is difficult to manage.
>
> Pain points:
>
> 1. We have to consistently take care of the cluster while starting a
>    machine, otherwise it can start in a different cluster.
>
> 2. We have to change the offset storage topic, otherwise we will be able
>    to see connectors across clusters.
>
> Proposed solution: We specify the type of machine in the distributed
> properties of each worker machine, so that when we specify the target
> machine type at connector start, it starts tasks only on machines of
> exactly that type. In this case we don't have to take care of the above
> pain points. Different types of machines will be part of the same cluster.
>
> Example: I have 4 workers with type memory (worker 1), cpu (worker 2)
> and IO (worker 3 and worker 4).
>
> a) We started connector 1 with 2 tasks and specified the target machine
>    type as cpu. It will distribute tasks equally on worker 3 and worker 4.
>
> b) We started connector 2 with 2 tasks and target machine type
>    memory. It will start both tasks on worker 1.
>
> I have made the changes for this feature and it is working fine, and we
> are pushing it to our production cluster in a few days.
>
> Please tell me if it can be helpful for the larger community.
>
> Thanks,
> Manjeet Duhan
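
For readers following the thread, the placement rule proposed above can be sketched roughly as follows. This is only an illustration in Python, not Manjeet's patch and not anything in Kafka Connect: the function `assign_tasks`, the worker names, and the type labels are all hypothetical, and the round-robin spread over matching workers is an assumption about what "distribute tasks equally" means.

```python
from collections import defaultdict
from itertools import cycle


def assign_tasks(workers, target_type, num_tasks):
    """Spread num_tasks round-robin over the workers tagged with target_type.

    workers maps a (hypothetical) worker name to its declared machine type.
    Returns a dict of worker name -> list of task ids.
    """
    # Only workers whose declared type matches the connector's target qualify.
    eligible = sorted(name for name, mtype in workers.items() if mtype == target_type)
    if not eligible:
        raise ValueError(f"no worker of type {target_type!r} in the cluster")

    assignment = defaultdict(list)
    # zip stops after num_tasks items, so cycle() never runs away.
    for task_id, worker in zip(range(num_tasks), cycle(eligible)):
        assignment[worker].append(task_id)
    return dict(assignment)


# Hypothetical cluster mirroring the four workers in the example.
workers = {
    "worker1": "memory",
    "worker2": "cpu",
    "worker3": "io",
    "worker4": "io",
}

memory_plan = assign_tasks(workers, "memory", 2)  # both tasks on worker1
io_plan = assign_tasks(workers, "io", 2)          # one task each on worker3, worker4
```

With a single matching worker, every task lands on it, as in case (b); with two matching workers, the tasks alternate between them. A connector whose target type has no matching worker fails fast here, which is one possible choice; leaving its tasks unassigned until such a worker joins would be another.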