Let me quickly follow up on this:

- I forgot to mention that I *was* setting the server-id value to a range.
- I just realized that if I do a hard restart and start without a snapshot,
  then this works, i.e. the multiple sub-tasks receive events and the
  watermarking/processing progresses. This is, however, not really ideal.
  Is there any way to scale CDC without this hard restart?
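
For reference, here is a minimal sketch of the source configuration discussed in this thread: a `mysql-cdc` table whose `server-id` is a range with one id per source sub-task, plus the `table.exec.source.idle-timeout` workaround. The helper names, table schema, host, credentials, starting server id, and timeout value are all hypothetical. The Python below only renders the Flink SQL text; it does not connect to Flink or MySQL.

```python
# Sketch only: every concrete value here (host, user, schema, ids) is hypothetical.

def mysql_cdc_ddl(table, parallelism, server_id_start=5400):
    """Render a Flink SQL DDL string for a MySQL CDC source table."""
    # Reserve one server id per source sub-task, as an inclusive range.
    server_id_end = server_id_start + parallelism - 1
    if parallelism == 1:
        server_id = str(server_id_start)
    else:
        server_id = f"{server_id_start}-{server_id_end}"

    options = {
        "connector": "mysql-cdc",
        "hostname": "mysql.example.internal",  # hypothetical host
        "port": "3306",
        "username": "flink_user",              # hypothetical credentials
        "password": "change-me",
        "database-name": "app_db",             # hypothetical database
        "table-name": table,
        "server-id": server_id,
    }
    rendered = ",\n".join(f"  '{key}' = '{value}'" for key, value in options.items())
    return (
        f"CREATE TABLE {table} (\n"
        "  id BIGINT,\n"
        "  PRIMARY KEY (id) NOT ENFORCED\n"
        ") WITH (\n"
        f"{rendered}\n"
        ");"
    )


def idle_timeout_statement(seconds):
    """Render the SET statement that marks idle source sub-tasks after a timeout."""
    return f"SET 'table.exec.source.idle-timeout' = '{seconds} s';"


if __name__ == "__main__":
    print(idle_timeout_statement(30))
    print(mysql_cdc_ddl("enrichment_source", parallelism=2))
```

With `parallelism=2` the rendered DDL carries `'server-id' = '5400-5401'`, i.e. a range wide enough for both sub-tasks. The idle timeout only keeps watermarks moving when a sub-task receives no events; it does not by itself spread events across sub-tasks.
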
Thanks,
Mike

On Thu, Oct 24, 2024 at 12:25 PM Michael Marino <michael.mar...@tado.com> wrote:

> Hey all,
>
> We are working to scale one of our Flink Jobs (using Table API mostly,
> some DataStream) where we are using a MySQL CDC table as a source for
> enrichment.
>
> What I've noticed is that, when I increase the parallelism of the job
> (e.g. to 2), the CDC table source has 2 tasks, but only one of these reads
> any events. The other one remains completely idle. This stalls downstream
> processing because we are not getting any watermarks, the only way I've
> found to get this to continue is to set table.exec.source.idle-timeout to a
> non-zero value.
>
> My questions are:
> - is there some setting I can tune to get the CDC to distribute events
> across the different sub-tasks?
> - If the above isn't possible, is there a way in the Table/SQL API to
> reduce the parallelism (e.g. to 1)? CDC doesn't seem to support
> scan.parallelism.
>
> If neither of the above works, I think I may be forced to use the
> DataStream API, set the parallelism explicitly and then convert to a table.
>
> Thanks!
>
> Cheers,
> Mike
>
> --
>
> Michael Marino
>
> Principal Data Science & Analytics
>
> Phone: +49 89 7167786 - 14
>
> linkedin.com/company/tadogmbh <https://www.linkedin.com/company/tadogmbh>
> | facebook.com/tado <http://www.facebook.com/tado> | twitter.com/tado
> <http://www.twitter.com/tado> | youtube.com/tado
> <http://www.youtube.com/tado>
>
> www.tado.com | tado GmbH | Sapporobogen 6-8 | 80637 Munich | Germany
>
> Managing Directors: Dr. Philip Beckmann | Christian Deilmann | Johannes
> Schwarz | Dr. Frank Siebdrat | Lukas Zyla
>
> Registered with the Commercial Register Munich as HRB 194769 | VAT-No: DE
> 280012558