Hi, Michael.

The MySQL CDC source uses a parallelism of 1 when reading binlog events in order to preserve their order, so only one subtask reads the binlog and the other subtasks stop reading data. For your question, you can set the option 'scan.incremental.close-idle-reader.enabled' = 'true' [1] on your CDC table to let the source close the idle subtasks.

PS: There is a limitation when enabling this option. Please see its description for details.
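As a rough illustration of where the option goes, the sketch below renders a Flink SQL `CREATE TABLE` statement for a `mysql-cdc` source with the idle-reader option set. The helper `render_mysql_cdc_ddl`, the table and column names, the connection values, and the `5400-5403` server-id range are all hypothetical placeholders, not taken from the job discussed in this thread. The option description in [1] ties the feature to the Flink version and to `execution.checkpointing.checkpoints-after-tasks-finish.enabled`, so it is worth checking that before relying on it.

```python
def render_mysql_cdc_ddl(table_name, columns, primary_key, options):
    """Render a Flink SQL CREATE TABLE statement for a mysql-cdc source.

    Hypothetical helper for illustration only; it just builds a string.
    """
    column_lines = [f"  `{name}` {sql_type}" for name, sql_type in columns]
    column_lines.append(f"  PRIMARY KEY (`{primary_key}`) NOT ENFORCED")
    option_lines = [f"  '{key}' = '{value}'" for key, value in options.items()]
    return (
        f"CREATE TABLE `{table_name}` (\n"
        + ",\n".join(column_lines)
        + "\n) WITH (\n"
        + ",\n".join(option_lines)
        + "\n)"
    )


# Placeholder connection details (assumptions); replace with real values.
options = {
    "connector": "mysql-cdc",
    "hostname": "localhost",
    "port": "3306",
    "username": "flink",
    "password": "secret",
    "database-name": "mydb",
    "table-name": "orders",
    # A range, so that each parallel reader can get its own server id.
    "server-id": "5400-5403",
    # Ask the source to close readers that have nothing left to read.
    "scan.incremental.close-idle-reader.enabled": "true",
}

ddl = render_mysql_cdc_ddl(
    "orders_cdc",
    [("id", "BIGINT"), ("customer_id", "BIGINT"), ("updated_at", "TIMESTAMP(3)")],
    "id",
    options,
)
print(ddl)
```

The resulting string could then be passed to the Table API, for example `TableEnvironment.execute_sql(ddl)` in PyFlink, or pasted into the SQL client as-is.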
Best,
Hang

[1] https://nightlies.apache.org/flink/flink-cdc-docs-release-3.2/docs/connectors/flink-sources/mysql-cdc/

On Thu, Oct 24, 2024 at 19:13, Michael Marino <michael.mar...@tado.com> wrote:

> Let me quickly follow up on this:
>
> - I missed noting that I *was* setting the server-id value to a range.
> - I just realized that if I do a hard restart and start without a snapshot, then this works, i.e. the multiple sub-tasks receive events and the watermarking/processing progresses. This is, however, not really ideal. Is there any way to scale CDC without this hard restart?
>
> Thanks,
> Mike
>
> On Thu, Oct 24, 2024 at 12:25 PM Michael Marino <michael.mar...@tado.com> wrote:
>
>> Hey all,
>>
>> We are working to scale one of our Flink jobs (using mostly the Table API, with some DataStream) where we are using a MySQL CDC table as a source for enrichment.
>>
>> What I've noticed is that, when I increase the parallelism of the job (e.g. to 2), the CDC table source has 2 tasks, but only one of these reads any events. The other one remains completely idle. This stalls downstream processing because we are not getting any watermarks. The only way I've found to get this to continue is to set table.exec.source.idle-timeout to a non-zero value.
>>
>> My questions are:
>> - Is there some setting I can tune to get the CDC to distribute events across the different sub-tasks?
>> - If the above isn't possible, is there a way in the Table/SQL API to reduce the parallelism (e.g. to 1)? CDC doesn't seem to support scan.parallelism.
>>
>> If neither of the above works, I think I may be forced to use the DataStream API, set the parallelism explicitly and then convert to a table.
>>
>> Thanks!
>>
>> Cheers,
>> Mike
>>
>> --
>> Michael Marino
>> Principal Data Science & Analytics
>> Phone: +49 89 7167786 - 14
>> linkedin.com/company/tadogmbh <https://www.linkedin.com/company/tadogmbh> | facebook.com/tado <http://www.facebook.com/tado> | twitter.com/tado <http://www.twitter.com/tado> | youtube.com/tado <http://www.youtube.com/tado>
>> www.tado.com | tado GmbH | Sapporobogen 6-8 | 80637 Munich | Germany
>> Managing Directors: Dr. Philip Beckmann | Christian Deilmann | Johannes Schwarz | Dr. Frank Siebdrat | Lukas Zyla
>> Registered with the Commercial Register Munich as HRB 194769 | VAT-No: DE 280012558