As far as I know, Flink pipeline connectors have the following benefits:

1. User-friendly:
   * Schema inference: you don't need to write the schema in the YAML file; the framework converts the data types for you.
   * YAML is much easier for users to work with than SQL. Many external systems can use YAML to build a
2. Enterprise-level features:
   * Schema evolution: if an upstream table adds a new column, the YAML job can update the downstream table's schema to match.
   * Full database sync: a single job can sync all tables in the upstream database to the downstream system. In SQL, you need to write multiple statements, one for each table.

Best,
Shengkai

On Tue, Dec 3, 2024 at 02:13, Andrew Otto <o...@wikimedia.org> wrote:

> Hi Robin!
>
> IIUC, the difference is:
>
> - Pipeline connectors can be used as a fully contained yaml configured
>   CDC pipeline job
>   <https://nightlies.apache.org/flink/flink-cdc-docs-release-3.2/docs/core-concept/data-pipeline/>
> - Flink CDC sources are Flink Table connectors that can connect
>   directly to source database tables and binlogs. They allow you to use
>   Flink SQL / Table API to query external source databases. They are used
>   internally by pipelines. E.g. The mysql-cdc connector is used by a source
>   type: mysql pipeline connector.
>
> > is the point that Flink CDC provides CDC connectors, and they are
> > documented here
> > <https://nightlies.apache.org/flink/flink-cdc-docs-release-3.2/docs/connectors/flink-sources/overview/>
> > when they could as logically be documented here
> > <https://nightlies.apache.org/flink/flink-docs-master/docs/connectors/table/overview/>
> > under the main Flink docs?
>
> Flink CDC connectors are Flink Table connectors, but specifically for
> doing CDC. Compare that to e.g. the Flink JDBC table connector
> <https://nightlies.apache.org/flink/flink-docs-master/docs/connectors/table/jdbc/>,
> which allows you to query a MySQL table with Flink, but won't read changes
> in a streaming fashion. (IIUC, that is why the JDBC docs have a "Scan
> Source: Bounded" heading)
>
> I'm not an expert though, so please someone correct me if I am wrong!
>
> On Mon, Dec 2, 2024 at 12:52 PM Robin Moffatt via user <user@flink.apache.org> wrote:
>
>> I'm struggling to grok the difference between pipeline connectors
>> <https://nightlies.apache.org/flink/flink-cdc-docs-release-3.2/docs/connectors/pipeline-connectors/overview/>
>> and Flink sources
>> <https://nightlies.apache.org/flink/flink-cdc-docs-release-3.2/docs/connectors/flink-sources/overview/>
>> in Flink CDC.
>>
>> I understand pipeline connectors, and have been through the quickstart
>> and they make sense.
>>
>> But how are Flink sources any different from what I'd build in Flink SQL
>> itself directly? How do they fit into Flink CDC? Or is the point that Flink
>> CDC provides CDC connectors, and they are documented here
>> <https://nightlies.apache.org/flink/flink-cdc-docs-release-3.2/docs/connectors/flink-sources/overview/>
>> when they could as logically be documented here
>> <https://nightlies.apache.org/flink/flink-docs-master/docs/connectors/table/overview/>
>> under the main Flink docs?
>>
>> Thanks in advance,
>> Robin
>>
>
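
For illustration, here is a minimal sketch of the kind of YAML pipeline job discussed above, modeled on the Flink CDC quickstart. The host names, ports, credentials, the database name app_db, and the choice of Doris as the sink are all placeholder assumptions, not values from this thread. Note that no column definitions appear anywhere (schema inference), and that the regular expression in tables is what lets one job cover every table in the database (full database sync).

```yaml
# Hypothetical example: every value below is a placeholder, adjust to your setup.
source:
  type: mysql              # pipeline connector; uses the mysql-cdc source internally
  hostname: localhost
  port: 3306
  username: root
  password: "123456"
  tables: app_db.\.*       # regex: capture all tables in the app_db database

sink:
  type: doris              # assumed sink; any supported pipeline sink would do
  fenodes: 127.0.0.1:8030
  username: root
  password: ""

pipeline:
  name: Sync MySQL Database to Doris
  parallelism: 2
```

Assuming the file is saved as mysql-to-doris.yaml (a hypothetical name), the quickstart submits it with bash bin/flink-cdc.sh mysql-to-doris.yaml. Doing the same in Flink SQL would take a CREATE TABLE with explicit columns for each source and sink table, plus one INSERT statement per table.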