Yup! This definitely helps and makes sense. The 'after' payload comes with all the data from the row, right? So essentially, for inserts and updates I can insert/replace data by primary key, when 'after' is null I can just delete by primary key, and then I can build out the rest of my joins as normal.
Are there any performance implications of doing it this way that differ from the out-of-the-box 1.11 solution?

On Fri, Aug 21, 2020 at 2:28 AM Marta Paes Moreira <ma...@ververica.com> wrote:

> Hi, Rex.
>
> Part of what enabled CDC support in Flink 1.11 was the refactoring of the
> table source interfaces (FLIP-95 [1]) and the new ScanTableSource [2],
> which allows emitting bounded/unbounded streams with insert, update and
> delete rows.
>
> In theory, you could consume data generated with Debezium as regular
> JSON-encoded events before Flink 1.11; there just wasn't a convenient way
> to really treat it as a "changelog". As a workaround, what you can do in
> Flink 1.10 is process these messages as JSON, extract the "after" field
> from the payload, and then apply de-duplication [3] to keep only the last
> row.
>
> The DDL for your source table would look something like:
>
> CREATE TABLE tablename (
>   ...
>   after ROW(`field1` DATATYPE, `field2` DATATYPE, ...)
> ) WITH (
>   'connector' = 'kafka',
>   'format' = 'json',
>   ...
> );
>
> Hope this helps!
>
> Marta
>
> [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-95%3A+New+TableSource+and+TableSink+interfaces
> [2] https://ci.apache.org/projects/flink/flink-docs-release-1.11/api/java/org/apache/flink/table/connector/source/ScanTableSource.html
> [3] https://ci.apache.org/projects/flink/flink-docs-release-1.10/dev/table/sql/queries.html#deduplication
>
> On Fri, Aug 21, 2020 at 10:28 AM Chesnay Schepler <ches...@apache.org> wrote:
>
>> @Jark Would it be possible to use the 1.11 Debezium support in 1.10?
>>
>> On 20/08/2020 19:59, Rex Fenley wrote:
>>
>> Hi,
>>
>> I'm trying to set up Flink with the Debezium CDC connector on AWS EMR;
>> however, EMR only supports Flink 1.10.0, whereas the Debezium connector
>> arrived in Flink 1.11.0, from looking at the documentation.
>>
>> https://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-flink.html
>> https://ci.apache.org/projects/flink/flink-docs-release-1.11/dev/table/connectors/formats/debezium.html
>>
>> I'm wondering what alternative solutions are available for connecting
>> Debezium to Flink. Is there an open-source Debezium connector that works
>> with Flink 1.10.0? Could I potentially pull out the code for the 1.11.0
>> Debezium connector and compile it in my project against the Flink 1.10.0
>> API?
>>
>> For context, I plan on doing some fairly complicated, long-lived stateful
>> joins / materialization using the Table API over data ingested from
>> Postgres and possibly MySQL.
>>
>> Appreciate any help, thanks!
>>
>> --
>> Rex Fenley | Software Engineer - Mobile and Backend
>>
>> Remind.com <https://www.remind.com/> | BLOG <http://blog.remind.com/>
>> | FOLLOW US <https://twitter.com/remindhq> | LIKE US
>> <https://www.facebook.com/remindhq>
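For anyone following this thread later, here is a rough sketch of the full Flink 1.10 workaround Marta describes (raw Debezium JSON source, extract `after`, then deduplicate by key). Table and field names are made up for illustration, the Debezium envelope fields shown assume the default unwrapped-off JSON format, and the exact `WITH` option keys differ between Flink versions (1.10's legacy connectors use `'connector.type'`-style keys rather than the `'connector'` key shown in the example above), so treat this as a pattern, not a drop-in DDL:

```sql
-- Hypothetical source table reading raw Debezium JSON from Kafka.
-- 'op' and 'after' are top-level fields of the Debezium change-event
-- envelope; the columns inside 'after' are placeholders for your schema.
CREATE TABLE users_cdc (
  op STRING,                           -- 'c' = create, 'u' = update, 'd' = delete
  after ROW<id BIGINT, name STRING>,   -- row state after the change (NULL for deletes)
  ts_ms BIGINT,                        -- event timestamp from Debezium
  proctime AS PROCTIME()               -- processing-time attribute for ordering
) WITH (
  'connector.type' = 'kafka',
  'format.type' = 'json',
  ...
);

-- Deduplication pattern from the Flink 1.10 docs [3]: keep only the
-- latest row per primary key. Deletes carry a NULL 'after', so they are
-- filtered out here and would need to be handled separately.
SELECT id, name
FROM (
  SELECT after.id AS id,
         after.name AS name,
         ROW_NUMBER() OVER (PARTITION BY after.id
                            ORDER BY proctime DESC) AS rownum
  FROM users_cdc
  WHERE op <> 'd'
)
WHERE rownum = 1;
```

The deduplicated result can then feed the downstream joins/materialization as if it were an upsert stream keyed by `id`.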