+1, look forward to it (non binding) Thanks Szehon
On Tue, Mar 3, 2026 at 5:37 PM Anton Okolnychyi <[email protected]> wrote: > +1 (non-binding) > > On Tue, Mar 3, 2026 at 5:07 PM Mich Talebzadeh <[email protected]> > wrote: > >> +1 >> >> Dr Mich Talebzadeh, >> Data Scientist | Distributed Systems (Spark) | Financial Forensics & >> Metadata Analytics | Transaction Reconstruction | Audit & Evidence-Based >> Analytics >> >> view my Linkedin profile >> <https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/> >> >> >> >> >> >> On Wed, 4 Mar 2026 at 00:57, Gengliang Wang <[email protected]> wrote: >> >>> Hi Spark devs, >>> >>> I'd like to call a vote on the SPIP*: Change Data Capture (CDC) Support* >>> >>> *Summary:* >>> >>> This SPIP proposes a unified approach by adding a CHANGES SQL clause >>> and corresponding DataFrame/DataStream APIs that work across DSv2 >>> connectors. >>> >>> 1. Standardized User API >>> >>> SQL: >>> >>> -- Batch: What changed between version 10 and 20? >>> >>> SELECT * FROM my_table CHANGES FROM VERSION 10 TO VERSION 20; >>> >>> -- Streaming: Continuously process changes >>> >>> CREATE STREAMING TABLE cdc_sink AS >>> >>> SELECT * FROM STREAM my_table CHANGES FROM VERSION 0; >>> >>> DataFrame API: >>> >>> spark.read >>> >>> .option("startingVersion", "10") >>> >>> .option("endingVersion", "20") >>> >>> .changes("my_table") >>> >>> 2. Engine-Level Post Processing Under the hood, this proposal >>> introduces a minimal Changelog interface for DSv2 connectors. Spark's >>> Catalyst optimizer will take over the CDC post-processing, including: >>> >>> - >>> >>> Filtering out copy-on-write carry-over rows. >>> - >>> >>> Deriving pre-image/post-image updates from raw insert/delete pairs. >>> - >>> >>> Computing net changes. >>> >>> >>> *Relevant Links:* >>> >>> - *SPIP Doc: * >>> >>> https://docs.google.com/document/d/1-4rCS3vsGIyhwnkAwPsEaqyUDg-AuVkdrYLotFPw0U0/edit?usp=sharing >>> - *Discuss Thread: * >>> https://lists.apache.org/thread/dhxx6pohs7fvqc3knzhtoj4tbcgrwxts >>> - *JIRA: *https://issues.apache.org/jira/browse/SPARK-55668 >>> >>> >>> *The vote will be open for at least 72 hours. *Please vote: >>> >>> [ ] +1: Accept the proposal as an official SPIP >>> >>> [ ] +0 >>> >>> [ ] -1: I don't think this is a good idea because ... >>> >>> Thanks, >>> Gengliang Wang >>> >>
