+1 (non-binding)

Thanks for the contribution!

On Tue, Mar 3, 2026 at 5:50 PM Burak Yavuz <[email protected]> wrote:

> +1!
>
> On Tue, Mar 3, 2026 at 5:48 PM Szehon Ho <[email protected]> wrote:
>
>> +1, look forward to it (non binding)
>>
>> Thanks
>> Szehon
>>
>> On Tue, Mar 3, 2026 at 5:37 PM Anton Okolnychyi <[email protected]> wrote:
>>
>>> +1 (non-binding)
>>>
>>> On Tue, Mar 3, 2026 at 5:07 PM Mich Talebzadeh <[email protected]> wrote:
>>>
>>>> +1
>>>>
>>>> Dr Mich Talebzadeh,
>>>> Data Scientist | Distributed Systems (Spark) | Financial Forensics &
>>>> Metadata Analytics | Transaction Reconstruction | Audit & Evidence-Based
>>>> Analytics
>>>>
>>>> view my Linkedin profile
>>>> <https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>
>>>>
>>>> On Wed, 4 Mar 2026 at 00:57, Gengliang Wang <[email protected]> wrote:
>>>>
>>>>> Hi Spark devs,
>>>>>
>>>>> I'd like to call a vote on the SPIP: *Change Data Capture (CDC) Support*
>>>>>
>>>>> *Summary:*
>>>>>
>>>>> This SPIP proposes a unified approach by adding a CHANGES SQL clause
>>>>> and corresponding DataFrame/DataStream APIs that work across DSv2
>>>>> connectors.
>>>>>
>>>>> 1. Standardized User API
>>>>>
>>>>> SQL:
>>>>>
>>>>> -- Batch: What changed between version 10 and 20?
>>>>> SELECT * FROM my_table CHANGES FROM VERSION 10 TO VERSION 20;
>>>>>
>>>>> -- Streaming: Continuously process changes
>>>>> CREATE STREAMING TABLE cdc_sink AS
>>>>> SELECT * FROM STREAM my_table CHANGES FROM VERSION 0;
>>>>>
>>>>> DataFrame API:
>>>>>
>>>>> spark.read
>>>>>   .option("startingVersion", "10")
>>>>>   .option("endingVersion", "20")
>>>>>   .changes("my_table")
>>>>>
>>>>> 2. Engine-Level Post Processing
>>>>>
>>>>> Under the hood, this proposal introduces a minimal Changelog interface
>>>>> for DSv2 connectors. Spark's Catalyst optimizer will take over the CDC
>>>>> post-processing, including:
>>>>>
>>>>> - Filtering out copy-on-write carry-over rows.
>>>>> - Deriving pre-image/post-image updates from raw insert/delete pairs.
>>>>> - Computing net changes.
>>>>>
>>>>> *Relevant Links:*
>>>>>
>>>>> - *SPIP Doc:*
>>>>>   https://docs.google.com/document/d/1-4rCS3vsGIyhwnkAwPsEaqyUDg-AuVkdrYLotFPw0U0/edit?usp=sharing
>>>>> - *Discuss Thread:*
>>>>>   https://lists.apache.org/thread/dhxx6pohs7fvqc3knzhtoj4tbcgrwxts
>>>>> - *JIRA:* https://issues.apache.org/jira/browse/SPARK-55668
>>>>>
>>>>> *The vote will be open for at least 72 hours.* Please vote:
>>>>>
>>>>> [ ] +1: Accept the proposal as an official SPIP
>>>>>
>>>>> [ ] +0
>>>>>
>>>>> [ ] -1: I don't think this is a good idea because ...
>>>>>
>>>>> Thanks,
>>>>> Gengliang Wang

--
John Zhuge
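
For readers following the thread, the first two post-processing steps in the quoted SPIP summary (filtering copy-on-write carry-over rows, then deriving pre-image/post-image updates from raw insert/delete pairs) can be illustrated with a small plain-Python sketch. This is not the SPIP's Changelog interface and not Spark code. The tuple layout, the function names `drop_carry_over` and `label_updates`, the change-type strings, and the use of the first column as the row key are all assumptions made purely for illustration.

```python
from collections import Counter

# Hypothetical raw change record: (change_type, version, row)
# where row is a hashable tuple and row[0] is assumed to be the row key.


def drop_carry_over(changes):
    """Drop delete/insert pairs whose row content is identical within one version.

    Such pairs are what a copy-on-write rewrite produces for rows that were
    copied into a new file without being modified.
    """
    deletes = Counter((v, row) for t, v, row in changes if t == "delete")
    inserts = Counter((v, row) for t, v, row in changes if t == "insert")
    carried = deletes & inserts  # multiset intersection: matched pairs
    remaining = {"delete": Counter(carried), "insert": Counter(carried)}
    out = []
    for t, v, row in changes:
        if t in remaining and remaining[t][(v, row)] > 0:
            remaining[t][(v, row)] -= 1  # consume one half of a matched pair
            continue
        out.append((t, v, row))
    return out


def label_updates(changes, key=lambda row: row[0]):
    """Relabel a delete and an insert of the same key in the same version
    as an update pre-image and post-image (label names are assumed)."""
    deleted = {(v, key(r)) for t, v, r in changes if t == "delete"}
    inserted = {(v, key(r)) for t, v, r in changes if t == "insert"}
    updated = deleted & inserted
    rename = {"delete": "update_preimage", "insert": "update_postimage"}
    return [
        (rename.get(t, t) if (v, key(r)) in updated else t, v, r)
        for t, v, r in changes
    ]


raw = [
    ("delete", 11, (1, "a")),  # row 1 rewritten unchanged: carry-over
    ("insert", 11, (1, "a")),
    ("delete", 11, (2, "b")),  # row 2 really changed: b -> B
    ("insert", 11, (2, "B")),
    ("insert", 12, (3, "c")),  # plain insert
]
cleaned = drop_carry_over(raw)
labeled = label_updates(cleaned)
for record in labeled:
    print(record)
```

In this toy input, the unchanged rewrite of row 1 disappears, the change to row 2 becomes a pre-image/post-image pair, and the insert of row 3 passes through untouched. How the actual proposal identifies row keys and names its change types is defined in the SPIP doc, not here.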
