+1 (non-binding)

Thanks,
Cheng Pan

> On Mar 4, 2026, at 09:59, John Zhuge <[email protected]> wrote:
>
> +1 (non-binding)
>
> Thanks for the contribution!
>
> On Tue, Mar 3, 2026 at 5:50 PM Burak Yavuz <[email protected]> wrote:
>> +1!
>>
>> On Tue, Mar 3, 2026 at 5:48 PM Szehon Ho <[email protected]> wrote:
>>> +1, look forward to it (non binding)
>>>
>>> Thanks
>>> Szehon
>>>
>>> On Tue, Mar 3, 2026 at 5:37 PM Anton Okolnychyi <[email protected]> wrote:
>>>> +1 (non-binding)
>>>>
>>>> On Tue, Mar 3, 2026 at 5:07 PM Mich Talebzadeh <[email protected]> wrote:
>>>>> +1
>>>>>
>>>>> Dr Mich Talebzadeh,
>>>>> Data Scientist | Distributed Systems (Spark) | Financial Forensics &
>>>>> Metadata Analytics | Transaction Reconstruction | Audit & Evidence-Based
>>>>> Analytics
>>>>>
>>>>> view my Linkedin profile
>>>>> <https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>
>>>>>
>>>>> On Wed, 4 Mar 2026 at 00:57, Gengliang Wang <[email protected]> wrote:
>>>>>> Hi Spark devs,
>>>>>>
>>>>>> I'd like to call a vote on the SPIP: Change Data Capture (CDC) Support
>>>>>>
>>>>>> Summary:
>>>>>>
>>>>>> This SPIP proposes a unified approach by adding a CHANGES SQL clause and
>>>>>> corresponding DataFrame/DataStream APIs that work across DSv2 connectors.
>>>>>>
>>>>>> 1. Standardized User API
>>>>>>
>>>>>> SQL:
>>>>>>
>>>>>> -- Batch: What changed between version 10 and 20?
>>>>>> SELECT * FROM my_table CHANGES FROM VERSION 10 TO VERSION 20;
>>>>>>
>>>>>> -- Streaming: Continuously process changes
>>>>>> CREATE STREAMING TABLE cdc_sink AS
>>>>>> SELECT * FROM STREAM my_table CHANGES FROM VERSION 0;
>>>>>>
>>>>>> DataFrame API:
>>>>>>
>>>>>> spark.read
>>>>>>   .option("startingVersion", "10")
>>>>>>   .option("endingVersion", "20")
>>>>>>   .changes("my_table")
>>>>>>
>>>>>> 2. Engine-Level Post Processing
>>>>>>
>>>>>> Under the hood, this proposal introduces a minimal Changelog interface
>>>>>> for DSv2 connectors. Spark's Catalyst optimizer will take over the CDC
>>>>>> post-processing, including:
>>>>>>
>>>>>> - Filtering out copy-on-write carry-over rows.
>>>>>> - Deriving pre-image/post-image updates from raw insert/delete pairs.
>>>>>> - Computing net changes.
>>>>>>
>>>>>> Relevant Links:
>>>>>>
>>>>>> SPIP Doc:
>>>>>> https://docs.google.com/document/d/1-4rCS3vsGIyhwnkAwPsEaqyUDg-AuVkdrYLotFPw0U0/edit?usp=sharing
>>>>>> Discuss Thread:
>>>>>> https://lists.apache.org/thread/dhxx6pohs7fvqc3knzhtoj4tbcgrwxts
>>>>>> JIRA: https://issues.apache.org/jira/browse/SPARK-55668
>>>>>>
>>>>>> The vote will be open for at least 72 hours. Please vote:
>>>>>>
>>>>>> [ ] +1: Accept the proposal as an official SPIP
>>>>>>
>>>>>> [ ] +0
>>>>>>
>>>>>> [ ] -1: I don't think this is a good idea because ...
>>>>>>
>>>>>> Thanks,
>>>>>> Gengliang Wang
>
> --
> John Zhuge
