+1, looking forward to it (non-binding)

Thanks
Szehon

On Tue, Mar 3, 2026 at 5:37 PM Anton Okolnychyi <[email protected]>
wrote:

> +1 (non-binding)
>
> On Tue, Mar 3, 2026 at 5:07 PM Mich Talebzadeh <[email protected]>
> wrote:
>
>> +1
>>
>> Dr Mich Talebzadeh,
>> Data Scientist | Distributed Systems (Spark) | Financial Forensics &
>> Metadata Analytics | Transaction Reconstruction | Audit & Evidence-Based
>> Analytics
>>
>>    view my Linkedin profile
>> <https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>
>>
>> On Wed, 4 Mar 2026 at 00:57, Gengliang Wang <[email protected]> wrote:
>>
>>> Hi Spark devs,
>>>
>>> I'd like to call a vote on the SPIP: *Change Data Capture (CDC) Support*
>>>
>>> *Summary:*
>>>
>>> This SPIP proposes a unified approach to CDC by adding a CHANGES SQL
>>> clause and corresponding DataFrame/DataStream APIs that work across
>>> DSv2 connectors.
>>>
>>> 1. Standardized User API
>>>
>>> SQL:
>>>
>>> -- Batch: What changed between version 10 and 20?
>>> SELECT * FROM my_table CHANGES FROM VERSION 10 TO VERSION 20;
>>>
>>> -- Streaming: Continuously process changes
>>> CREATE STREAMING TABLE cdc_sink AS
>>> SELECT * FROM STREAM my_table CHANGES FROM VERSION 0;
>>>
>>> DataFrame API:
>>>
>>> spark.read
>>>   .option("startingVersion", "10")
>>>   .option("endingVersion", "20")
>>>   .changes("my_table")
>>>
>>> 2. Engine-Level Post Processing
>>>
>>> Under the hood, this proposal introduces a minimal Changelog interface
>>> for DSv2 connectors. Spark's Catalyst optimizer will take over the CDC
>>> post-processing, including:
>>>
>>>    - Filtering out copy-on-write carry-over rows.
>>>    - Deriving pre-image/post-image updates from raw insert/delete pairs.
>>>    - Computing net changes.
>>>
>>>
>>> *Relevant Links:*
>>>
>>>    - *SPIP Doc:*
>>>    https://docs.google.com/document/d/1-4rCS3vsGIyhwnkAwPsEaqyUDg-AuVkdrYLotFPw0U0/edit?usp=sharing
>>>    - *Discuss Thread:*
>>>    https://lists.apache.org/thread/dhxx6pohs7fvqc3knzhtoj4tbcgrwxts
>>>    - *JIRA:* https://issues.apache.org/jira/browse/SPARK-55668
>>>
>>>
>>> *The vote will be open for at least 72 hours.* Please vote:
>>>
>>> [ ] +1: Accept the proposal as an official SPIP
>>>
>>> [ ] +0
>>>
>>> [ ] -1: I don't think this is a good idea because ...
>>>
>>> Thanks,
>>> Gengliang Wang
>>>
>>
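
For readers skimming the thread, the three engine-level post-processing steps named in the quoted proposal (filtering carry-over rows, deriving pre-image/post-image updates, computing net changes) can be illustrated with a small plain-Python sketch. This is not Spark's implementation and not the SPIP's Changelog interface: the `(version, change_type, key, value)` row shape, the function names, and the change-type labels are all hypothetical, and the sketch assumes each key's delete precedes its insert within a version.

```python
from collections import Counter, defaultdict

_ABSENT = object()  # sentinel: key does not exist at this point


def drop_carry_over(rows):
    """Drop identical delete/insert pairs within one version (rewritten but unchanged rows)."""
    by_version = defaultdict(list)
    for row in rows:
        by_version[row[0]].append(row)
    out = []
    for version in sorted(by_version):
        group = by_version[version]
        deletes = Counter((k, v) for _, t, k, v in group if t == "delete")
        inserts = Counter((k, v) for _, t, k, v in group if t == "insert")
        cancel = deletes & inserts  # rows both deleted and re-inserted unchanged
        skip = {"delete": Counter(cancel), "insert": Counter(cancel)}
        for row in group:
            _, t, k, v = row
            if skip[t][(k, v)] > 0:
                skip[t][(k, v)] -= 1
                continue
            out.append(row)
    return out


def derive_updates(rows):
    """Relabel a delete + insert of the same key in one version as an update pair."""
    types_by_key = defaultdict(set)
    for version, t, k, _ in rows:
        types_by_key[(version, k)].add(t)
    relabel = {"delete": "update_preimage", "insert": "update_postimage"}
    return [
        (version, relabel[t] if types_by_key[(version, k)] == {"delete", "insert"} else t, k, v)
        for version, t, k, v in rows
    ]


def net_changes(rows):
    """Collapse all changes per key into at most one net change (rows in version order)."""
    first_before, last_after = {}, {}
    for _, t, k, v in rows:
        removes = t in ("delete", "update_preimage")
        if k not in first_before:
            first_before[k] = v if removes else _ABSENT
        last_after[k] = _ABSENT if removes else v
    out = []
    for k, before in first_before.items():
        after = last_after[k]
        if before is _ABSENT and after is not _ABSENT:
            out.append(("insert", k, after))
        elif before is not _ABSENT and after is _ABSENT:
            out.append(("delete", k, before))
        elif before is not _ABSENT and before != after:
            out.append(("update_preimage", k, before))
            out.append(("update_postimage", k, after))
    return out


# Made-up sample data: "a" is inserted then updated, "b" is a carry-over row,
# "c" is inserted and later deleted within the range.
rows = [
    (11, "insert", "a", 1),
    (12, "delete", "a", 1), (12, "insert", "a", 2),
    (12, "delete", "b", 7), (12, "insert", "b", 7),
    (13, "insert", "c", 5),
    (14, "delete", "c", 5),
]
cleaned = drop_carry_over(rows)
print(net_changes(derive_updates(cleaned)))
```

In this toy data the carry-over rows for "b" disappear in the first step, the version-12 change to "a" becomes an update pair in the second, and the net result over the whole range is a single insert of "a" with its final value, since "c" was both created and removed inside the range.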
