+1 (non-binding)

On Tue, Mar 3, 2026 at 5:07 PM Mich Talebzadeh <[email protected]>
wrote:

> +1
>
> Dr Mich Talebzadeh,
> Data Scientist | Distributed Systems (Spark) | Financial Forensics &
> Metadata Analytics | Transaction Reconstruction | Audit & Evidence-Based
> Analytics
>
>    view my Linkedin profile
> <https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>
>
> On Wed, 4 Mar 2026 at 00:57, Gengliang Wang <[email protected]> wrote:
>
>> Hi Spark devs,
>>
>> I'd like to call a vote on the SPIP: *Change Data Capture (CDC) Support*
>>
>> *Summary:*
>>
>> This SPIP proposes a unified approach by adding a CHANGES SQL clause and
>> corresponding DataFrame/DataStream APIs that work across DSv2 connectors.
>>
>> 1. Standardized User API
>>
>> SQL:
>>
>> -- Batch: What changed between version 10 and 20?
>> SELECT * FROM my_table CHANGES FROM VERSION 10 TO VERSION 20;
>>
>> -- Streaming: Continuously process changes
>> CREATE STREAMING TABLE cdc_sink AS
>> SELECT * FROM STREAM my_table CHANGES FROM VERSION 0;
>>
>> DataFrame API:
>>
>> spark.read
>>   .option("startingVersion", "10")
>>   .option("endingVersion", "20")
>>   .changes("my_table")
>>
>> 2. Engine-Level Post Processing
>>
>> Under the hood, this proposal introduces a minimal Changelog interface
>> for DSv2 connectors. Spark's Catalyst optimizer will take over the CDC
>> post-processing, including:
>>
>>    - Filtering out copy-on-write carry-over rows.
>>    - Deriving pre-image/post-image updates from raw insert/delete pairs.
>>    - Computing net changes.
>>
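A minimal pure-Python sketch of the three post-processing steps listed above, for illustration only. It is not the SPIP's Changelog interface or Catalyst code. The row shape `(version, change_type, key, value)`, the labels `update_preimage` / `update_postimage`, and all function names are assumptions made for this example, and it assumes at most one delete and one insert per key per version.

```python
from collections import Counter, defaultdict

def drop_carry_over(rows):
    # A carry-over row is rewritten unchanged by copy-on-write: the same
    # (version, key, value) shows up once as a delete and once as an insert.
    deletes = Counter((v, k, val) for v, t, k, val in rows if t == "delete")
    inserts = Counter((v, k, val) for v, t, k, val in rows if t == "insert")
    pairs = deletes & inserts  # multiset intersection = matched pairs
    budget = {"delete": Counter(pairs), "insert": Counter(pairs)}
    kept = []
    for v, t, k, val in rows:
        if budget[t][(v, k, val)] > 0:
            budget[t][(v, k, val)] -= 1  # cancel one half of a matched pair
        else:
            kept.append((v, t, k, val))
    return kept

def derive_updates(rows):
    # A delete and an insert for the same key in the same version
    # are reported as one update (pre-image, then post-image).
    by_key = defaultdict(dict)
    for v, t, k, val in rows:
        by_key[(v, k)][t] = val
    out = []
    for (v, k), changes in sorted(by_key.items()):
        if "delete" in changes and "insert" in changes:
            out.append((v, "update_preimage", k, changes["delete"]))
            out.append((v, "update_postimage", k, changes["insert"]))
        else:
            out.extend((v, t, k, val) for t, val in changes.items())
    return out

def net_changes(rows):
    # Per key, keep only (value before the range, value after the range);
    # None means the row did not exist on that side.
    before, after = {}, {}
    for v, t, k, val in sorted(rows, key=lambda r: r[0]):
        removes = t in ("delete", "update_preimage")
        if k not in before:
            before[k] = val if removes else None
        after[k] = None if removes else val
    return {k: (before[k], after[k]) for k in before if before[k] != after[k]}

rows = [
    (11, "delete", "a", 1), (11, "insert", "a", 1),  # carry-over
    (11, "delete", "b", 2), (11, "insert", "b", 3),  # update
    (12, "insert", "c", 9),
    (13, "delete", "c", 9),                          # insert then delete
    (13, "delete", "b", 3), (13, "insert", "b", 4),  # update
]
derived = derive_updates(drop_carry_over(rows))
net = net_changes(derived)
```

In this example the rewrite of key `a` is dropped as carry-over, key `b` collapses to a single net change from 2 to 4, and key `c` disappears from the net result because it was inserted and deleted within the range.
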
>>
>> *Relevant Links:*
>>
>>    - *SPIP Doc: *
>>    https://docs.google.com/document/d/1-4rCS3vsGIyhwnkAwPsEaqyUDg-AuVkdrYLotFPw0U0/edit?usp=sharing
>>    - *Discuss Thread: *
>>    https://lists.apache.org/thread/dhxx6pohs7fvqc3knzhtoj4tbcgrwxts
>>    - *JIRA: *https://issues.apache.org/jira/browse/SPARK-55668
>>
>>
>> *The vote will be open for at least 72 hours.* Please vote:
>>
>> [ ] +1: Accept the proposal as an official SPIP
>>
>> [ ] +0
>>
>> [ ] -1: I don't think this is a good idea because ...
>>
>> Thanks,
>> Gengliang Wang
>>
>
