Re: CDC with "copy on write"

2020-12-22 Thread Ryan Blue
Hi Ashish, I think it makes sense for your use case (#1949) to expose a way to read overwrites using incremental scans, but I'm not sure how best to expose it. This is safe for your case because you know which records were deleted based on the ID, so you're basically replaying the data as incremen
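A rough way to picture the replay idea in the reply above: if every record carries a unique ID, a consumer can apply the rows from the files added by an overwrite as upserts, because the ID identifies which earlier record each row supersedes. The sketch below is a plain-Python illustration only, not Iceberg's API. `replay_overwrite`, the `target` dict, and the `id` column name are all hypothetical, and it does not cover rows deleted outright with no replacement.

```python
from typing import Dict, Hashable, Iterable, Mapping

Row = Mapping[str, object]


def replay_overwrite(
    target: Dict[Hashable, Row],
    added_rows: Iterable[Row],
    key: str = "id",  # hypothetical primary-key column name
) -> int:
    """Apply rows from the files added by an overwrite as upserts keyed by ID.

    Rows rewritten unchanged by copy-on-write are no-ops, so replaying a
    whole rewritten file is safe. Returns the number of rows that changed.
    """
    changed = 0
    for row in added_rows:
        row_id = row[key]
        if target.get(row_id) != row:
            target[row_id] = dict(row)  # last write wins for this ID
            changed += 1
    return changed


# Downstream state before the overwrite, keyed by ID.
state = {1: {"id": 1, "v": "a"}, 2: {"id": 2, "v": "b"}}

# A rewritten file: row 1 is unchanged, row 2 is updated, row 3 is new.
rewritten = [{"id": 1, "v": "a"}, {"id": 2, "v": "B"}, {"id": 3, "v": "c"}]
changed_count = replay_overwrite(state, rewritten)
```

The unchanged row is skipped by the equality check, which is what makes replaying an entire rewritten file harmless in this sketch.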

CDC with "copy on write"

2020-12-18 Thread Ashish Mehta
Hi, We have been working to support row-level updates in our data ingestion pipeline using Spark's batch API (no streaming use case for now). Currently, we are looking to adhere to the "copy on write" implementation of DELETE and MERGE INTO (WIP via #1947). Our main use case is primary key