
Re: CDC with "copy on write"

2020-12-22 Thread Ryan Blue

Hi Ashish, I think it makes sense for your use case (#1949) to expose a way to read overwrites using incremental scans, but I'm not sure how best to expose it. This is safe for your case because you know which records were deleted based on the ID, so you're basically replaying the data as incremen
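To illustrate the idea of replaying an overwrite as incremental changes when every record carries a stable ID, here is a minimal sketch in plain Python. It does not use the Iceberg API; the function name `diff_by_key`, the `id` column, and the in-memory row format are all assumptions made for illustration. It compares the rows removed by an overwrite with the rows added by it and emits insert, update, and delete events keyed by ID.

```python
from typing import Dict, Iterable, List, Tuple

Row = Dict[str, object]


def diff_by_key(
    before: Iterable[Row], after: Iterable[Row], key: str = "id"
) -> List[Tuple[str, Row]]:
    """Derive change events from the rows removed and added by an overwrite.

    Hypothetical helper: `before` holds rows from the replaced files,
    `after` holds rows from the files written by the overwrite.
    """
    old = {row[key]: row for row in before}
    new = {row[key]: row for row in after}

    changes: List[Tuple[str, Row]] = []
    for k, row in old.items():
        if k not in new:
            # The ID disappeared, so the record was deleted.
            changes.append(("delete", row))
        elif new[k] != row:
            # Same ID, different content: treat as an update.
            changes.append(("update", new[k]))
    for k, row in new.items():
        if k not in old:
            # The ID is new, so the record was inserted.
            changes.append(("insert", row))
    return changes


before_rows = [{"id": 1, "v": "a"}, {"id": 2, "v": "b"}, {"id": 3, "v": "c"}]
after_rows = [{"id": 1, "v": "a"}, {"id": 2, "v": "B"}, {"id": 4, "v": "d"}]
events = diff_by_key(before_rows, after_rows)
```

This only works when IDs are unique within each side, which is the assumption that makes the replay safe in the primary-key case; rows rewritten unchanged by copy-on-write (ID 1 above) produce no event.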

CDC with "copy on write"

2020-12-18 Thread Ashish Mehta

Hi, We have been working to support row-level updates in our data ingestion pipeline using Spark's batch API (no streaming use case for now). Currently, we are looking to adhere to the "copy on write" implementation of DELETE and MERGE INTO (WIP via #1947). Our main use case is primary key