Hello Jason,

Thank you for the reply.

My use case is this: the first time, I do a full load with transformations, aggregations, and joins, and write the result to Parquet (as staging). From the next run onwards, my source is MSSQL Server, and I want to pull only the records that were changed or updated, and update them in the Parquet data as well, if that is possible without side effects.

https://docs.microsoft.com/en-us/sql/relational-databases/track-changes/work-with-change-tracking-sql-server?view=sql-server-2017

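To make the merge step concrete, here is a minimal sketch in plain Python (not Spark, so it runs standalone). It assumes each row has a primary key column, here hypothetically called `id`, and that each changed row arrives tagged with an operation code `I`, `U`, or `D`, which is how I understand SQL Server change tracking reports inserts, updates, and deletes. The function name `apply_changes` and the row layout are illustrative only.

```python
from typing import Dict, Iterable, List, Tuple

Row = Dict[str, object]


def apply_changes(
    snapshot: Iterable[Row],
    changes: Iterable[Tuple[str, Row]],
    key: str = "id",  # hypothetical primary key column
) -> List[Row]:
    """Return a new snapshot with the changes applied; inputs are not mutated."""
    # Index the existing staging rows by primary key.
    merged = {row[key]: dict(row) for row in snapshot}
    for op, row in changes:
        if op == "D":
            # Delete: drop the row if it is present.
            merged.pop(row[key], None)
        elif op in ("I", "U"):
            # Insert or update: the changed row replaces any older version.
            merged[row[key]] = dict(row)
        else:
            raise ValueError(f"unknown change operation: {op!r}")
    # Sort by key so the rewritten snapshot has a stable order.
    return [merged[k] for k in sorted(merged)]


staging = [
    {"id": 1, "amount": 10},
    {"id": 2, "amount": 20},
    {"id": 3, "amount": 30},
]
changes = [
    ("U", {"id": 2, "amount": 25}),
    ("D", {"id": 3}),
    ("I", {"id": 4, "amount": 40}),
]
new_staging = apply_changes(staging, changes)
```

Since Parquet files cannot be edited in place, the result would be written out as a new snapshot that replaces the old one. In Spark the same idea would probably be an anti-join of the existing data against the changed keys, a union with the inserted and updated rows, and a write to a fresh location.
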
On Tue, Apr 23, 2019 at 3:02 AM Jason Nerothin <jasonnerot...@gmail.com> wrote:

> Hi Chetan,
>
> Do you have to use Parquet?
>
> It just feels like it might be the wrong sink for a high-frequency change
> scenario.
>
> What are you trying to accomplish?
>
> Thanks,
> Jason
>
> On Mon, Apr 22, 2019 at 2:09 PM Chetan Khatri <chetan.opensou...@gmail.com>
> wrote:
>
>> Hello All,
>>
>> If I am doing incremental load / delta and would like to update / delete
>> the records in parquet, I understands that parquet is immutable and can't
>> be deleted / updated theoretically only append / overwrite can be done. But
>> I can see utility tools which claims to add value for that.
>>
>> https://github.com/Factual/parquet-rewriter
>>
>> Please throw a light.
>>
>> Thanks
>>
>
>
> --
> Thanks,
> Jason
>