Re: How can I use pyspark to upsert one row without replacing entire table

ed elliott Wed, 12 Aug 2020 14:56:24 -0700

You’ll need to do an insert and use a trigger on the table to turn it into an 
upsert. Also make sure your write mode is append rather than overwrite.
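
A rough sketch of what that could look like, under assumptions that are not 
from this thread. write_changed_rows is a hypothetical helper around PySpark's 
DataFrameWriter.jdbc with mode="append". The trigger side is shown with 
Python's built-in sqlite3 as a stand-in, because Sybase trigger syntax is 
different and would need to be adapted. The people_staging table and the 
trigger name are made up for illustration, and appending to a staging table is 
only one possible way to arrange the trigger.

```python
import sqlite3


def write_changed_rows(df, jdbc_url, table, properties):
    """Append only the changed rows over JDBC.

    `df` is expected to be a pyspark.sql.DataFrame holding just the rows to
    upsert. mode="append" issues INSERTs and leaves existing rows in place,
    whereas mode="overwrite" would replace the whole table.
    """
    df.write.jdbc(url=jdbc_url, table=table, mode="append", properties=properties)


# Hypothetical schema (SQLite syntax, not Sybase): Spark appends into
# people_staging, and the trigger applies each staged row to people as an
# update-or-insert keyed on id.
DEMO_SCHEMA = """
CREATE TABLE people (id INTEGER PRIMARY KEY, fname TEXT);
CREATE TABLE people_staging (id INTEGER, fname TEXT);

CREATE TRIGGER people_staging_upsert AFTER INSERT ON people_staging
BEGIN
    UPDATE people SET fname = NEW.fname WHERE id = NEW.id;
    INSERT INTO people (id, fname)
        SELECT NEW.id, NEW.fname
        WHERE NOT EXISTS (SELECT 1 FROM people WHERE id = NEW.id);
END;
"""


def make_demo_db():
    """Create an in-memory database seeded with the three rows from the question."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(DEMO_SCHEMA)
    conn.executemany(
        "INSERT INTO people (id, fname) VALUES (?, ?)",
        [(1, "john"), (2, "Steve"), (3, "Jack")],
    )
    return conn


def demo():
    """Stage one changed row and return the resulting contents of people."""
    conn = make_demo_db()
    # This plain INSERT is what an append-mode write would send to the database.
    conn.execute("INSERT INTO people_staging (id, fname) VALUES (2, 'Michael')")
    return conn.execute("SELECT id, fname FROM people ORDER BY id").fetchall()
```

On the Spark side you would call write_changed_rows with a DataFrame that 
contains only the changed rows, plus the JDBC URL and driver properties for 
your own database. The other rows are never read or rewritten.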


Ed

________________________________
From: Siavash Namvar <sns...@gmail.com>
Sent: Wednesday, August 12, 2020 4:09:07 PM
To: Sean Owen <sro...@gmail.com>
Cc: User <user@spark.apache.org>
Subject: Re: How can I use pyspark to upsert one row without replacing entire 
table

Thanks Sean,

Do you have a URL or reference on how to do an upsert in Spark? I need to 
update a Sybase database.

On Wed, Aug 12, 2020 at 11:06 AM Sean Owen <sro...@gmail.com> wrote:
It's not so much Spark but the data format, whether it supports
upserts. Parquet, CSV, JSON, etc would not.
That is what Delta, Hudi et al are for, and yes you can upsert them in Spark.
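
For the Delta case, an upsert could look roughly like the sketch below. It 
assumes the delta-spark package and a table that is already stored in Delta 
format, so it does not apply to a plain JDBC database. upsert_into_delta and 
the "id" key column are hypothetical names chosen for illustration.

```python
def upsert_into_delta(delta_table, updates_df, key="id"):
    """Merge updates_df into delta_table, matching rows on the key column.

    `delta_table` is expected to be a delta.tables.DeltaTable, for example
    obtained with DeltaTable.forPath(spark, path), and `updates_df` a
    pyspark.sql.DataFrame holding only the changed or new rows.
    """
    condition = f"target.{key} = source.{key}"
    (
        delta_table.alias("target")
        .merge(updates_df.alias("source"), condition)
        .whenMatchedUpdateAll()      # existing key: overwrite that row's columns
        .whenNotMatchedInsertAll()   # new key: insert the row
        .execute()
    )
```

Only rows whose key appears in updates_df are touched by the merge.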

On Wed, Aug 12, 2020 at 9:57 AM Siavash Namvar <sns...@gmail.com> wrote:
>
> Hi,
>
> I have a use case where I read data from a db table and need to update a few 
> rows based on primary key without replacing the entire table.
>
> For instance, if I have the following 3 rows:
>
> -------------------
> id | fname
> -------------------
>  1 | john
> -------------------
>  2 | Steve
> -------------------
>  3 | Jack
> -------------------
>
> I would like to update the row with id=2 from Steve to Michael without 
> replacing the entire table, so that the output looks like:
>
> -------------------
> id | fname
> -------------------
>  1 | john
> -------------------
>  2 | Michael
> -------------------
>  3 | Jack
> -------------------
>
> Keep in mind that the actual db table is huge and the database is old, so I 
> cannot read and replace the entire table.
>
> Thanks
