Hi Jake,

This is an issue across all RDBMSs, including Oracle. When you update rows
you have to commit or roll back within the RDBMS itself, and I am not aware
of Spark doing that for you.

The staging table is the safer method, as it follows an ETL-type approach.
You write the new data to a staging table in the RDBMS and then do the DML
in the RDBMS itself, where you can control commit or rollback. That is the
way I would do it. A simple shell script can drive both steps.
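To make that concrete, here is a minimal sketch of the two steps in SparkR,
assuming the computed results are already in a SparkDataFrame called
resultsDF. The connection details and the table/column names
(staging_results, target_results, id, score) are hypothetical, and the
MySQL-side DML is issued through DBI/RMySQL rather than Spark, since
Spark's JDBC writer can only append or overwrite:

  library(SparkR)
  library(DBI)

  # Step 1: write the computed results to a staging table via Spark's
  # JDBC writer. mode = "overwrite" recreates the staging table each run.
  write.jdbc(resultsDF,
             url       = "jdbc:mysql://dbhost:3306/mydb",
             tableName = "staging_results",
             mode      = "overwrite",
             user      = "myuser",
             password  = "mypassword")

  # Step 2: do the UPDATE on the MySQL side, inside a transaction you
  # control (assumes the RMySQL driver, which supports dbBegin/dbCommit).
  con <- dbConnect(RMySQL::MySQL(), dbname = "mydb", host = "dbhost",
                   user = "myuser", password = "mypassword")
  dbBegin(con)
  tryCatch({
    dbExecute(con, "
      UPDATE target_results t
      JOIN   staging_results s ON t.id = s.id
      SET    t.score = s.score")
    dbCommit(con)
  }, error = function(e) dbRollback(con))
  dbDisconnect(con)

If you prefer to drive it from a shell script, the same two steps map to a
spark-submit call followed by mysql -e with the UPDATE statement.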

HTH



Dr Mich Talebzadeh



LinkedIn:
https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw



http://talebzadehmich.wordpress.com


Disclaimer: Use it at your own risk. Any and all responsibility for any
loss, damage or destruction of data or any other property which may arise
from relying on this email's technical content is explicitly disclaimed.
The author will in no case be liable for any monetary damages arising from
such loss, damage or destruction.



On 21 August 2017 at 15:50, Jake Russ <jr...@bloomintelligence.com> wrote:

> Hi everyone,
>
>
>
> I’m currently using SparkR to read data from a MySQL database, perform
> some calculations, and then write the results back to MySQL. Is it still
> true that Spark does not support UPDATE queries via JDBC? I’ve seen many
> posts stating that Spark’s DataFrameWriter does not support UPDATE
> queries via JDBC <https://issues.apache.org/jira/browse/SPARK-19335>;
> it will only “append” to or “overwrite” existing tables. The best advice
> I’ve found so far for performing this update is to write to a staging
> table in MySQL
> <https://stackoverflow.com/questions/34643200/spark-dataframes-upsert-to-postgres-table>
> and then perform the UPDATE query on the MySQL side.
>
>
>
> Ideally, I’d like to handle the update during the write operation. Has
> anyone else encountered this limitation and have a better solution?
>
>
>
> Thank you,
>
>
>
> Jake
>
