Hi Jake,

This is an issue across all RDBMSs, Oracle included. When you update rows you have to commit or roll back in the RDBMS itself, and I am not aware of Spark doing that over JDBC.

The staging table is the safer method, as it follows the classic ETL approach: Spark writes the new data to a staging table in the RDBMS, and the DML against the target table is done in the RDBMS itself, where you can control the commit or rollback. That is the way I would do it. A simple shell script can drive both steps.
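For illustration, a minimal sketch (untested), assuming SparkR 2.x with the MySQL JDBC driver on the classpath; the names resultsDF, mydb, dbhost, results, results_staging and the join key id are hypothetical, not from your setup:

  library(SparkR)

  # Step 1 (Spark side): overwrite the staging table. Append/overwrite is
  # all the JDBC writer supports, and that is all we need here.
  write.jdbc(resultsDF,
             url       = "jdbc:mysql://dbhost:3306/mydb",
             tableName = "results_staging",
             mode      = "overwrite",
             user      = "etl_user",
             password  = Sys.getenv("MYSQL_PWD"))

  # Step 2 (MySQL side): run the DML in the database itself, where you
  # control commit/rollback, e.g. from the same shell script:
  #
  #   mysql --batch mydb <<'SQL'
  #   START TRANSACTION;
  #   UPDATE results r
  #     JOIN results_staging s ON r.id = s.id
  #      SET r.value = s.value;
  #   COMMIT;
  #   SQL

If the staging data can contain new rows as well as changed ones, an INSERT ... SELECT ... ON DUPLICATE KEY UPDATE in step 2 gives you upsert semantics instead.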
HTH,

Dr Mich Talebzadeh

LinkedIn: https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
http://talebzadehmich.wordpress.com

*Disclaimer:* Use it at your own risk. Any and all responsibility for any loss, damage or destruction of data or any other property which may arise from relying on this email's technical content is explicitly disclaimed. The author will in no case be liable for any monetary damages arising from such loss, damage or destruction.

On 21 August 2017 at 15:50, Jake Russ <jr...@bloomintelligence.com> wrote:

> Hi everyone,
>
> I’m currently using SparkR to read data from a MySQL database, perform
> some calculations, and then write the results back to MySQL. Is it still
> true that Spark does not support UPDATE queries via JDBC? I’ve seen many
> posts saying that Spark’s DataFrameWriter does not support UPDATE queries
> via JDBC <https://issues.apache.org/jira/browse/SPARK-19335>; it will only
> “append” to or “overwrite” existing tables. The best advice I’ve found so
> far, for performing this update, is to write to a staging table in MySQL
> <https://stackoverflow.com/questions/34643200/spark-dataframes-upsert-to-postgres-table>
> and then perform the UPDATE query on the MySQL side.
>
> Ideally, I’d like to handle the update during the write operation. Has
> anyone else encountered this limitation and found a better solution?
>
> Thank you,
>
> Jake