Thank you very much. I understand the performance implications, and that Spark will download it before modifying. The JDBC database is extremely small, though; it’s the BI/aggregated layer.
What’s interesting is that the documentation here says I can use JDBC:
https://spark.apache.org/docs/3.3.1/sql-ref-syntax-dml-insert-overwrite-directory.html

But when I try to, I get an error saying that the underlying datastore should be file-based, so I guess it is a documentation mistake.

Thank you once again.

> On 2 Feb 2023, at 23:11, Mich Talebzadeh <mich.talebza...@gmail.com> wrote:
>
> Please bear in mind that insert/update/delete operations are DML, whereas
> CREATE/DROP TABLE are DDL operations that are best performed in the native
> database, which I presume is transactional.
>
> Can you CREATE TABLE before (any insert of data) using the native JDBC
> database syntax?
>
> Alternatively, you may be able to do so in Python or Scala, but I don't know
> of a way in pure SQL.
>
> If your JDBC database is Hive, you can do so easily.
>
> HTH
>
> view my Linkedin profile
> <https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>
>
> https://en.everybodywiki.com/Mich_Talebzadeh
>
> Disclaimer: Use it at your own risk. Any and all responsibility for any loss,
> damage or destruction of data or any other property which may arise from
> relying on this email's technical content is explicitly disclaimed. The
> author will in no case be liable for any monetary damages arising from such
> loss, damage or destruction.
>
> On Thu, 2 Feb 2023 at 17:26, Harut Martirosyan <harut.martiros...@gmail.com> wrote:
>> Generally, the problem is that I can’t find a way to automatically create a
>> JDBC table in the JDBC database when I want to insert data into it using
>> Spark SQL only, not the DataFrames API.
>>
>>> On 2 Feb 2023, at 21:22, Harut Martirosyan <harut.martiros...@gmail.com> wrote:
>>>
>>> Hi, thanks for the reply.
>>>
>>> Let’s imagine we have a Parquet-based table called parquet_table, and I
>>> want to insert it into a new JDBC table, all using pure SQL.
>>>
>>> If the JDBC table already exists, it’s easy: we do CREATE TABLE USING JDBC
>>> and then INSERT INTO that table.
>>>
>>> If the table doesn’t exist, is there a way to create it using Spark SQL
>>> only? I don’t want to use the DataFrames API. I know that I can use .write()
>>> for that, but I want to keep it in pure SQL, since that is more
>>> comprehensible for data analysts.
>>>
>>>> On 2 Feb 2023, at 02:08, Mich Talebzadeh <mich.talebza...@gmail.com> wrote:
>>>>
>>>> Hi,
>>>>
>>>> Your statement below is not very clear:
>>>>
>>>> ".. If the table existed, I would create a table using JDBC in spark SQL
>>>> and then insert into it, but I can't create a table if it doesn't exist in
>>>> JDBC database..."
>>>>
>>>> If the table exists in your JDBC database, why do you need to create it?
>>>>
>>>> How do you verify if it exists? Can you share the code and the doc link?
>>>>
>>>> HTH
>>>>
>>>> On Wed, 1 Feb 2023 at 19:33, Harut Martirosyan <harut.martiros...@gmail.com> wrote:
>>>>> I have a result set (defined in SQL), and I want to insert it into my JDBC
>>>>> database using only SQL, not the DataFrames API.
>>>>>
>>>>> If the table existed, I would create a table using JDBC in spark SQL and
>>>>> then insert into it, but I can't create a table if it doesn't exist in
>>>>> JDBC database.
>>>>>
>>>>> How can I do that using pure SQL (no Python/Scala/Java)?
>>>>>
>>>>> I am trying to use INSERT OVERWRITE DIRECTORY with the JDBC file format
>>>>> (according to the documentation), but as expected this functionality is
>>>>> available only for file-based storage systems.
>>>>>
>>>>> --
>>>>> RGRDZ Harut
>>>
>>
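
For reference, the existing-table case described in this thread might be sketched in Spark SQL roughly as follows. This is only a sketch under stated assumptions: it uses the CREATE TEMPORARY VIEW ... USING org.apache.spark.sql.jdbc form shown in Spark's JDBC data source documentation (the thread itself mentions CREATE TABLE USING JDBC), and the view name, connection URL, remote table name, and credentials are hypothetical placeholders. It also assumes the target table already exists in the JDBC database and that the JDBC driver is on Spark's classpath.

```sql
-- Assumption: public.daily_summary already exists in the JDBC database.
-- All names, the URL, and the credentials below are hypothetical placeholders.
CREATE TEMPORARY VIEW jdbc_target
USING org.apache.spark.sql.jdbc
OPTIONS (
  url 'jdbc:postgresql://dbhost:5432/bi',
  dbtable 'public.daily_summary',
  user 'spark_writer',
  password 'change_me'
);

-- Append the rows of the Parquet-backed table from the thread to the JDBC table.
INSERT INTO TABLE jdbc_target
SELECT * FROM parquet_table;
```

This covers only the case where the target table is already present. Whether the missing-table case can be handled in pure SQL is the open question of the thread, and the sketch does not answer it.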