@Mich Talebzadeh <mich.talebza...@gmail.com> there seems to be a misunderstanding here. Spark native data source tables are still stored in the Hive metastore; the only difference is that Spark uses a different (and faster) reader/writer for them. Your `hive-site.xml` setup should keep working as it does today.
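To make the effect of the conf concrete, here is a toy Python model of how a `CREATE TABLE` statement picks its provider. This is only a sketch of my understanding, not Spark code: `resolve_provider` and its parameters are hypothetical names, and it assumes that a statement with an explicit `USING`, `STORED AS`, or `ROW FORMAT` clause is unaffected by the conf. In every case the table is still registered in the Hive metastore; only the reader/writer differs.

```python
from typing import Optional


def resolve_provider(
    using: Optional[str] = None,
    has_hive_clauses: bool = False,
    create_hive_table_by_default: bool = False,
    default_source: str = "parquet",
) -> str:
    """Toy model (hypothetical helper, not a Spark API).

    using: format named in a USING clause, if any.
    has_hive_clauses: True when the DDL has STORED AS or ROW FORMAT.
    create_hive_table_by_default: value of
        spark.sql.legacy.createHiveTableByDefault.
    default_source: value of spark.sql.sources.default.
    """
    # An explicit USING clause always decides the provider.
    if using is not None:
        return using.lower()
    # Hive-style clauses produce a Hive table regardless of the conf.
    if has_hive_clauses:
        return "hive"
    # Only a bare CREATE TABLE is affected by the legacy conf.
    return "hive" if create_hive_table_by_default else default_source.lower()


# Bare CREATE TABLE under the old and the proposed default.
print(resolve_provider(create_hive_table_by_default=True))
print(resolve_provider(create_hive_table_by_default=False))
# Explicit clauses behave the same under either setting.
print(resolve_provider(using="ORC"))
print(resolve_provider(has_hive_clauses=True))
```

So existing scripts that spell out `USING` or `STORED AS` should see no change; only a bare `CREATE TABLE` switches from a Hive table to the default data source.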

On Tue, Apr 30, 2024 at 5:23 AM Hyukjin Kwon <gurwls...@apache.org> wrote:

> +1
>
> It's a legacy conf that we should eventually remove. Spark should
> create Spark tables by default, not Hive tables.
>
> Mich, for your workload, you can simply switch that conf off if it
> concerns you. We also enabled ANSI (which you agreed on). It's a bit
> awkward to stop in the middle for this compatibility reason while making
> Spark sound. The compatibility has been tested in production for a long
> time, so I don't see any particular issue with the compatibility case you
> mentioned.
>
> On Mon, Apr 29, 2024 at 2:08 AM Mich Talebzadeh <mich.talebza...@gmail.com>
> wrote:
>
>> Hi @Wenchen Fan <cloud0...@gmail.com>
>>
>> Thanks for your response. I believe we have not had enough time to
>> "DISCUSS" this matter.
>>
>> Currently, in order to make Spark take advantage of Hive, I create a soft
>> link in $SPARK_HOME/conf. FYI, my Spark version is 3.4.0 and Hive is 3.1.1.
>>
>> /opt/spark/conf/hive-site.xml ->
>> /data6/hduser/hive-3.1.1/conf/hive-site.xml
>>
>> This works fine for me in my lab. So in the future, if we opt to set
>> "spark.sql.legacy.createHiveTableByDefault" to false, will there no
>> longer be a need for this logical link?
>> On the face of it, this looks fine, but in real life it may require a
>> number of changes to old scripts. Hence my concern.
>> As a matter of interest, has anyone liaised with the Hive team to ensure
>> they have introduced the additional changes you outlined?
>>
>> HTH
>>
>> Mich Talebzadeh,
>> Technologist | Architect | Data Engineer | Generative AI | FinCrime
>> London
>> United Kingdom
>>
>> view my Linkedin profile
>> <https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>
>>
>> https://en.everybodywiki.com/Mich_Talebzadeh
>>
>> *Disclaimer:* The information provided is correct to the best of my
>> knowledge but of course cannot be guaranteed. It is essential to note
>> that, as with any advice, quote "one test result is worth one-thousand
>> expert opinions (Wernher von Braun
>> <https://en.wikipedia.org/wiki/Wernher_von_Braun>)".
>>
>> On Sun, 28 Apr 2024 at 09:34, Wenchen Fan <cloud0...@gmail.com> wrote:
>>
>>> @Mich Talebzadeh <mich.talebza...@gmail.com> thanks for sharing your
>>> concern!
>>>
>>> Note: Spark native data source tables are usually Hive compatible as
>>> well, unless we use features that Hive does not support
>>> (TIMESTAMP NTZ, ANSI INTERVAL, etc.). I think it's a better default to
>>> create a Spark native table in this case, instead of creating a Hive
>>> table and failing.
>>>
>>> On Sat, Apr 27, 2024 at 12:46 PM Cheng Pan <pan3...@gmail.com> wrote:
>>>
>>>> +1 (non-binding)
>>>>
>>>> Thanks,
>>>> Cheng Pan
>>>>
>>>> On Sat, Apr 27, 2024 at 9:29 AM Holden Karau <holden.ka...@gmail.com>
>>>> wrote:
>>>> >
>>>> > +1
>>>> >
>>>> > Twitter: https://twitter.com/holdenkarau
>>>> > Books (Learning Spark, High Performance Spark, etc.):
>>>> > https://amzn.to/2MaRAG9
>>>> > YouTube Live Streams: https://www.youtube.com/user/holdenkarau
>>>> >
>>>> > On Fri, Apr 26, 2024 at 12:06 PM L. C. Hsieh <vii...@gmail.com>
>>>> > wrote:
>>>> >>
>>>> >> +1
>>>> >>
>>>> >> On Fri, Apr 26, 2024 at 10:01 AM Dongjoon Hyun <dongj...@apache.org>
>>>> >> wrote:
>>>> >> >
>>>> >> > I'll start with my +1.
>>>> >> >
>>>> >> > Dongjoon.
>>>> >> >
>>>> >> > On 2024/04/26 16:45:51 Dongjoon Hyun wrote:
>>>> >> > > Please vote on SPARK-46122 to set
>>>> >> > > spark.sql.legacy.createHiveTableByDefault
>>>> >> > > to `false` by default. The technical scope is defined in the
>>>> >> > > following PR.
>>>> >> > >
>>>> >> > > - DISCUSSION:
>>>> >> > > https://lists.apache.org/thread/ylk96fg4lvn6klxhj6t6yh42lyqb8wmd
>>>> >> > > - JIRA: https://issues.apache.org/jira/browse/SPARK-46122
>>>> >> > > - PR: https://github.com/apache/spark/pull/46207
>>>> >> > >
>>>> >> > > The vote is open until April 30th 1AM (PST) and passes
>>>> >> > > if a majority +1 PMC votes are cast, with a minimum of 3 +1
>>>> >> > > votes.
>>>> >> > >
>>>> >> > > [ ] +1 Set spark.sql.legacy.createHiveTableByDefault to false by
>>>> >> > > default
>>>> >> > > [ ] -1 Do not change spark.sql.legacy.createHiveTableByDefault
>>>> >> > > because ...
>>>> >> > >
>>>> >> > > Thank you in advance.
>>>> >> > >
>>>> >> > > Dongjoon
>>>> >> > >
>>>> >> >
>>>> >> > ---------------------------------------------------------------------
>>>> >> > To unsubscribe e-mail: dev-unsubscr...@spark.apache.org