> I'm not sure there's an upgrade path before Spark 4.0. Any ideas?

We can at least separate the concerns. We can remove the runtime modules that are the main issue. If we compile against an older version of the Hive metastore module (leaving it unchanged) that at least has a dramatically reduced surface area for Java version issues. As long as the API is compatible (and we haven't heard complaints that it is not) then I think users can override the version in their environments.

Ryan

On Sun, Dec 15, 2024 at 5:55 PM Manu Zhang <owenzhang1...@gmail.com> wrote:

> Hi Daniel,
> I'll start a vote once I get the PR ready.
>
> Hi Ryan,
> Sorry, I wasn't clear in the last email that the consensus is to upgrade
> Hive metastore support.
>
> Well, I was too optimistic about the upgrade. Spark has only added Hive
> 4.0 metastore support recently for Spark 4.0[1] and there will be conflicts
> between Spark's Hive 2.3.9 and our Hive 4.0 dependencies.
> I'm not sure there's an upgrade path before Spark 4.0. Any ideas?
>
> 1. https://issues.apache.org/jira/browse/SPARK-45265
>
> Thanks,
> Manu
>
>
> On Sat, Dec 14, 2024 at 4:31 AM rdb...@gmail.com <rdb...@gmail.com> wrote:
>
>> Oh, I think I see. The upgrade to Hive 4 is just for the Hive metastore
>> support? When I read the thread, I thought that we weren't going to change
>> the metastore. That seems reasonable to me. Sorry for the confusion.
>>
>> On Fri, Dec 13, 2024 at 10:24 AM rdb...@gmail.com <rdb...@gmail.com>
>> wrote:
>>
>>> Sorry, I must have missed something. I don't think that we should
>>> upgrade anything in Iceberg to Hive 4. Why not simply remove the Hive
>>> support entirely? Why would anyone need Hive 4 support from Iceberg when it
>>> is built into Hive 4?
>>>
>>> On Thu, Dec 12, 2024 at 11:03 AM Daniel Weeks <dwe...@apache.org> wrote:
>>>
>>>> Hey Manu,
>>>>
>>>> I agree with the direction here, but we should probably hold a quick
>>>> procedural vote just to confirm since this is a significant change in
>>>> support for Hive.
>>>>
>>>> -Dan
>>>>
>>>> On Wed, Dec 11, 2024 at 5:19 PM Manu Zhang <owenzhang1...@gmail.com>
>>>> wrote:
>>>>
>>>>> Thanks all for sharing your thoughts. It looks like there's a consensus
>>>>> on upgrading to Hive 4 and dropping hive-runtime.
>>>>> I've submitted a PR[1] as the first step. Please help review.
>>>>>
>>>>> 1. https://github.com/apache/iceberg/pull/11750
>>>>>
>>>>> Thanks,
>>>>> Manu
>>>>>
>>>>> On Thu, Nov 28, 2024 at 11:26 PM Shohei Okumiya <oku...@apache.org>
>>>>> wrote:
>>>>>
>>>>>> Hi all,
>>>>>>
>>>>>> I also prefer option 1. I have some initiatives[1] to improve
>>>>>> integrations between Hive and Iceberg. The current style allows us to
>>>>>> develop both Hive's core and HiveIcebergStorageHandler simultaneously.
>>>>>> That would help us enhance integrations.
>>>>>>
>>>>>> - [1] https://issues.apache.org/jira/browse/HIVE-28410
>>>>>>
>>>>>> Regards,
>>>>>> Okumin
>>>>>>
>>>>>> On Thu, Nov 28, 2024 at 4:17 AM Fokko Driesprong <fo...@apache.org>
>>>>>> wrote:
>>>>>> >
>>>>>> > Hey Cheng,
>>>>>> >
>>>>>> > Thanks for the suggestion. The nightly snapshots are available:
>>>>>> > https://repository.apache.org/content/groups/snapshots/org/apache/iceberg/iceberg-core/,
>>>>>> > which might help when working on features that are not released yet
>>>>>> > (e.g., nanosecond timestamps). Besides that, we should run RCs against
>>>>>> > Hive to check if everything works as expected.
>>>>>> >
>>>>>> > I'm leaning toward removing Hive 2 and 3 as well.
>>>>>> >
>>>>>> > Kind regards,
>>>>>> > Fokko
>>>>>> >
>>>>>> > On Wed, Nov 27, 2024 at 20:05, rdb...@gmail.com <rdb...@gmail.com>
>>>>>> > wrote:
>>>>>> >>
>>>>>> >> I think that we should remove Hive 2 and Hive 3. We already agreed
>>>>>> >> to remove Hive 2, but Hive 3 is not compatible with the project
>>>>>> >> anymore and is already EOL and will not see a release to update it
>>>>>> >> so that it can be compatible. Anyone using the existing Hive 3
>>>>>> >> support should be able to continue using older releases.
>>>>>> >>
>>>>>> >> In general, I think it's a good idea to let people use older
>>>>>> >> releases when these situations happen. It is difficult for the
>>>>>> >> project to continue to support libraries that are EOL and I don't
>>>>>> >> think there's a great justification for it, considering Iceberg
>>>>>> >> support in Hive 4 is native and much better!
>>>>>> >>
>>>>>> >> On Wed, Nov 27, 2024 at 7:12 AM Cheng Pan <pan3...@gmail.com>
>>>>>> >> wrote:
>>>>>> >>>
>>>>>> >>> That said, it would be helpful if they continue running
>>>>>> >>> tests against the latest stable Hive releases to ensure that any
>>>>>> >>> changes don't unintentionally break something for Hive, which
>>>>>> >>> would be beyond our control.
>>>>>> >>>
>>>>>> >>>
>>>>>> >>> I believe we should continue maintaining a Hive Iceberg runtime
>>>>>> >>> test suite with the latest version of Hive in the Iceberg
>>>>>> >>> repository.
>>>>>> >>>
>>>>>> >>>
>>>>>> >>> I think we can keep some basic Hive 4 tests in the Iceberg repo
>>>>>> >>>
>>>>>> >>>
>>>>>> >>> Instead of running basic tests in the Iceberg repo, maybe it makes
>>>>>> >>> more sense to let Iceberg publish daily snapshot jars to Nexus and
>>>>>> >>> have a daily CI in Hive consume those jars and run the full
>>>>>> >>> Iceberg tests?
>>>>>> >>>
>>>>>> >>> Thanks,
>>>>>> >>> Cheng Pan
>>>>>> >>>