Hi Daniel,

I'll start a vote once I get the PR ready.

Hi Ryan,

Sorry, I wasn't clear in the last email that the consensus is to upgrade Hive metastore support.
Well, I was too optimistic about the upgrade. Spark only recently added Hive 4.0 metastore support, in Spark 4.0 [1], and there will be conflicts between Spark's Hive 2.3.9 and our Hive 4.0 dependencies. I'm not sure there's an upgrade path before Spark 4.0. Any ideas?

1. https://issues.apache.org/jira/browse/SPARK-45265

Thanks,
Manu

On Sat, Dec 14, 2024 at 4:31 AM rdb...@gmail.com <rdb...@gmail.com> wrote:

> Oh, I think I see. The upgrade to Hive 4 is just for the Hive metastore
> support? When I read the thread, I thought that we weren't going to change
> the metastore. That seems reasonable to me. Sorry for the confusion.
>
> On Fri, Dec 13, 2024 at 10:24 AM rdb...@gmail.com <rdb...@gmail.com>
> wrote:
>
>> Sorry, I must have missed something. I don't think that we should upgrade
>> anything in Iceberg to Hive 4. Why not simply remove the Hive support
>> entirely? Why would anyone need Hive 4 support from Iceberg when it is
>> built into Hive 4?
>>
>> On Thu, Dec 12, 2024 at 11:03 AM Daniel Weeks <dwe...@apache.org> wrote:
>>
>>> Hey Manu,
>>>
>>> I agree with the direction here, but we should probably hold a quick
>>> procedural vote just to confirm since this is a significant change in
>>> support for Hive.
>>>
>>> -Dan
>>>
>>> On Wed, Dec 11, 2024 at 5:19 PM Manu Zhang <owenzhang1...@gmail.com>
>>> wrote:
>>>
>>>> Thanks all for sharing your thoughts. It looks like there's a consensus
>>>> on upgrading to Hive 4 and dropping hive-runtime.
>>>> I've submitted a PR[1] as the first step. Please help review.
>>>>
>>>> 1. https://github.com/apache/iceberg/pull/11750
>>>>
>>>> Thanks,
>>>> Manu
>>>>
>>>> On Thu, Nov 28, 2024 at 11:26 PM Shohei Okumiya <oku...@apache.org>
>>>> wrote:
>>>>
>>>>> Hi all,
>>>>>
>>>>> I also prefer option 1. I have some initiatives[1] to improve
>>>>> integrations between Hive and Iceberg. The current style allows us to
>>>>> develop both Hive's core and HiveIcebergStorageHandler simultaneously.
>>>>> That would help us enhance integrations.
>>>>>
>>>>> - [1] https://issues.apache.org/jira/browse/HIVE-28410
>>>>>
>>>>> Regards,
>>>>> Okumin
>>>>>
>>>>> On Thu, Nov 28, 2024 at 4:17 AM Fokko Driesprong <fo...@apache.org>
>>>>> wrote:
>>>>> >
>>>>> > Hey Cheng,
>>>>> >
>>>>> > Thanks for the suggestion. The nightly snapshots are available:
>>>>> > https://repository.apache.org/content/groups/snapshots/org/apache/iceberg/iceberg-core/,
>>>>> > which might help when working on features that are not released yet
>>>>> > (e.g. nanosecond timestamps). Besides that, we should run RCs against
>>>>> > Hive to check if everything works as expected.
>>>>> >
>>>>> > I'm leaning toward removing Hive 2 and 3 as well.
>>>>> >
>>>>> > Kind regards,
>>>>> > Fokko
>>>>> >
>>>>> > On Wed, Nov 27, 2024 at 20:05, rdb...@gmail.com <rdb...@gmail.com>
>>>>> > wrote:
>>>>> >>
>>>>> >> I think that we should remove Hive 2 and Hive 3. We already agreed
>>>>> >> to remove Hive 2, but Hive 3 is not compatible with the project
>>>>> >> anymore and is already EOL and will not see a release to update it
>>>>> >> so that it can be compatible. Anyone using the existing Hive 3
>>>>> >> support should be able to continue using older releases.
>>>>> >>
>>>>> >> In general, I think it's a good idea to let people use older
>>>>> >> releases when these situations happen. It is difficult for the
>>>>> >> project to continue to support libraries that are EOL and I don't
>>>>> >> think there's a great justification for it, considering Iceberg
>>>>> >> support in Hive 4 is native and much better!
>>>>> >>
>>>>> >> On Wed, Nov 27, 2024 at 7:12 AM Cheng Pan <pan3...@gmail.com> wrote:
>>>>> >>>
>>>>> >>> That said, it would be helpful if they continue running
>>>>> >>> tests against the latest stable Hive releases to ensure that any
>>>>> >>> changes don't unintentionally break something for Hive, which
>>>>> >>> would be beyond our control.
>>>>> >>>
>>>>> >>>
>>>>> >>> I believe we should continue maintaining a Hive Iceberg runtime
>>>>> >>> test suite with the latest version of Hive in the Iceberg
>>>>> >>> repository.
>>>>> >>>
>>>>> >>>
>>>>> >>> I think we can keep some basic Hive 4 tests in the Iceberg repo.
>>>>> >>>
>>>>> >>>
>>>>> >>> Instead of running basic tests in the Iceberg repo, maybe it makes
>>>>> >>> more sense to have Iceberg publish daily snapshot jars to Nexus and
>>>>> >>> have a daily CI in Hive consume those jars and run the full Iceberg
>>>>> >>> tests?
>>>>> >>>
>>>>> >>> Thanks,
>>>>> >>> Cheng Pan
>>>>> >>>