I definitely agree that we should ask the bigger Hive contributors for their input. I've asked some questions and tagged pvary and marton-bod on the GitHub issue to see whether they've encountered this, and for how long, or have any knowledge or insight.
As Ryan said, there are some very important correctness fixes that need to go out as soon as possible. I'm also of the belief that we can absolutely cut another patch release as soon as one is needed. I would be happy to volunteer to be the RM for 0.12.x, though whether that's appropriate is up to Ryan, since I can't actually run the final steps and so I'm roping him into it too =)

I don't think we should block on this issue, as we don't have a fix and I haven't heard of anybody else complaining about it. I agree that it sounds important, but there are other correctness concerns that at the very least merit a release candidate or the beginning of cherry-picking.

I just want to emphasize that if the bug already exists in 0.12, then it's not necessarily a reason to block 0.12.1: we can still include the fix if it lands in time, and otherwise we can cut a 0.12.2 release pretty easily. It's definitely not a reason to block the process of getting the release candidate ready (it's my first time doing this for Iceberg), as that will possibly take a few days, so there's time to work on it. And if the bug is fixed before 0.12.1 is fully released (after a few days of prep work, as I'm new to doing releases for us, plus a vote on at least RC0, if not RC1, RC2, etc.), then the fix can probably be included.

So I am 100% in support of working to solve this, but I don't necessarily see it as a required block on the process or even on the release candidate. We can make another patch release pretty easily if we feel the need once it's fixed (which I imagine we will, but this is also the first report of this bug that I've heard of, though others could certainly exist). The bulk of the work for making a patch release in a series of patch releases is in the first patch, .0 to .1 (at least for us). So releasing a 0.12.2, if it really comes to that, wouldn't be too hard.
However, the bug that really drove the need to cut a 0.12.1 release has affected a large number of users in production, who don't normally run forks and have reported that they've had to as a workaround. So I don't think the two goals are mutually exclusive. I recognize the likely high importance of dealing with multi-table joins in Hive (I'll investigate, but will hopefully wait for the Hive folks to chime in), but I also think that waiting an extra week or two is really not the best idea when we can easily cut another release once we have a fix. There are several reports of a high-priority bug that is affecting a number of users' regular workloads.

That's my opinion anyway. And I have volunteered to be the release manager, so I can say that if need really be, we can cut a release (though we might reasonably wait a bit to get more feedback). Or rather, I'll do the work that I can for cutting it (a committer needs to sign, but there's a script for that which I've tested). If many people, or the community at large, find it a reason to block the 0.12.1 release, I wouldn't stand in their way. I think, if anything, until there's a fix, I'll continue working on getting 0.12.1 ready (the fix would be one of the later ones to be applied, since the PRs are cherry-picked in order).

Hopefully I didn't come across as too harsh or too strong; I want to emphasize I'm just the release manager. But that's how I see things currently: we don't need to stop work on one to investigate a new issue. Either way, we'll still need the work we're doing now. And then if people really feel strongly, they can vote to wait. But I think that a number of users would like the problem they've reported to be fixed so they don't have to fork anymore. Sometimes those of us who do fork forget that it's a real pain, especially if it's not part of your normal workflow.
Thanks,
Kyle

On Wed, Oct 27, 2021 at 8:50 AM Ryan Blue <b...@tabular.io> wrote:

> I'm not sure that #3393 is necessarily something that we should wait for.
> If it gets in soon, I'd be all for including it. But there are some
> important correctness fixes going into 0.12.1 for delete file commits and
> I'd like to get those out as soon as possible.
>
> It looks like this bug affects Hive and is a failure, not a correctness
> problem. I would probably opt to continue with 0.12.1 and follow up with
> 0.12.2 once this is fixed if we think that it is affecting enough people
> that a patch release is warranted. And if we don't think that a patch
> release for this is needed, then I think that makes it less important to
> get it into 0.12.1.
>
> What does everyone else think? Should we wait for this Hive fix?
>
> On Wed, Oct 27, 2021 at 3:17 AM OpenInx <open...@gmail.com> wrote:
>
>> I think we will need to fix this critical iceberg bug before we release
>> the 0.12.1: https://github.com/apache/iceberg/issues/3393 . Let's mark
>> it as a blocker for the 0.12.1.
>>
>> On Fri, Oct 22, 2021 at 3:22 AM Kyle Bendickson <k...@tabular.io> wrote:
>>
>>> Thank you everybody for the additional PRs brought up so far.
>>>
>>> I’ve volunteered to be release manager, so will be doing my best to go
>>> through and ensure these are prioritized for consideration (if some are
>>> truly new features they might need to wait for 0.13.0, but as I’m just the
>>> release manager that will be more up to the community).
>>>
>>> If any committers or contributors have free cycles and are willing to
>>> review some of these PRs, that would be greatly appreciated!
>>>
>>> - Kyle Bendickson [@kbendick]
>>>
>>> On Thu, Oct 21, 2021 at 11:19 AM Peter Vary <pv...@cloudera.com.invalid>
>>> wrote:
>>>
>>>> Just to make this clean https://github.com/apache/iceberg/pull/3338 fixes
>>>> the issue caused by https://github.com/apache/iceberg/pull/2565. The
>>>> fix will make Catalogs.loadCatalog consistent with Catalogs.hiveCatalog,
>>>> and fixing create table issues when no catalog is set in the config
>>>>
>>>> On 2021. Oct 21., at 16:59, Peter Vary <pv...@cloudera.com> wrote:
>>>>
>>>> I would like to have this in 0.12.1:
>>>> https://github.com/apache/iceberg/pull/3338
>>>>
>>>> This breaks Hive queries, if no catalog is set, but this still needs to
>>>> be reviewed before merge.
>>>>
>>>> Thanks, Peter
>>>>
>>>>
>>>> On Thu, 21 Oct 2021, 07:12 Rajarshi Sarkar, <rsarkar...@gmail.com>
>>>> wrote:
>>>>
>>>>> Hope this can get in: https://github.com/apache/iceberg/pull/3175
>>>>>
>>>>> Regards,
>>>>> Rajarshi Sarkar,
>>>>>
>>>>>
>>>>> On Thu, Oct 21, 2021 at 9:08 AM Cheng Pan <cheng...@apache.org> wrote:
>>>>>
>>>>>> Hope this can get in.
>>>>>> https://github.com/apache/iceberg/pull/3203
>>>>>>
>>>>>> Thanks,
>>>>>> Cheng Pan
>>>>>>
>>>>>>
>>>>>> On Thu, Oct 21, 2021 at 11:34 AM Reo Lei <leinuo...@gmail.com> wrote:
>>>>>>
>>>>>>> Thanks Kyle for syncing this!
>>>>>>>
>>>>>>> I think PR#3240 should be included in this release. Because in our
>>>>>>> Dingding group, we have received feedback from many flink users that
>>>>>>> they encountered this problem. I think this PR is very important and
>>>>>>> we need to fix this problem ASAP.
>>>>>>>
>>>>>>> link: https://github.com/apache/iceberg/pull/3240
>>>>>>>
>>>>>>> BR,
>>>>>>> Reo LEI
>>>>>>>
>>>>>>> On Thu, Oct 21, 2021 at 2:52 AM Kyle Bendickson <k...@tabular.io> wrote:
>>>>>>>
>>>>>>>> As mentioned in today's community sync up, we're planning on
>>>>>>>> releasing a new point version of Iceberg - Apache Iceberg 0.12.1.
>>>>>>>>
>>>>>>>> If there are any outstanding bugs you'd like to include fixes for
>>>>>>>> or other minor patches, please respond to this email thread letting us
>>>>>>>> know.
>>>>>>>>
>>>>>>>> The current list of patches to be included can be found in the
>>>>>>>> milestone on Github:
>>>>>>>> https://github.com/apache/iceberg/milestone/15?closed=1
>>>>>>>>
>>>>>>>> As new items are added, they will be included in the milestone.
>>>>>>>>
>>>>>>>> Best,
>>>>>>>> Kyle Bendickson [ Github: @kbendick ]
>>>>>>>>
>>>>>>>
>>>>
>
> --
> Ryan Blue
> Tabular
>