Re: [DISCUSS] Hive Support

2024-11-21 Thread Jean-Baptiste Onofré
Hi Manu It sounds like a plan. I think it makes sense to drop Hive 2 & 3 and encourage use of Hive 4 (mostly documentation task). Regards JB On Wed, Nov 20, 2024 at 7:19 AM Manu Zhang wrote: > > Okay, let me add this option > > D. Drop Hive 2 & 3 support and suggest to use built-in Iceberg supp

Re: [VOTE] Deprecate and remove last-column-id

2024-11-21 Thread Jean-Baptiste Onofré
+1 Regards JB On Tue, Nov 19, 2024 at 9:18 AM Fokko Driesprong wrote: > > Hi everyone, > > Based on the positive feedback on the [DISCUSS] thread and the pull-request > on GitHub, I would like to raise a vote to deprecate and remove the > last-column-id field from the spec. Since this is a spe

Re: [Discuss] Proposal to Adjust Catalog Sync Schedule & Cancel Next Wednesday’s Meeting

2024-11-21 Thread Jean-Baptiste Onofré
+1 for Wednesday 9am PST every 3 weeks. Thanks ! Regards JB On Thu, Nov 21, 2024 at 12:48 AM Honah J. wrote: > > Hi everyone, > > Thank you all for your participation in the catalog community sync so far! > I'm writing to discuss changes to the meeting schedule to better fit > everyone's avail

Re: [DISCUSS] Iceberg 1.7.1 release

2024-11-21 Thread Jean-Baptiste Onofré
Hi, I think Hussein's fix is a good candidate for 1.7.1, as it addresses a bug introduced in 1.7.0. I'm +1 for a 1.7.1 that includes at least the fixes mentioned by Bryan and the fix from Hussein. Regards JB On Thu, Nov 21, 2024 at 2:00 PM Hussein Awala wrote: > > Hi Bryan, > > I think https://github.com/apache

[DISCUSS] PyIceberg 0.8.1 release

2024-11-21 Thread Fokko Driesprong
Hi everyone, I suggest following up on the PyIceberg 0.8.0 release with a patch release. Currently, we have two candidate bugfixes to be included: - An issue where it falsely emits a warning when loading a table. - Another issue

Re: [DISCUSS] REST: Way to query if metadata pointer is the latest

2024-11-21 Thread Zoltán Borók-Nagy
Hi, I agree with Gabor that the support of efficiently reloading Iceberg tables is a generic problem that applies to all catalog implementations. I also think that the programming API, especially the Iceberg Java library is very important, as almost all Iceberg clients use this library to interact

Re: [DISCUSS] REST: Way to query if metadata pointer is the latest

2024-11-21 Thread Zoltán Borók-Nagy
Sorry, one more thing about the methods: Table reloadTable(Table); // or, Table reloadTable(TableIdentifier, Table) // where Table could be NULL I want to highlight that it is super easy to provide a default implementation which just loads the table. Then later, catalog implementations can ju
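A minimal sketch of the default-implementation idea described above, assuming simplified stand-in types. `SketchTable`, `SketchCatalog`, `PointerCheckingCatalog`, and the `metadataLocation` field are hypothetical names for illustration, not the real Iceberg classes. The default `reloadTable` simply performs a full load, and one catalog overrides it to return the caller's table when the metadata pointer has not moved:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical stand-in for an Iceberg Table; only tracks its metadata pointer.
final class SketchTable {
  final String name;
  final String metadataLocation;

  SketchTable(String name, String metadataLocation) {
    this.name = name;
    this.metadataLocation = metadataLocation;
  }
}

// Hypothetical stand-in for the Catalog API.
interface SketchCatalog {
  SketchTable loadTable(String name);

  // Default behavior: ignore the caller's copy and do a full load.
  // `current` may be null when the caller has nothing cached.
  default SketchTable reloadTable(String name, SketchTable current) {
    return loadTable(name);
  }
}

// A catalog that can cheaply compare metadata pointers overrides the default.
final class PointerCheckingCatalog implements SketchCatalog {
  private final Map<String, String> pointers = new HashMap<>();
  int fullLoads = 0; // counts expensive loads, for demonstration only

  void commit(String name, String metadataLocation) {
    pointers.put(name, metadataLocation);
  }

  @Override
  public SketchTable loadTable(String name) {
    fullLoads++;
    return new SketchTable(name, pointers.get(name));
  }

  @Override
  public SketchTable reloadTable(String name, SketchTable current) {
    String latest = pointers.get(name);
    if (current != null && Objects.equals(current.metadataLocation, latest)) {
      return current; // pointer unchanged, so skip the full load
    }
    return loadTable(name);
  }
}
```

Catalogs that do not override `reloadTable` keep working in this sketch, because the default falls back to a plain `loadTable` call.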

Re: [DISCUSS] Iceberg 1.7.1 release

2024-11-21 Thread Hussein Awala
Hi Bryan, I think https://github.com/apache/iceberg/pull/11609 should also be released in 1.7.1 as it fixes a bug in Kafka Connect introduced in 1.7.0 by https://github.com/apache/iceberg/pull/11220. Hussein On Thu, Nov 21, 2024 at 3:22 AM Yufei Gu wrote: > Hi Bryan, > > This bug fix has been

Re: [DISCUSS] Hive Support

2024-11-21 Thread Péter Váry
Hi Team, Just to clarify. Hive 3 officially doesn't support Java 11, and there are no plans to release a new Hive 3 version with support. By "accident" the Hive Metastore tests are running with Hive 3 with Java 11, but the Hive runtime tests are not running (Starting the HiveServer fails, so no te

Re: [DISCUSS] PyIceberg 0.8.1 release

2024-11-21 Thread Jean-Baptiste Onofré
Hi Fokko It makes sense to me. Regards JB On Thu, Nov 21, 2024 at 9:14 AM Fokko Driesprong wrote: > > Hi everyone, > > I suggest following up on the PyIceberg 0.8.0 release with a patch release. > > Currently, we have two candidate bugfixes to be included: > > An issue where it falsely emits a

Re: Dynamic Flink Iceberg Sink

2024-11-21 Thread Péter Váry
Many Flink users support the Dynamic Sink. See: https://lists.apache.org/thread/khw0z63n34cmh2nrzrx7j9bdmzz861lb Any comments from the Iceberg community side? Jean-Baptiste Onofré wrote (on Wed, Nov 13, 2024, 14:06): > Thanks for the proposal! > I will take a look asap. > > Re

Re: [DISCUSS] REST: Way to query if metadata pointer is the latest

2024-11-21 Thread Gabor Kaszab
Hey, I think there is one open question here where we disagree: It's the proposed function on the Catalog API (not the REST spec). I don't think we can ever include a parameter like ETag at this level of abstraction. The Catalog API is common for all the catalog implementations and is not just for

Re: [VOTE] Deprecate and remove last-column-id

2024-11-21 Thread Fokko Driesprong
Hey Manu, That's an excellent question. I used the following rationale: - For the code, the iceberg-core module, a minor release deprecation cycle is required. - For the spec, I noticed that the deprecation of the

Re: Dynamic Flink Iceberg Sink

2024-11-21 Thread Ferenc Csaky
Hello devs, +1 from my side, looking at things from the Flink perspective. The Flink mailing list thread Peter linked in his previous message already has more supporters who agree this feature would be quite helpful for CDC tasks as well. Multiple users (including us) are looking f

Re: [DISCUSS] Hive Support

2024-11-21 Thread Péter Váry
I would prefer B, and only fall back to A if we find that B becomes too complicated. On Fri, Nov 22, 2024, 04:26 Manu Zhang wrote: > Hi Peter, > > Could you be more specific about which option above you prefer? > > Thanks, > Manu > > On Thu, Nov 21, 2024 at 10:07 PM Péter Váry > wrote: > >> Hi Tea

Re: [DISCUSS] Iceberg 1.7.1 release

2024-11-21 Thread Bryan Keller
Given it is a regression, I think it makes sense to fix it for 1.7.1. We'll try to get it in! -Bryan > On Nov 21, 2024, at 5:00 AM, Hussein Awala wrote: > > Hi Bryan, > > I think https://github.com/apache/iceberg/pull/11609 should also be released > in 1.7.1 as it fixes a bug in Kafka Connec

Re: [DISCUSS] Deprecate embedded manifests

2024-11-21 Thread Steve Zhang
+1 to deprecate Thanks, Steve Zhang > On Nov 19, 2024, at 3:32 AM, Fokko Driesprong wrote: > > Hi everyone, > > I would like to propose to deprecate embedded manifests > . This has been used before the > manifest-list was introduced, but I don

Re: [Discuss] Proposal to Adjust Catalog Sync Schedule & Cancel Next Wednesday’s Meeting

2024-11-21 Thread rdb...@gmail.com
+1 for every 3 weeks instead of 2 out of 3. On Thu, Nov 21, 2024 at 10:57 AM Dmitri Bourlatchkov wrote: > Thanks for keeping track of this, Honah! > > +1 to keep the Wednesday 9 AM Pacific Time meeting every 3 weeks > > I'm ok to pause the 8 PM PST meeting - this time does not work for me > pers

Re: [VOTE] Deprecate and remove last-column-id

2024-11-21 Thread rdb...@gmail.com
+1 On Thu, Nov 21, 2024 at 5:22 AM Jean-Baptiste Onofré wrote: > +1 > > Regards > JB > > On Tue, Nov 19, 2024 at 9:18 AM Fokko Driesprong wrote: > > > > Hi everyone, > > > > Based on the positive feedback on the [DISCUSS] thread and the > pull-request on GitHub, I would like to raise a vote to

[VOTE] Release Apache Iceberg 1.7.1 RC1

2024-11-21 Thread Bryan Keller
Hi Everyone, I propose that we release the following RC as the official Apache Iceberg 1.7.1 release. The commit ID is 4a432839233f2343a9eae8255532f911f06358ef * This corresponds to the tag: apache-iceberg-1.7.1-rc1 * https://github.com/apache/iceberg/commits/apache-iceberg-1.7.1-rc1 * https://

Re: [DISCUSS] Hive Support

2024-11-21 Thread Manu Zhang
Hi Peter, Could you be more specific about which option above you prefer? Thanks, Manu On Thu, Nov 21, 2024 at 10:07 PM Péter Váry wrote: > Hi Team, > > Just to clarify. Hive 3 officially doesn't support Java 11, and there are > no plans to release a new Hive 3 version with support. > By "acci

Re: [Discuss] Simplify tableExists API in HiveCatalog

2024-11-21 Thread Szehon Ho
Hi, It's a good performance find and improvement. I left some comments on the PR. IMO, the new behavior matches the API javadoc ("Check whether table exists") more closely, since the javadoc is not about whether the table is corrupted, so I'm supportive of it. Thanks Szehon On Thu, Nov 21, 2024 at 10:57 AM Steve Zhang wrote

Re: [VOTE] Release Apache Iceberg 1.7.1 RC1

2024-11-21 Thread Yufei Gu
Hi Bryan, This link seems broken, https://dist.apache.org/repos/dist/dev/iceberg/KEYS. Should we use another one, like the one in here https://downloads.apache.org/iceberg/KEYS? Yufei On Thu, Nov 21, 2024 at 2:36 PM Bryan Keller wrote: > Hi Everyone, > > I propose that we release the followin

Re: [DISCUSS] REST: Way to query if metadata pointer is the latest

2024-11-21 Thread Taeyun Kim
Hi, - On the Function: The function signature I propose is as follows (slightly modified from my previous suggestion): Option(Table, Option(VersionIdentifier)) loadTableIfChanged(TableIdentifier, Option(VersionIdentifier)) The key difference from Gabor’s proposal is that the caller manages th
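The proposed signature can be sketched in Java as follows, assuming simplified stand-ins. `LoadedTable`, `VersionedCatalogSketch`, the string-based version identifier, and the commit counter are hypothetical and chosen only for illustration. The caller passes in the version it last saw and receives an empty result when nothing has changed:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical stand-in for the (Table, Option(VersionIdentifier)) pair in the proposal.
final class LoadedTable {
  final String metadata;          // stand-in for an Iceberg Table
  final Optional<String> version; // stand-in for VersionIdentifier

  LoadedTable(String metadata, Optional<String> version) {
    this.metadata = metadata;
    this.version = version;
  }
}

// Hypothetical catalog that derives a version identifier from a commit counter.
final class VersionedCatalogSketch {
  private final Map<String, String> metadataByTable = new HashMap<>();
  private final Map<String, Integer> commitCounts = new HashMap<>();

  void commit(String table, String metadata) {
    metadataByTable.put(table, metadata);
    commitCounts.merge(table, 1, Integer::sum);
  }

  // Returns empty when the caller's version is still current; otherwise the
  // table together with the version the caller should remember for next time.
  Optional<LoadedTable> loadTableIfChanged(String table, Optional<String> knownVersion) {
    String current = "v" + commitCounts.getOrDefault(table, 0);
    if (knownVersion.isPresent() && knownVersion.get().equals(current)) {
      return Optional.empty();
    }
    return Optional.of(new LoadedTable(metadataByTable.get(table), Optional.of(current)));
  }
}
```

In this sketch the version is opaque to the caller, who only stores it and hands it back on the next call.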

Re: [Discuss] Simplify tableExists API in HiveCatalog

2024-11-21 Thread Kevin Liu
Hi Steve, This makes sense to me. The semantics of `tableExists` focus on whether a table's name exists in the catalog. For the Hive catalog, checking the HMS entry should be sufficient. I do have a question about usage, though. Typically, I would use `tableExists` like this: ``` if (!tableExis

Re: [VOTE] Release Apache Iceberg 1.7.1 RC1

2024-11-21 Thread Xuanwo
Yes, let's consistently use https://downloads.apache.org/iceberg/KEYS as the definitive source for our KEYS. On Fri, Nov 22, 2024, at 13:36, Yufei Gu wrote: > Hi Bryan, > > This link seems broken, https://dist.apache.org/repos/dist/dev/iceberg/KEYS. > Should we use another one, like the one in

Re: [DISCUSS] PyIceberg 0.8.1 release

2024-11-21 Thread Kevin Liu
Thanks for starting this thread! Along with the 2 issues listed above, I propose this issue as well * Ignore tables with missing table_type parameter in HMS and Glue (#1331) Best, Kevin Liu On Thu, Nov 21, 2024 at 5:18 AM Jean-Baptiste Onofr

Re: [DISCUSS] Deprecate embedded manifests

2024-11-21 Thread rdb...@gmail.com
Can we safely deprecate and remove this? The manifest list is required in v2, but the spec has stated for a long time that v1 tables can use manifests rather than a manifest list. It’s unlikely, but it would be valid for other implementations to produce it. I would understand if other implementati

Re: [DISCUSS] Deprecate embedded manifests

2024-11-21 Thread Szehon Ho
+1, great to have fewer possible paths. Thanks Szehon On Thu, Nov 21, 2024 at 10:33 AM Steve Zhang wrote: > +1 to deprecate > > Thanks, > Steve Zhang > > > > On Nov 19, 2024, at 3:32 AM, Fokko Driesprong wrote: > > Hi everyone, > > I would like to propose to deprecate embedded manifests >

[Discuss] Simplify tableExists API in HiveCatalog

2024-11-21 Thread Steve Zhang
Hi Iceberger, I have a proposal to simplify the tableExists API in the Hive catalog, which involves a behavior change, and I’d like to hear your thoughts. Currently, in our catalog interface[1], the tableExists method is implemented as a default API by invoking the loadTable method. It retu
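To illustrate the behavior change under discussion, here is a rough sketch using hypothetical stand-in types. `ExistsCatalog`, `MetastoreBackedCatalog`, `MissingTableException`, and the `entryOnly` flag are invented for this example and are not the real Iceberg or Hive classes. The default `tableExists` goes through `loadTable`, while the simplified variant only checks for the metastore entry:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a "no such table" error.
class MissingTableException extends RuntimeException {
  MissingTableException(String name) {
    super("No such table: " + name);
  }
}

// Hypothetical stand-in for the catalog interface.
interface ExistsCatalog {
  String loadTable(String name);

  // Default: existence is decided by attempting a full load.
  default boolean tableExists(String name) {
    try {
      loadTable(name);
      return true;
    } catch (MissingTableException e) {
      return false;
    }
  }
}

final class MetastoreBackedCatalog implements ExistsCatalog {
  private final Map<String, String> entries = new HashMap<>(); // table name -> metadata location
  private final Map<String, String> files = new HashMap<>();   // metadata location -> content
  private final boolean entryOnly; // true = simplified check on the metastore entry only

  MetastoreBackedCatalog(boolean entryOnly) {
    this.entryOnly = entryOnly;
  }

  // A null content models metadata that cannot be read.
  void register(String name, String location, String content) {
    entries.put(name, location);
    if (content != null) {
      files.put(location, content);
    }
  }

  @Override
  public String loadTable(String name) {
    String location = entries.get(name);
    if (location == null) {
      throw new MissingTableException(name);
    }
    String content = files.get(location);
    if (content == null) {
      throw new IllegalStateException("Cannot read metadata: " + location);
    }
    return content;
  }

  @Override
  public boolean tableExists(String name) {
    return entryOnly ? entries.containsKey(name) : ExistsCatalog.super.tableExists(name);
  }
}
```

In this sketch, a table whose metadata cannot be read makes the load-based check fail with an exception, whereas the entry-only check reports that the table exists and avoids reading metadata at all.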

Re: [Discuss] Proposal to Adjust Catalog Sync Schedule & Cancel Next Wednesday’s Meeting

2024-11-21 Thread Dmitri Bourlatchkov
Thanks for keeping track of this, Honah! +1 to keep the Wednesday 9 AM Pacific Time meeting every 3 weeks I'm ok to pause the 8 PM PST meeting - this time does not work for me personally. As for two time slots recurring every 6 weeks each, IMHO, if people in those meetings end up being distinct