Re: [VOTE] Release Apache Iceberg 1.9.0 RC0

2025-04-14 Thread Jean-Baptiste Onofré
Yes, agree (we can do both actually, but at least, at release prep :)). I will propose a check-legal script (as we have check-license for rat) in dev folder to check LICENSE/NOTICE content. Regards JB On Mon, Apr 14, 2025 at 11:58 AM Eduard Tudenhöfner wrote: > > I think rather than having to u

Re: Dealing with skew when write.distribution-mode=hash

2025-04-14 Thread namratha mk
Hi Ed, In the latest version of Spark (>3.5), for both hash and range distribution modes, we can control the partition size with the Spark property "spark.sql.adaptive.advisoryPartitionSizeInBytes". This helps with the small files problem. Regards, Namratha On Mon, Apr 7, 2025 at 8:44 AM Ed Mancebo

Re: Dealing with skew when write.distribution-mode=hash

2025-04-14 Thread Anton Okolnychyi
AQE in recent Spark versions should take care of any skew during writes. Make sure it is enabled and configured correctly. - Anton On Mon, Apr 14, 2025 at 13:50, namratha mk wrote: > Hi Ed, > > In the latest version of spark(>3.5), for both hash and range > distribution mode we can control the siz
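
For readers of this thread, the two settings mentioned above (AQE being enabled, and the advisory partition size) can be sketched as a small helper that builds the corresponding Spark configuration. This is only an illustration, not something posted in the thread: the function names adaptive_write_conf and to_spark_submit_args are hypothetical, and the 128 MiB value in the usage note is an arbitrary example rather than a recommendation. The two property keys are standard Spark SQL settings.

```python
def adaptive_write_conf(advisory_partition_size_bytes: int) -> dict[str, str]:
    """Return Spark SQL settings that enable AQE and set the advisory partition size.

    Hypothetical helper for illustration; only the property keys come from Spark.
    """
    if advisory_partition_size_bytes <= 0:
        raise ValueError("advisory partition size must be positive")
    return {
        # AQE must be on for the advisory size to have any effect.
        "spark.sql.adaptive.enabled": "true",
        # Target size Spark aims for when coalescing or splitting shuffle partitions.
        "spark.sql.adaptive.advisoryPartitionSizeInBytes": str(advisory_partition_size_bytes),
    }


def to_spark_submit_args(conf: dict[str, str]) -> list[str]:
    """Render a config mapping as repeated `--conf key=value` arguments."""
    args: list[str] = []
    for key, value in sorted(conf.items()):
        args += ["--conf", f"{key}={value}"]
    return args
```

For example, to_spark_submit_args(adaptive_write_conf(128 * 1024 * 1024)) yields the arguments to append to a spark-submit command line; the same mapping could instead be applied key by key when building a Spark session.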

Re: [DISCUSS] FileFormat API proposal

2025-04-14 Thread Jean-Baptiste Onofré
Hi Peter, Awesome! Thank you so much! I will do a new pass. Regards JB On Fri, Apr 11, 2025 at 3:48 PM Péter Váry wrote: > > Hi JB, > > Separated out the proposed interfaces to a new PR: > https://github.com/apache/iceberg/pull/12774. > Reviewers can check that out if they are only interested

Re: [DISCUSS] Introducing Iceberg Features ?

2025-04-14 Thread Brian Hulette
As a consumer of Iceberg metadata I think something like this might be helpful. We used approach #2 for adding partial Iceberg V2 support to BigQuery external tables, but this was more straightforward as we just had to detect the existence of delete files. With V3 we will have to be very confident

[DISCUSS] Introducing Iceberg Features ?

2025-04-14 Thread Jean-Baptiste Onofré
Hi folks, I started to work on multi-arg transforms, and you probably saw Fokko's proposal about the way to deal with source-id/source-ids to ensure backward compatibility. While working on the changes in iceberg-core/iceberg-java, I'm wondering if we should introduce Iceberg Features on met

Re: [DISCUSS] Fix CVE-2025-30065 on 1.8.x / 1.7.x / 1.6.x?

2025-04-14 Thread Ryan Blue
I agree with Fokko. It's a good idea to get a release out soon that has a fix for this, but we don't want to make unnecessary releases for things that aren't actual vulnerabilities. That's especially true in older branches, where we have reasonable guidelines for what goes in them already. It's bet

Re: [VOTE] Release Apache Iceberg 1.9.0 RC0

2025-04-14 Thread Jean-Baptiste Onofré
Another point that I saw: the kafka-connect runtime distributions (hive and main) are not published (neither on the Maven staging repository nor on dist). I guess we want to publish these distributions, right? Regards JB On Thu, Apr 10, 2025 at 12:26 AM Ajantha Bhat wrote: > > Hi Everyone, > > I pr

Re: [VOTE] Release Apache Iceberg 1.9.0 RC0

2025-04-14 Thread Eduard Tudenhöfner
I think rather than having to update the LICENSE/NOTICE files on every version bump (which might be forgotten and requires manual work) maybe we should do this once during release preparation. On Mon, Apr 14, 2025 at 11:47 AM Jean-Baptiste Onofré wrote: > -1 (non binding) > > I checked: > - sign

Re: [DISCUSS] Fix CVE-2025-30065 on 1.8.x / 1.7.x / 1.6.x?

2025-04-14 Thread Jean-Baptiste Onofré
Hi Manu, See my comments from a few days ago (in the 1.9.x release discussion): https://lists.apache.org/thread/4c4hg85c8qxq4cznp3drnyro88qp0rjr Regards JB On Sat, Apr 12, 2025 at 4:50 PM Manu Zhang wrote: > > Hi all, > > https://nvd.nist.gov/vuln/detail/CVE-2025-30065 (10.0 critical) has been >

Re: [VOTE] Release Apache Iceberg 1.9.0 RC0

2025-04-14 Thread Jean-Baptiste Onofré
-1 (non binding) I checked: - signature and checksum are OK - build works - ASF header is present in all expected files - no binary file in the source distribution - LICENSE/NOTICE in aws-bundle is not correct: netty version should be 4.1.115.Final - LICENSE/NOTICE in azure-bundle is not correct:

Re: [DISCUSS] Fix CVE-2025-30065 on 1.8.x / 1.7.x / 1.6.x?

2025-04-14 Thread Fokko Driesprong
Hey Manu, I agree, and we often see people filing tickets for unrelated security vulnerabilities that are caught by their CI system. However, doing a new release will also unnecessarily alarm folks about a vulnerability that's not there. As a side effect, upgrading 1.7.x from Parquet 1.13.1 to Par