Re: [VOTE] Release Apache PyIceberg 0.6.1rc1

2024-04-04 Thread Jean-Baptiste Onofré
Hi Justin, I think you are right: some 3rd parties (including Apache projects) are missing in the NOTICE file (you and I already mentioned that in a previous release). We should at least mention this. I pointed to the Apache Karaf NOTICE as an example. I propose to not block releases due to that (as it's li

Re: [VOTE] Release Apache PyIceberg 0.6.1rc1

2024-04-04 Thread Justin Mclean
Hi, If you were wondering where all of this comes from, it is from section 4/4d of the Apache license. Kind Regards, Justin

Re: [VOTE] Release Apache PyIceberg 0.6.1rc1

2024-04-04 Thread Justin Mclean
Hi, See [1] for why that NOTICE copyright line should be considered - "though the ASF copyright line and any other portions of NOTICE must be considered for propagation." Kind Regards, Justin 1. https://infra.apache.org/licensing-howto.html#bundle-asf-product

Re: [VOTE] Release Apache PyIceberg 0.6.1rc1

2024-04-04 Thread Justin Mclean
Hi, You may not be aware that I have voted on 1,000+ releases and helped refine and improve ASF policy on this over the years. I also hold a number of other relevant ASF roles. > There are two main issues with the presented arguments: > > 1. This isn't a bundled dependency, it is an attributio

Re: [VOTE] Release Apache PyIceberg 0.6.1rc1

2024-04-04 Thread Daniel Weeks
There are two main issues with the presented arguments: 1. This isn't a bundled dependency, it is an attribution of a code snippet taken from another project 2. There is nothing in the NOTICE that would qualify as "relevant portions [to be] bubbled up" You seem to be asserting that this is both a

Re: [VOTE] Release Apache PyIceberg 0.6.1rc1

2024-04-04 Thread Justin Mclean
Hi, Also note that the comment you linked to also includes "Aside from Apache-licensed dependencies which supply NOTICE files of their own, it is uncommon for a dependency to require additions to NOTICE." In this case, you do have Apache-licensed dependencies that do supply a NOTICE file. Ki

Re: [VOTE] Release Apache PyIceberg 0.6.1rc1

2024-04-04 Thread Justin Mclean
Hi, > The ASF recommendation also clearly states: "Under normal circumstances, > there is no need to modify NOTICE to mention a bundled dependency." If you read that document carefully, it states this: - Under normal circumstances, there is no need to modify NOTICE to mention a bundled dependen

Re: [VOTE] Release Apache PyIceberg 0.6.1rc1

2024-04-04 Thread Daniel Weeks
+1 (binding) Verified sigs/sums/license/build/test (Python 3.11.6) All checks out, -Dan On Thu, Apr 4, 2024 at 4:39 PM Hussein Awala wrote: > +1 (non-binding) > > - Verified signatures, checksums, and license > - Tested creating and reading a non-partitioned table with the Glue catalog > > On

Re: [VOTE] Release Apache PyIceberg 0.6.1rc1

2024-04-04 Thread Daniel Weeks
Justin, We addressed these questions with regard to LICENSE and NOTICE files in the last release. This comment explains it well, which is why the NOTICE changes were reverted. The ASF recommendation

Re: [VOTE] Release Apache PyIceberg 0.6.1rc1

2024-04-04 Thread Hussein Awala
+1 (non-binding) - Verified signatures, checksums, and license - Tested creating and reading a non-partitioned table with the Glue catalog On Fri, Apr 5, 2024 at 12:44 AM Justin Mclean wrote: > Hi, > > Thanks for that. I don't understand "we don't bundle the code, but just > took some part of i

Re: [VOTE] Release Apache PyIceberg 0.6.1rc1

2024-04-04 Thread Justin Mclean
Hi, Thanks for that. I don't understand "we don't bundle the code, but just took some part of it". Either the code is in the source release or not in the source release; if any part of it is in the source release, then it is bundled. The LICENSE and NOTICE files need to relate to what is includ

Re: truncate transform over binary columns

2024-04-04 Thread Brian Hulette
> > Quick question: do you actually have an issue with truncate on binary > columns ? No issue - as a consumer of Iceberg metadata I'd just like to clarify if we should expect to see partition fields with truncated binary. I was initially coding against the spec and planned to reject a truncate

Re: [VOTE] Release Apache PyIceberg 0.6.1rc1

2024-04-04 Thread Drew
+1 (non-binding) - verified signature and checksum are OK - verified RAT license check is OK - ran install, test, and test-s3 in python 3.11 - ran some manual tests with GlueCatalog Looks good, thanks Honah!! - Drew

Re: [VOTE] Release Apache PyIceberg 0.6.1rc1

2024-04-04 Thread Fokko Driesprong
+1 (binding) - Checked the signature and the checksum - Ran the example notebooks against 0.6.1rc1 - Did some checks locally and looks all good! Thanks Honah for running the release! Kind regards, Fokko Op do 4 apr 2024 om 17:56 schr

Re: Time-based partitioning on long column type

2024-04-04 Thread Jean-Baptiste Onofré
Ah yes, milestone is fine. Thanks ! All good. Regards JB On Thu, Apr 4, 2024 at 5:12 PM Eduard Tudenhoefner wrote: > > There is the V3 Spec milestone where it's tracked (amongst other things). > > On Thu, Apr 4, 2024 at 9:44 AM Jean-Baptiste Onofré wrote: >> >> Hi Eduard, >> >> Thanks for the

Re: [VOTE] Release Apache PyIceberg 0.6.1rc1

2024-04-04 Thread Honah J.
Hi Justin, Thanks for reviewing the release. There were some discussions about the NOTICE file in the 0.6.0 release: PR#410 and PR#413 . Here are the reasons why the following projects have not b

Re: Time-based partitioning on long column type

2024-04-04 Thread Eduard Tudenhoefner
There is the V3 Spec milestone where it's tracked (amongst other things). On Thu, Apr 4, 2024 at 9:44 AM Jean-Baptiste Onofré wrote: > Hi Eduard, > > Thanks for the update ! It makes sense to me. > > Maybe a GH label with spec or v3_spec would hel

Re: [PROPOSAL] Improvement on our PR flows

2024-04-04 Thread Jean-Baptiste Onofré
If you are interested, I have a presentation: how Apache works :) I would be more than happy to join with my "member/director/contributor" hat :) Regards JB On Thu, Apr 4, 2024 at 4:45 PM Brian Olsen wrote: > > That seems like a good start. I do agree there needs to be a better way to > promot

Re: [PROPOSAL] Improvement on our PR flows

2024-04-04 Thread Brian Olsen
That seems like a good start. I do agree there needs to be a better way to promote engagement among other members. Perhaps I can do my next LinkedIn show describing the review process, how Apache works, how to get started, and what NOT to do when submitting a PR. This will likely translate into a

Re: Materialized view integration with REST spec

2024-04-04 Thread Jean-Baptiste Onofré
Just to clarify: I think we have a consensus on the two possible options. So the vote could be helpful to have a consensus about which option. Anyway, we still have discussions going on on this topic :) Regards JB On Wed, Apr 3, 2024 at 10:02 PM Ryan Blue wrote: > > If there is consensus, great

Re: truncate transform over binary columns

2024-04-04 Thread Jean-Baptiste Onofré
Hi Brian, Welcome to this list :) Quick question: do you actually have an issue with truncate on binary columns ? The Truncate transform (in Iceberg API) supports BINARY using TruncateByteBuffer, so it would make sense to clearly state this in the spec. Regards JB On Thu, Apr 4, 2024 at 2:45 AM

Re: [PROPOSAL] Improvement on our PR flows

2024-04-04 Thread Jean-Baptiste Onofré
Hi Brian, Yeah, I agree with your points. That's why I would like to create a PR as a discussion base (that we can update thanks to everyone's comments). 1. I think we already have a consensus about "stale issue/PR" reminder. 2. The concern is more about "assign/reviewer list". Rethinking this po

Re: [PROPOSAL] Improvement on our PR flows

2024-04-04 Thread Brian Olsen
I think you both (JB and Ryan) have valid points. JB, there absolutely is a need to address the scalability issue and we need to come up with a solution. I doubt there’s any disagreement that rising stale issues in the project should not be ignored. Ryan’s concern also has merit from a different angle

Re: [PROPOSAL] Improvement on our PR flows

2024-04-04 Thread Jean-Baptiste Onofré
Anyway, I'm preparing a PR to illustrate the proposal. Regards JB On Thu, Apr 4, 2024 at 10:59 AM Ajantha Bhat wrote: > > Additionally, I propose allocating a brief 5-10 minute segment during each > Iceberg community sync. > During this time, attendees can highlight any pull requests needing at

Re: [PROPOSAL] Improvement on our PR flows

2024-04-04 Thread Ajantha Bhat
Additionally, I propose allocating a brief 5-10 minute segment during each Iceberg community sync. During this time, attendees can highlight any pull requests needing attention. In cases where a pull request has become stagnant due to a lack of reviews, committers can step forward to offer assistan

Re: [VOTE] Release Apache PyIceberg 0.6.1rc1

2024-04-04 Thread Justin Mclean
Hi, I took a look at this, and the NOTICE file doesn't include the required information from the included Apache projects' NOTICE files [1]. Kind Regards, Justin 1. https://infra.apache.org/licensing-howto.html#alv2-dep

Re: [PROPOSAL] New REST Catalog Spec

2024-04-04 Thread Jean-Baptiste Onofré
Hi folks, Quick update: first of all, thanks a lot for your feedback in the doc ! It's great to see such interest in this proposal. According to your comments, I'm reworking the document, grouping in three main topics: - REST Catalog changes - Table/View Spec impacts - Related topics I started l

Re: Time-based partitioning on long column type

2024-04-04 Thread Jean-Baptiste Onofré
Hi Eduard, Thanks for the update ! It makes sense to me. Maybe a GH label with spec or v3_spec would help to see what is planned for v3 ? Regards JB On Thu, Apr 4, 2024 at 9:36 AM Eduard Tudenhoefner wrote: > > Type promotion from Long to Timestamp is on the roadmap for the V3 Spec, so > that

Re: [VOTE] Release Apache PyIceberg 0.6.1rc1

2024-04-04 Thread Jean-Baptiste Onofré
+1 (non-binding) I checked: - Signatures and hashes are OK - ASF header is present (NB: I will create a PR to update to RAT 0.16.1, which includes some improvements/fixes on the check), PKG-INFO doesn't contain ASF header, but it's OK. - No binary found in the source distribution Thanks ! Regards J

Re: Time-based partitioning on long column type

2024-04-04 Thread Eduard Tudenhoefner
Type promotion from Long to Timestamp is on the roadmap for the V3 Spec, so that would be the preferred way. On Wed, Apr 3, 2024 at 10:38 AM Jean-Baptiste Onofré wrote: > Hi Manu > > TIMESTAMP_LONG type promotion could be the easiest way, it would