[DISCUSS] iceberg-rust 0.2.0 release

2024-01-30 Thread Renjie Liu
Hi, everyone: iceberg-rust has been under active development for several months, and it now has several features, so I want to use this thread to discuss delivering the first release of this crate. Why this first release 0.2.0? Before iceberg-rust

Re: Proposal for REST APIs for Iceberg table scans

2024-01-30 Thread Jack Ye
+1 for having the opaque plan tasks, that's probably the most flexible way forward. And let's call them *plan tasks* going forward to standardize the terminology. I think the name of the APIs can be determined based on the actual API shape. For example, if we centralize these 2 plan and pre-plan a

Re: Partition column order in rewrite manifests

2024-01-30 Thread Jack Ye
Yes, it is sufficient at least for the use case I am talking about. -Jack On Tue, Jan 30, 2024 at 7:46 PM Renjie Liu wrote: > To be more specific, I think it's sorting by the value after > transformation? > > On Wed, Jan 31, 2024 at 11:36 AM Amogh Jahagirdar > wrote: > >> Yeah I think being ab

Re: Partition column order in rewrite manifests

2024-01-30 Thread Renjie Liu
To be more specific, I think it's sorting by the value after transformation? On Wed, Jan 31, 2024 at 11:36 AM Amogh Jahagirdar wrote: > Yeah I think being able to specify the order of the columns to sort by > when rewriting the manifests makes a lot of sense. > > On Tue, Jan 30, 2024 at 5:47 PM

Re: Partition column order in rewrite manifests

2024-01-30 Thread Amogh Jahagirdar
Yeah I think being able to specify the order of the columns to sort by when rewriting the manifests makes a lot of sense. On Tue, Jan 30, 2024 at 5:47 PM Renjie Liu wrote: > Sounds reasonable to me. > > On Wed, Jan 31, 2024 at 7:56 AM wrote: > >> Sounds like a reasonable thing to add? Maybe we

Re: Partition column order in rewrite manifests

2024-01-30 Thread Renjie Liu
Sounds reasonable to me. On Wed, Jan 31, 2024 at 7:56 AM wrote: > Sounds like a reasonable thing to add? Maybe we could check cardinality to > pick out the default order as well? > Sent from my iPhone > > On Jan 30, 2024, at 3:50 PM, Jack Ye wrote: > > > Hi everyone, > > Today, the rewrite ma

Re: Gravitino an Iceberg REST catalog service

2024-01-30 Thread Jack Ye
+1 for using test-jar! -Jack On Fri, Jan 26, 2024 at 10:48 AM Ryan Blue wrote: > I think I'd be fine exposing this through a test Jar, but it seems to me > that if we were to put it into a normal package it would turn into the > situation we want to avoid. People would use it for unintended pur

Re: Partition column order in rewrite manifests

2024-01-30 Thread russell . spitzer
Sounds like a reasonable thing to add? Maybe we could check cardinality to pick out the default order as well? Sent from my iPhone On Jan 30, 2024, at 3:50 PM, Jack Ye wrote: Hi everyone, Today, the rewrite manifest procedure always orders the data files based on their data_file.partition value. Spec

Re: [DISCUSS] Release new Iceberg docs site in the main repository

2024-01-30 Thread Jack Ye
Sorry for the late vote, +1 and thanks for the great work! -Jack On Tue, Jan 30, 2024 at 7:22 AM Eduard Tudenhoefner wrote: > +1, thanks for working on this Brian. > > On Tue, Jan 30, 2024 at 12:02 AM Ryan Blue wrote: > >> It looks like we have lazy consensus, so we'll go ahead with the >> swi

Re: [PROPOSAL] Create user mailing list ?

2024-01-30 Thread Jack Ye
+1 for having a user mailing list. Do we envision the slack bot to be used for people in slack to participate in user list conversations, or the other way around, or both? Allowing people in slack to participate in user list conversations seems pretty achievable. Allowing people in the user list

Partition column order in rewrite manifests

2024-01-30 Thread Jack Ye
Hi everyone, Today, the rewrite manifest procedure always orders the data files based on their *data_file.partition* value. Specifically, it sorts data files that have the same partition value, and then does a repartition by range based on the target number of manifest files (ref
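To make the idea in this thread concrete, here is a minimal Python sketch of sorting data files by a caller-chosen order of partition columns and then cutting the sorted list into contiguous ranges, one per target manifest. It is an illustration only, not Iceberg's actual rewrite manifests implementation: the function name plan_manifests, the dict-based file records, and the sort_columns parameter are all hypothetical.

```python
from typing import Any, Dict, List, Sequence

# Hypothetical record shape: {"path": str, "partition": {column_name: value}}
DataFile = Dict[str, Any]


def plan_manifests(
    data_files: Sequence[DataFile],
    sort_columns: Sequence[str],
    target_manifests: int,
) -> List[List[DataFile]]:
    """Sort files by the given partition column order, then split the
    sorted list into at most target_manifests contiguous ranges."""
    if target_manifests < 1:
        raise ValueError("target_manifests must be at least 1")

    # The sort key follows the caller's column order (an assumption of
    # this sketch), so changing the order changes which files end up
    # next to each other.
    ordered = sorted(
        data_files,
        key=lambda f: tuple(f["partition"][c] for c in sort_columns),
    )
    if not ordered:
        return []

    # Ceiling division so every file lands in exactly one range.
    size = -(-len(ordered) // target_manifests)
    return [ordered[i:i + size] for i in range(0, len(ordered), size)]


files = [
    {"path": "a", "partition": {"day": 2, "bucket": 1}},
    {"path": "b", "partition": {"day": 1, "bucket": 0}},
    {"path": "c", "partition": {"day": 2, "bucket": 0}},
    {"path": "d", "partition": {"day": 1, "bucket": 1}},
]

by_day = plan_manifests(files, ["day", "bucket"], 2)
by_bucket = plan_manifests(files, ["bucket", "day"], 2)
```

With day first, each range holds a single day; with bucket first, each range holds a single bucket. Which grouping prunes better depends on how the table is usually queried, which is why a configurable order is being discussed.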

Re: Spec change for multi-arg transform

2024-01-30 Thread Szehon Ho
Sorry I may have misunderstood the statement and maybe this is specific to multi-arg transform, in any case let's get a spec pr earlier in to discuss/specify behavior for V1-2 vs 3. Thanks Szehon On Tue, Jan 30, 2024 at 9:23 AM Szehon Ho wrote: > Thanks all for the discussion. > > For the speci

Re: Spec change for multi-arg transform

2024-01-30 Thread Szehon Ho
Thanks all for the discussion. For the specific point about any new transform being able to be read in current versions but only written in V3 (which I missed as well): While this is a v3 feature and must be supported for v3 compatibility, the > community usually also has guidelines for using fea

Re: [PROPOSAL] Create user mailing list ?

2024-01-30 Thread Jean-Baptiste Onofré
AFAIR, some ASF projects are using slackbot to receive users requests from the mailing list and can send messages to the mailing list. Let me do a quick research and get back to you. Regards JB On Tue, Jan 30, 2024 at 3:14 PM Brian Olsen wrote: > > I do like the idea of making the Slack threads

Re: [DISCUSS] Release new Iceberg docs site in the main repository

2024-01-30 Thread Eduard Tudenhoefner
+1, thanks for working on this Brian. On Tue, Jan 30, 2024 at 12:02 AM Ryan Blue wrote: > It looks like we have lazy consensus, so we'll go ahead with the > switch-over so we don't need to go through the old process for the 1.5.0 > release. > > Thanks to Brian for pushing this forward, and to ev

Re: [PROPOSAL] Create user mailing list ?

2024-01-30 Thread Brian Olsen
I do like the idea of making the Slack threads available through the mailing list. Is there a slack bot you have in mind? How would the threads appear in the mailing list? On Tue, Jan 30, 2024 at 7:13 AM Jean-Baptiste Onofré wrote: > Hi guys, > > If we have a few user questions on the dev mailin

[PROPOSAL] Create user mailing list ?

2024-01-30 Thread Jean-Baptiste Onofré
Hi guys, If we have a few user questions on the dev mailing list, we have quite a number on Slack. It's completely fine but not easy to search the questions and find the concrete answer. As most other Apache projects do, I propose to create a user mailing list to invite people to ask questions an