Re: [DISCUSS] PyIceberg 0.7.1 release

2024-08-08 Thread Fokko Driesprong
Thanks everyone for the input here, and I agree that the aforementioned #995 and #997 by Sung, and #526 by André would also be good to includ

Re: [Discussion] Versioned SQL UDFs (Catalog routines) in Iceberg

2024-08-08 Thread Piotr Findeisen
Hi, Walaa, thanks for asking! In the design doc linked before in this thread [1] I read "Without a common standard, the UDFs are hard to share among different engines." ("Background and Motivation" section). I agree with this statement. I don't fully understand yet how the proposed design address

Re: [DISCUSS] Changing namespace separator in REST spec

2024-08-08 Thread Eduard Tudenhöfner
I opened #10877 a few days ago to make the namespace separator configurable and let servers communicate to clients which separator should be used. Worth mentioning that this doesn't require any spec change and it is backwards compatible with older c

Re: [DISCUSS] PyIceberg: Remove optional support for instance-level identifier in Catalog and Table APIs

2024-08-08 Thread Fokko Driesprong
Hey Sung, Thanks for raising this. This has also been on my list for a very long time, but I was reluctant to do it because of the incompatible change, as you already mentioned. However, I think it is good to remove this sooner rather than later. I just went over the PR

Re: [DISCUSS] Changing namespace separator in REST spec

2024-08-08 Thread Fokko Driesprong
Thanks Eduard for raising all the PRs. I like the approach of the server-side configuration, that way the catalog is in charge of providing a character that's suitable for them. Kind regards, Fokko Op do 8 aug 2024 om 12:27 schreef Eduard Tudenhöfner < etudenhoef...@apache.org>: > I've

Re: [Discussion] Versioned SQL UDFs (Catalog routines) in Iceberg

2024-08-08 Thread Fokko Driesprong
Coming from PyIceberg, I have concerns as this proposal focuses on SQL-based engines, while Python-based systems often work with data frames. Adding imperative languages like Python would make this proposal more inclusive. Kind regards, Fokko Op do 8 aug 2024 om 10:27 schreef Piotr Findeisen :

Re: [DISCUSS] PyIceberg 0.7.1 release

2024-08-08 Thread André Luis Anastácio
I fixed an overwrite error that, I think, would be good to include in the 0.7.1 release https://github.com/apache/iceberg-python/pull/1023 André Anastácio On Thursday, August 8th, 2024 at 4:29 AM, Fokko Driesprong wrote: > Thanks everyone for the input here, and I agree that the aforementione

Re: [DISCUSS] Changing namespace separator in REST spec

2024-08-08 Thread Dmitri Bourlatchkov
Thanks, Eduard, for 10877 [1] and addressing my concerns quickly :) This approach is fine from my POV, although I personally prefer the flexibility of what Jack proposed. Cheers, Dmitri. [1] https://github.com/apache/iceberg/pull/10877 On Thu, Aug 8, 2024 at 6:28 AM Eduard Tudenhöfner wrote:

Re: [DISCUSS] Changing namespace separator in REST spec

2024-08-08 Thread Eduard Tudenhöfner
@Dmitri thanks for the quick review. The query param is actually being handled in #10904 (spec) and in #10905 (impl) and is just a change on top of #10877. E

[DISCUSS] Release Avro Java 1.11.4

2024-08-08 Thread Fokko Driesprong
Hi everyone, In light of the recent discussion of releasing artifacts separately [1], I would like to discuss releasing Java 1.11.4. Since Java 1.12.0 only supports JDK11+ I think it is important to also do a release of Java (which includes several CVE patches). I would like to hear if there are a

Re: [DISCUSS] Release Avro Java 1.11.4

2024-08-08 Thread Ryan Blue
+1 for releasing Avro Java separately. On Thu, Aug 8, 2024 at 8:28 AM Fokko Driesprong wrote: > Hi everyone, > > In light of the recent discussion of releasing artifacts separately [1]. I > would like to discuss releasing Java 1.11.4. Since Java 1.12.0 only > supports JDK11+ I think it is import

Re: [DISCUSS] Release Avro Java 1.11.4

2024-08-08 Thread Xuanwo
+1 non-binding I support to do release separately which can reduce our verify burden. On Fri, Aug 9, 2024, at 00:26, Ryan Blue wrote: > +1 for releasing Avro Java separately. > > On Thu, Aug 8, 2024 at 8:28 AM Fokko Driesprong wrote: >> Hi everyone, >> >> In light of the recent discussion of r

Re: [DISCUSS] Release Avro Java 1.11.4

2024-08-08 Thread Fokko Driesprong
Hey all, I accidentally sent this to the Iceberg devlist instead of Avro. Thanks Steven for pinging me. Sorry for the noise! Kind regards, Fokko Op do 8 aug 2024 om 18:33 schreef Xuanwo : > +1 non-binding > > I support to do release separately which can reduce our verify burden. > > On Fri, Aug

Re: [Discussion] Versioned SQL UDFs (Catalog routines) in Iceberg

2024-08-08 Thread Dmitri Bourlatchkov
I do not think the spec is meant to allow only SQL representations, although it is certainly favouring SQL in examples... It would be nice to add a non-SQL example, indeed. Cheers, Dmitri. On Thu, Aug 8, 2024 at 9:00 AM Fokko Driesprong wrote: > Coming from PyIceberg, I have concerns as this p

Re: [Discussion] Versioned SQL UDFs (Catalog routines) in Iceberg

2024-08-08 Thread Ryan Blue
Right now, SQL is an explicit requirement of the spec. It leaves a way for future versions to add different representations later, but only SQL is supported. That was also the feedback to my initial skepticism about how it would work to add functions. On Thu, Aug 8, 2024 at 12:44 PM Dmitri Bourlat

Re: [Discussion] Versioned SQL UDFs (Catalog routines) in Iceberg

2024-08-08 Thread Dmitri Bourlatchkov
The UDF spec does not require representations to be SQL. It merely does not specify (in this revision) how other representations are to be written. This seems like an easy extension (adding a new type in the "Representations" section). Cheers, Dmitri. On Thu, Aug 8, 2024 at 3:47 PM Ryan Blue wr

Re: [DISCUSS] Iceberg 1.6.1 release

2024-08-08 Thread Fokko Driesprong
Hey Piotr, We had some delays with the Avro 1.12.0 release, mostly because all the languages were released at once. On the Avro devlist, I suggested releasing 1.11.4 just for Java because of the CVE. Realistically this would be around 1-2 weeks. Does that sound reasonable? Kind regards, Fokko Op

[DISCUSS] Materialized Views: Lineage and State information

2024-08-08 Thread Walaa Eldin Moustafa
Hi Everyone, In the last community sync on Materialized Views [1], we agreed to split the information that is used to determine the materialized view staleness into two parts: Lineage Information and State Information. We have made a lot of progress on representing both but one issue remains open:

Re: [DISCUSS] PyIceberg 0.7.1 release

2024-08-08 Thread Sung Yun
Thank you for reporting the issues and putting in the fixes, Fokko and André. We also identified a correctness issue with applying positional deletes on merge-on-read tables that I think also must be included in this release. Here's the PR that resolves the issue: https://github.com/apache/iceber

Re: [DISCUSS] Materialized Views: Lineage and State information

2024-08-08 Thread Benny Chow
Maybe a third option is to decouple the view lineage and materialization state. The view lineage can just list out the SQL identifiers+ref... we can still decide whether this is just direct children or fully expanded. The materialization state doesn't have to depend on the view lineage (through ei

Re: [DISCUSS] Materialized Views: Lineage and State information

2024-08-08 Thread Walaa Eldin Moustafa
Thanks Benny! We discussed this option during the meeting but we did not prefer it because we did not want to leak the SQL identifiers to the storage table since SQL identifiers are view concepts and fit better with the view. Thanks, Walaa. On Thu, Aug 8, 2024 at 4:12 PM Benny Chow wrote: > May