Re: [DISCUSS] Iceberg Summit 2025 ?

2024-09-27 Thread Russell Spitzer
I am really excited about the prospect of another Summit and also had a great time last year. I think we had a great selection of talks and I'm hoping we can do so again. I'm very much in support of having an in-person element; I would love to have a chance to talk face to face with other members

Re: [EXTERNAL] Re: [DISCUSS] Column to Column filtering

2024-09-27 Thread Baldwin, Jennifer
Please see attached; I hope this provides you with more clarity on the use case we hope to support. Let me know if you have any further questions. From: Russell Spitzer Date: Wednesday, September 18, 2024 at 6:15 PM To: dev@iceberg.apache.org Cc: jennifer.bald...@teradata.com.invalid Subject:

[VOTE] Table v3 spec: Add unknown and new type promotion

2024-09-27 Thread rdb...@gmail.com
Hi everyone, I'd like to vote on PR #10955 that has been open for a while with the changes to add new type promotion cases. After discussion, the PR has been scoped down to keep complexity low. It now adds: * An `unknown` type for cases when only `nu

Re: [VOTE] Table v3 spec: Add unknown and new type promotion

2024-09-27 Thread Russell Spitzer
+1 (binding) On Fri, Sep 27, 2024 at 4:37 PM rdb...@gmail.com wrote: > Hi everyone, > > I'd like to vote on PR #10955 > that has been open for a > while with the changes to add new type promotion cases. After discussion, > the PR has been scoped dow

Re: [DISCUSS] Modify ThreadPools.newWorkerPool to avoid unnecessary Shutdown Hook registration

2024-09-27 Thread rdb...@gmail.com
I'm okay with adding newFixedThreadPool as Steven suggests, but I don't think that solves the problem that these are used more widely than intended and without people knowing the behavior. Even though "non-exiting" is awkward, it is maybe a good option to call out behavior. +1 for Javadoc, and +1 f

[DISCUSS] Iceberg Summit 2025 ?

2024-09-27 Thread Jean-Baptiste Onofré
Hi folks, Last year in June we started to discuss the first edition of the Iceberg Summit (https://lists.apache.org/thread/cbgx1jlc9ywn618yod2487g498lgrkt3). The Iceberg Summit was in May 2024, and it was clearly a great community event, with a lot of nice talks. This first edition was fully virt

Re: Clarification on DayTransform Result Type

2024-09-27 Thread rdb...@gmail.com
The background is that the result of the day function and dates are basically the same: the number of days from the Unix epoch. When we started using metadata tables, we realized that a lot of people use the day function but then get a weird ordinal value out, but if we just change the type to `dat
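The equivalence described in this message can be sketched as follows. This is only an illustration of "days from the Unix epoch", not Iceberg or PyIceberg code; the names `EPOCH`, `day_ordinal`, and `ordinal_to_date` are hypothetical and were chosen for this example.

```python
from datetime import date, timedelta

# Assumption for this sketch: the day ordinal counts whole days from 1970-01-01.
EPOCH = date(1970, 1, 1)


def day_ordinal(d: date) -> int:
    """Return the number of days from the Unix epoch to d (negative before 1970)."""
    return (d - EPOCH).days


def ordinal_to_date(days: int) -> date:
    """Interpret a day ordinal as a calendar date again."""
    return EPOCH + timedelta(days=days)
```

Because the ordinal and the date carry the same information, converting in either direction is lossless, which is why exposing the value as a date rather than a bare integer changes only how it is presented.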

Re: Clarification on DayTransform Result Type

2024-09-27 Thread Russell Spitzer
Good thing DateType is an Integer :) https://github.com/apache/iceberg/blob/113c6e7d62e53d3e3cb15b1712f3a1db473ca940/api/src/main/java/org/apache/iceberg/types/Type.java#L37 On Thu, Sep 26, 2024 at 8:38 PM Kevin Liu wrote: > Hey folks, > > While reviewing a PR to fix DayTransform in PyIceberg (#

Re: V3 Spec Changes

2024-09-27 Thread Micah Kornfield
For variant, the current plan on moving to Parquet is to mark the variant type as experimental. Would Iceberg depend on the experimental type or is V3 going to wait for a variant to be deemed non-experimental by the Parquet community? Thanks, Micah On Tue, Sep 24, 2024 at 9:52 AM Russell Spitzer

Re: [DISCUSS] Modify ThreadPools.newWorkerPool to avoid unnecessary Shutdown Hook registration

2024-09-27 Thread Steven Wu
> I don't think that solves the problem that these are used more widely than intended and without people knowing the behavior. Ryan, to solve this problem, I suggest we deprecate the current `newWorkerPool` in favor of `newExitingWorkerPool`. This way, when people call `newExitingWorkerPool`, the inten

Re: [DISCUSS] Iceberg Materialized Views

2024-09-27 Thread Benny Chow
>> storing the lineage is an optimization that can avoid recomputation/re-parsing. I don't think having the lineage is optimizing much over re-parsing the SQL. The most expensive part of SQL parsing is catalog access which has to happen with lineage anyway. Once the planner has the query tree, i