Re: [DISCUSS] September board report

2024-09-10 Thread Steven Wu
> Flink Range distribution for Sinks
It is already included in Ryan's draft.
> Flink Source V2 improvements and V1 deprecation to prepare for Flink 2.0
This is still ongoing. There is a blocking issue with FileIOParser on HadoopFileIO: https://github.com/apache/iceberg/pull/10926
On Tue, Sep 10

Re: [DISCUSS] September board report

2024-09-10 Thread Péter Váry
Maybe mention some ongoing Flink tasks and improvements:
- Flink Range distribution for Sinks
- Flink Source V2 improvements and V1 deprecation to prepare for Flink 2.0
- Flink Sink V2 implementation to prepare for Flink 2.0
- Flink Table Maintenance (ongoing)
Thanks for preparing this Ryan!
Peter

Re: [DISCUSS] Iceberg Materialized Views

2024-09-10 Thread Benny Chow
Hi Walaa, I don't think the current view spec implicitly assumes a common catalog name between engines. I tested this by not specifying the default-catalog and both engines could look up the correct table under the shared default-namespace even when each engine uses a different catalog name. Hi J

Re: [DISCUSS] September board report

2024-09-10 Thread Matt Topol
There's one additional point to add for the Go implementation: we implemented file scan planning. It returns the list of file scan tasks needed for a given table, partitions, and filter expression.
--Matt
On Tue, Sep 10, 2024, 5:43 PM rdb...@gmail.com wrote:
> Hi everyone,
>
> It’s time for anot

[DISCUSS] September board report

2024-09-10 Thread rdb...@gmail.com
Hi everyone,
It’s time for another ASF board report! Here’s my current draft. Please reply if you think there is something that I should add or change.
Thanks!
Ryan
Description: Apache Iceberg is a table format for huge analytic datasets that is designed for high performance and ease of use. Pro

Re: Greater Seattle Iceberg Meetup

2024-09-10 Thread Jonathan Leang
The Apache Iceberg community is meeting up in Seattle next Tuesday! We have two great speakers lined up on exciting developments and insights for the project:
Vince Kulandaisamy (@ Cloudera): Reducing TCO with Modern Data Architecture Powered by Apache Iceberg and REST Catalog
Benny Chow (@ Dremio

Re: Time-based partitioning on long column type

2024-09-10 Thread rdb...@gmail.com
Maybe we could update the time-based partition functions to be applied to a long column directly. It would treat that column like a timestamp in milliseconds. Would that work? I need to think more about the implications of doing that, but I don't think that we currently have an issue with extending
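To make the suggestion above concrete, here is a minimal sketch of what applying the time-based partition functions to a long column could look like, assuming the long holds milliseconds since the Unix epoch in UTC. The function names are hypothetical and are not part of any Iceberg API; the ordinals follow the spec's convention of counting years, months, days, and hours from 1970-01-01.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the long column stores milliseconds since 1970-01-01T00:00:00Z.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
MS_PER_HOUR = 3_600_000
MS_PER_DAY = 86_400_000


def hour_from_millis(ms: int) -> int:
    """Hours since the epoch; floor division keeps pre-1970 values consistent."""
    return ms // MS_PER_HOUR


def day_from_millis(ms: int) -> int:
    """Days since the epoch (floor, so -1 ms falls on day -1)."""
    return ms // MS_PER_DAY


def month_from_millis(ms: int) -> int:
    """Months since 1970-01, derived from the UTC calendar date."""
    dt = EPOCH + timedelta(milliseconds=ms)
    return (dt.year - 1970) * 12 + (dt.month - 1)


def year_from_millis(ms: int) -> int:
    """Years since 1970, derived from the UTC calendar date."""
    dt = EPOCH + timedelta(milliseconds=ms)
    return dt.year - 1970


if __name__ == "__main__":
    # Hypothetical long value: midnight UTC on the date of this thread.
    ms = int(datetime(2024, 9, 10, tzinfo=timezone.utc).timestamp() * 1000)
    print(year_from_millis(ms), month_from_millis(ms),
          day_from_millis(ms), hour_from_millis(ms))
```

One open question this sketch glosses over is the unit: a long column carries no unit metadata, so treating it as milliseconds (rather than seconds or microseconds) would have to be a stated convention.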

Re: [VOTE] Merge REST Spec Change To Add New Scan Planning APIs

2024-09-10 Thread Amogh Jahagirdar
Thanks Rahil for driving this and everyone for reviewing! Merged the PR.
Thanks,
Amogh Jahagirdar
On Tue, Sep 10, 2024 at 8:24 AM Chertara, Rahil wrote:
> Thanks all for reviewing the PR. With 4 binding and 1 non-binding votes,
> the vote has passed. Can a maintainer please merge the PR when t

Re: [DISCUSS] Iceberg Materialized Views

2024-09-10 Thread Jan Kaul
Thanks Walaa and Benny for clarifying the problem. I think I have a better understanding now. Sorry for being a bit stubborn before. Wouldn't it make sense then to store the lineage as part of the representation:
{
    "type": "sql",
    "sql": "SELECT\n COUNT(1), CAST(event_ts AS DATE)\nFR