Re: New committer: Bryan Keller

2024-03-06 Thread Péter Váry
Congratulations Bryan!

Yufei Gu wrote (on Tue, Mar 5, 2024, 20:19):

> Congrats, Bryan!
>
> Yufei
>
>
> On Tue, Mar 5, 2024 at 10:37 AM Ryan Blue  wrote:
>
>> Congratulations, Bryan!
>>
>> On Tue, Mar 5, 2024 at 8:13 AM Jack Ye  wrote:
>>
>>> Congrats Bryan!
>>>
>>> -Jack
>>>
>>> On Tue, Mar 5, 2024 at 7:33 AM Amogh Jahagirdar 
>>> wrote:
>>>
>>>> Congratulations Bryan! Very well deserved, thank you for all your
>>>> contributions!
>>>>
>>>> On Tue, Mar 5, 2024 at 7:29 AM Steven Wu wrote:
>>>>
>>>>> Bryan, congratulations and thank you for your many contributions.
>>>>>
>>>>> On Tue, Mar 5, 2024 at 5:54 AM Bryan Keller wrote:
>>>>>
>>>>>> Thanks everyone! I really appreciate it, Iceberg has been inspiring
>>>>>> to me, both the project itself and the people involved, so I’m thankful
>>>>>> to have been given the opportunity to contribute!
>>>>>>
>>>>>> On Tue, Mar 5, 2024 at 5:28 AM Mehul Batra wrote:
>>>>>>
>>>>>>> Congratulations Bryan!
>>>>>>>
>>>>>>> On Tue, Mar 5, 2024 at 1:50 PM Fokko Driesprong wrote:
>>>>>>>
>>>>>>>> Hi everyone,
>>>>>>>>
>>>>>>>> The Project Management Committee (PMC) for Apache Iceberg has
>>>>>>>> invited Bryan Keller to become a committer and we are pleased to
>>>>>>>> announce that he has accepted.
>>>>>>>>
>>>>>>>> Bryan was contributing to Iceberg before it was even open-source,
>>>>>>>> did a lot of work on the topic of metadata generation, and is now
>>>>>>>> leading the effort of migrating the Kafka Connect integration into
>>>>>>>> OSS Iceberg.
>>>>>>>>
>>>>>>>> Being a committer enables easier contribution to the project since
>>>>>>>> there is no need to go via the patch submission process. This should
>>>>>>>> enable better productivity. A PMC member helps manage and guide the
>>>>>>>> direction of the project.
>>>>>>>>
>>>>>>>> Please join me in congratulating Bryan.
>>>>>>>>
>>>>>>>> Cheers,
>>>>>>>> Fokko
>>>>>>>>
>>>
>>
>> --
>> Ryan Blue
>> Tabular
>>
>


Re: [VOTE] Release Apache Iceberg 1.5.0 RC6

2024-03-06 Thread Robert Stupp

-0

JSON (De)serialization of commit-metrics is broken. See 
https://github.com/apache/iceberg/issues/9879



On 06.03.24 00:04, Ajantha Bhat wrote:

Hi Everyone,

I propose that we release the following RC as the official Apache 
Iceberg 1.5.0 release.


The commit ID is 2519ab43d654927802cc02e19c917ce90e8e0265
* This corresponds to the tag: apache-iceberg-1.5.0-rc6
* https://github.com/apache/iceberg/commits/apache-iceberg-1.5.0-rc6
* https://github.com/apache/iceberg/tree/2519ab43d654927802cc02e19c917ce90e8e0265


The release tarball, signature, and checksums are here:
* https://dist.apache.org/repos/dist/dev/iceberg/apache-iceberg-1.5.0-rc6

You can find the KEYS file here:
* https://dist.apache.org/repos/dist/dev/iceberg/KEYS

Convenience binary artifacts are staged on Nexus. The Maven repository 
URL is:
* https://repository.apache.org/content/repositories/orgapacheiceberg-1161/


Please download, verify, and test.

Please vote in the next 72 hours.

[ ] +1 Release this as Apache Iceberg 1.5.0
[ ] +0
[ ] -1 Do not release this because...

Only PMC members have binding votes, but other community members are
encouraged to cast non-binding votes. This vote will pass if there are
3 binding +1 votes and more binding +1 votes than -1 votes.

- Ajantha


--
Robert Stupp
@snazy
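
The passing rule quoted in the vote email above (at least 3 binding +1 votes, and more binding +1 votes than binding -1 votes) can be expressed as a small predicate. This is only an illustrative sketch: `ReleaseVote` and `passes` are hypothetical names, not part of any Apache tooling, and non-binding votes as well as +0/-0 votes are simply not counted.

```java
/** Illustrative sketch of the vote rule quoted above; all names are hypothetical. */
final class ReleaseVote {
  private ReleaseVote() {}

  /**
   * Returns true when there are at least 3 binding +1 votes and
   * strictly more binding +1 votes than binding -1 votes.
   */
  static boolean passes(int bindingPlusOnes, int bindingMinusOnes) {
    if (bindingPlusOnes < 0 || bindingMinusOnes < 0) {
      throw new IllegalArgumentException("vote counts must be non-negative");
    }
    return bindingPlusOnes >= 3 && bindingPlusOnes > bindingMinusOnes;
  }

  public static void main(String[] args) {
    System.out.println(passes(3, 0)); // three binding +1 votes, no -1 votes
    System.out.println(passes(2, 0)); // too few binding +1 votes
    System.out.println(passes(3, 3)); // +1 votes do not outnumber -1 votes
  }
}
```

Under this reading, a -0 such as the one above does not affect the outcome either way.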



Re: [VOTE] Release Apache Iceberg 1.5.0 RC6

2024-03-06 Thread Robert Stupp

Strike that - all good.

On 06.03.24 12:48, Robert Stupp wrote:

-0

JSON (De)serialization of commit-metrics is broken. See 
https://github.com/apache/iceberg/issues/9879



On 06.03.24 00:04, Ajantha Bhat wrote:

Hi Everyone,

I propose that we release the following RC as the official Apache 
Iceberg 1.5.0 release.


The commit ID is 2519ab43d654927802cc02e19c917ce90e8e0265
* This corresponds to the tag: apache-iceberg-1.5.0-rc6
* https://github.com/apache/iceberg/commits/apache-iceberg-1.5.0-rc6
* https://github.com/apache/iceberg/tree/2519ab43d654927802cc02e19c917ce90e8e0265


The release tarball, signature, and checksums are here:
* https://dist.apache.org/repos/dist/dev/iceberg/apache-iceberg-1.5.0-rc6


You can find the KEYS file here:
* https://dist.apache.org/repos/dist/dev/iceberg/KEYS

Convenience binary artifacts are staged on Nexus. The Maven 
repository URL is:
* https://repository.apache.org/content/repositories/orgapacheiceberg-1161/


Please download, verify, and test.

Please vote in the next 72 hours.

[ ] +1 Release this as Apache Iceberg 1.5.0
[ ] +0
[ ] -1 Do not release this because...

Only PMC members have binding votes, but other community members are
encouraged to cast non-binding votes. This vote will pass if there are
3 binding +1 votes and more binding +1 votes than -1 votes.

- Ajantha



--
Robert Stupp
@snazy



Re: [VOTE] Release Apache Iceberg 1.5.0 RC6

2024-03-06 Thread Eduard Tudenhoefner
+1 (non-binding)

* validated checksum and signature
* checked license docs & ran RAT checks
* ran build and tests with JDK11
* built new docker images and ran through
https://iceberg.apache.org/spark-quickstart/. I've also created a new
notebook to show Spark + View support
* tested view support with Spark 3.5 + JDBC/REST catalog
* verified that https://github.com/apache/iceberg/pull/9853 fixes the issue
that Ryan found in RC4
* verified that there's no artifact being published for the
*iceberg-open-api* module

Eduard
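
The "validated checksum" step in the list above can be sketched in a few lines. This is a minimal illustration, not the project's verification script: it assumes the release ships a SHA-512 checksum file whose first whitespace-separated token is the hex digest, and `ChecksumSketch` and its methods are hypothetical names. Signature validation against the KEYS file is a separate step and is not covered here.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical helper for comparing a file's SHA-512 digest with a checksum file. */
final class ChecksumSketch {
  private ChecksumSketch() {}

  /** Computes the lowercase hex SHA-512 digest of the given bytes. */
  static String sha512Hex(byte[] data) {
    try {
      byte[] digest = MessageDigest.getInstance("SHA-512").digest(data);
      StringBuilder hex = new StringBuilder(digest.length * 2);
      for (byte b : digest) {
        hex.append(String.format("%02x", b & 0xff));
      }
      return hex.toString();
    } catch (NoSuchAlgorithmException e) {
      // SHA-512 is a standard JDK algorithm, so this should not happen.
      throw new IllegalStateException(e);
    }
  }

  /**
   * Assumption: the checksum file's first whitespace-separated token is the
   * hex digest; anything after it (such as a file name) is ignored.
   */
  static boolean matches(byte[] data, String checksumFileContent) {
    String expected = checksumFileContent.trim().split("\\s+")[0];
    return sha512Hex(data).equalsIgnoreCase(expected);
  }

  /** Usage: java ChecksumSketch <tarball> <checksum-file> */
  public static void main(String[] args) throws IOException {
    if (args.length != 2) {
      System.err.println("usage: ChecksumSketch <tarball> <checksum-file>");
      return;
    }
    byte[] tarball = Files.readAllBytes(Paths.get(args[0]));
    String checksum =
        new String(Files.readAllBytes(Paths.get(args[1])), StandardCharsets.UTF_8);
    System.out.println(matches(tarball, checksum) ? "checksum OK" : "checksum MISMATCH");
  }
}
```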

On Wed, Mar 6, 2024 at 12:04 AM Ajantha Bhat  wrote:

> Hi Everyone,
>
> I propose that we release the following RC as the official Apache Iceberg
> 1.5.0 release.
>
> The commit ID is 2519ab43d654927802cc02e19c917ce90e8e0265
> * This corresponds to the tag: apache-iceberg-1.5.0-rc6
> * https://github.com/apache/iceberg/commits/apache-iceberg-1.5.0-rc6
> * https://github.com/apache/iceberg/tree/2519ab43d654927802cc02e19c917ce90e8e0265
>
> The release tarball, signature, and checksums are here:
> * https://dist.apache.org/repos/dist/dev/iceberg/apache-iceberg-1.5.0-rc6
>
> You can find the KEYS file here:
> * https://dist.apache.org/repos/dist/dev/iceberg/KEYS
>
> Convenience binary artifacts are staged on Nexus. The Maven repository URL
> is:
> * https://repository.apache.org/content/repositories/orgapacheiceberg-1161/
>
> Please download, verify, and test.
>
> Please vote in the next 72 hours.
>
> [ ] +1 Release this as Apache Iceberg 1.5.0
> [ ] +0
> [ ] -1 Do not release this because...
>
> Only PMC members have binding votes, but other community members are
> encouraged to cast
> non-binding votes. This vote will pass if there are 3 binding +1 votes and
> more binding
> +1 votes than -1 votes.
>
> - Ajantha
>


Re: [VOTE] Release Apache Iceberg 1.5.0 RC6

2024-03-06 Thread Jean-Baptiste Onofré
+1 (non-binding)

- checksums and signatures are OK
- ASF headers are present
- No unexpected binary files in the source distribution
- Build OK with JDK11
- JdbcCatalog tested on Trino and Iceland
- No unexpected artifact distributed

Thanks !

Regards
JB

On Wed, Mar 6, 2024 at 12:04 AM Ajantha Bhat  wrote:
>
> Hi Everyone,
>
> I propose that we release the following RC as the official Apache Iceberg 
> 1.5.0 release.
>
> The commit ID is 2519ab43d654927802cc02e19c917ce90e8e0265
> * This corresponds to the tag: apache-iceberg-1.5.0-rc6
> * https://github.com/apache/iceberg/commits/apache-iceberg-1.5.0-rc6
> * https://github.com/apache/iceberg/tree/2519ab43d654927802cc02e19c917ce90e8e0265
>
> The release tarball, signature, and checksums are here:
> * https://dist.apache.org/repos/dist/dev/iceberg/apache-iceberg-1.5.0-rc6
>
> You can find the KEYS file here:
> * https://dist.apache.org/repos/dist/dev/iceberg/KEYS
>
> Convenience binary artifacts are staged on Nexus. The Maven repository URL is:
> * https://repository.apache.org/content/repositories/orgapacheiceberg-1161/
>
> Please download, verify, and test.
>
> Please vote in the next 72 hours.
>
> [ ] +1 Release this as Apache Iceberg 1.5.0
> [ ] +0
> [ ] -1 Do not release this because...
>
> Only PMC members have binding votes, but other community members are 
> encouraged to cast
> non-binding votes. This vote will pass if there are 3 binding +1 votes and 
> more binding
> +1 votes than -1 votes.
>
> - Ajantha


Re: New committer: Bryan Keller

2024-03-06 Thread Daniel Weeks
Congratulations Bryan!  Very well deserved.

-Dan

On Wed, Mar 6, 2024, 3:14 AM Péter Váry  wrote:

> Congratulations Bryan!
>
> Yufei Gu wrote (on Tue, Mar 5, 2024, 20:19):
>
>> Congrats, Bryan!
>>
>> Yufei
>>
>>
>> On Tue, Mar 5, 2024 at 10:37 AM Ryan Blue  wrote:
>>
>>> Congratulations, Bryan!
>>>
>>> On Tue, Mar 5, 2024 at 8:13 AM Jack Ye  wrote:
>>>
>>>> Congrats Bryan!
>>>>
>>>> -Jack
>>>>
>>>> On Tue, Mar 5, 2024 at 7:33 AM Amogh Jahagirdar wrote:
>>>>
>>>>> Congratulations Bryan! Very well deserved, thank you for all your
>>>>> contributions!
>>>>>
>>>>> On Tue, Mar 5, 2024 at 7:29 AM Steven Wu wrote:
>>>>>
>>>>>> Bryan, congratulations and thank you for your many contributions.
>>>>>>
>>>>>> On Tue, Mar 5, 2024 at 5:54 AM Bryan Keller wrote:
>>>>>>
>>>>>>> Thanks everyone! I really appreciate it, Iceberg has been inspiring
>>>>>>> to me, both the project itself and the people involved, so I’m thankful
>>>>>>> to have been given the opportunity to contribute!
>>>>>>>
>>>>>>> On Tue, Mar 5, 2024 at 5:28 AM Mehul Batra wrote:
>>>>>>>
>>>>>>>> Congratulations Bryan!
>>>>>>>>
>>>>>>>> On Tue, Mar 5, 2024 at 1:50 PM Fokko Driesprong wrote:
>>>>>>>>
>>>>>>>>> Hi everyone,
>>>>>>>>>
>>>>>>>>> The Project Management Committee (PMC) for Apache Iceberg has
>>>>>>>>> invited Bryan Keller to become a committer and we are pleased to
>>>>>>>>> announce that he has accepted.
>>>>>>>>>
>>>>>>>>> Bryan was contributing to Iceberg before it was even open-source,
>>>>>>>>> did a lot of work on the topic of metadata generation, and is now
>>>>>>>>> leading the effort of migrating the Kafka Connect integration into
>>>>>>>>> OSS Iceberg.
>>>>>>>>>
>>>>>>>>> Being a committer enables easier contribution to the project since
>>>>>>>>> there is no need to go via the patch submission process. This should
>>>>>>>>> enable better productivity. A PMC member helps manage and guide the
>>>>>>>>> direction of the project.
>>>>>>>>>
>>>>>>>>> Please join me in congratulating Bryan.
>>>>>>>>>
>>>>>>>>> Cheers,
>>>>>>>>> Fokko
>>>>>>>>>

>>>
>>> --
>>> Ryan Blue
>>> Tabular
>>>
>>


Optimizations to GlueCatalog

2024-03-06 Thread Chetas Joshi
Hi Community,

I am working on loading Iceberg data from S3 using Flink. I am using
GlueCatalog for storing the Iceberg table metadata. I found that the
GlueCatalog’s loadTable call (implemented in the abstract class
BaseMetastoreCatalog) creates a new instance of GlueTableOperations every
time for a Glue table identifier. This instance is initialized with
shouldRefresh = true and hence it refreshes the tableMetadata for a given
table identifier every time loadTable is called for that tableIdentifier,
even if it was called in the recent past. I am wondering why these
tableOperation instances are not cached in the catalog. I suggest the
following changes in the newTableOps method in the GlueCatalog (and other
catalog impls) and would really appreciate the community's feedback on this.

protected TableOperations newTableOps(TableIdentifier tableIdentifier) {
  // tableCache is a Cache with key=tableIdentifier and value=GlueTableOperations object
  if (tableCache.containsKey(tableIdentifier)) {
    return tableCache.get(tableIdentifier);
  } else {
    return new GlueTableOperations();
  }
}

If you like the approach, I am happy to contribute to open source. Let me
know.

Thank you
Chetas
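
The proposal above amounts to keeping one operations object per table identifier and reusing it on later lookups. A minimal, self-contained sketch of that idea follows. `OpsCacheSketch`, `TableOps`, and `OpsCache` are hypothetical stand-ins, not Iceberg classes, and plain strings stand in for table identifiers. Unlike the pseudocode above, the sketch also stores each newly created instance so that later calls can actually hit the cache. Expiry, invalidation, and metadata refresh are deliberately left out.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Hypothetical illustration of per-identifier caching; not Iceberg code. */
final class OpsCacheSketch {
  private OpsCacheSketch() {}

  /** Stand-in for a table operations object bound to one table identifier. */
  static final class TableOps {
    final String tableId;

    TableOps(String tableId) {
      this.tableId = tableId;
    }
  }

  /** Creates a TableOps on first request and reuses it afterwards. */
  static final class OpsCache {
    private final Map<String, TableOps> cache = new ConcurrentHashMap<>();
    private final Function<String, TableOps> factory;

    OpsCache(Function<String, TableOps> factory) {
      this.factory = factory;
    }

    TableOps opsFor(String tableId) {
      // computeIfAbsent creates and stores the value atomically on a miss.
      return cache.computeIfAbsent(tableId, factory);
    }

    int size() {
      return cache.size();
    }
  }

  public static void main(String[] args) {
    OpsCache cache = new OpsCache(TableOps::new);
    TableOps first = cache.opsFor("db.events");
    TableOps second = cache.opsFor("db.events");
    System.out.println("same instance reused: " + (first == second));
  }
}
```

A cache like this never expires entries, so a long-running job would keep serving whatever each cached object last loaded; any real change would need an expiry or refresh policy on top.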


Re: Optimizations to GlueCatalog

2024-03-06 Thread Ryan Blue
Chetas, Iceberg has an implementation of what you're talking about. There's
a caching layer implemented as a catalog, `CachingCatalog`. That's turned
on by default in the Flink catalog but the default interval is 30s. Maybe
you need to extend that interval by setting `cache.expiration-interval-ms`
in your catalog config?
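
As a rough illustration of the suggestion above, the catalog properties could be assembled like this before being handed to the catalog. Only `cache.expiration-interval-ms` comes from this message; the `cache-enabled` key, the 5-minute value, and the `CatalogCacheConfig` helper are assumptions made for the example, and how the map reaches the Flink catalog depends on your setup.

```java
import java.time.Duration;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper that adds caching-related entries to catalog properties. */
final class CatalogCacheConfig {
  private CatalogCacheConfig() {}

  /** Returns a copy of the base properties with the cache expiration interval set. */
  static Map<String, String> withCacheExpiration(Map<String, String> base, Duration interval) {
    Map<String, String> props = new HashMap<>(base);
    // Assumption: "cache-enabled" is the property that toggles the caching layer.
    props.put("cache-enabled", "true");
    // Property name taken from the message above; the value is in milliseconds.
    props.put("cache.expiration-interval-ms", Long.toString(interval.toMillis()));
    return props;
  }

  public static void main(String[] args) {
    // Example value only: extend the interval from the 30s default to 5 minutes.
    Map<String, String> props =
        withCacheExpiration(new HashMap<>(), Duration.ofMinutes(5));
    props.forEach((key, value) -> System.out.println(key + "=" + value));
  }
}
```

A longer interval reduces metadata reloads at the cost of seeing newly committed snapshots later.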

On Wed, Mar 6, 2024 at 11:52 AM Chetas Joshi  wrote:

> Hi Community,
>
> I am working on loading Iceberg data from S3 using Flink. I am using
> GlueCatalog for storing the Iceberg table metadata. I found that the
> GlueCatalog’s loadTable call (implemented in the abstract class
> BaseMetastoreCatalog) creates a new instance of GlueTableOperations every
> time for a Glue table identifier. This instance is initialized with
> shouldRefresh = true and hence it refreshes the tableMetadata for a given
> table identifier every time loadTable is called for that tableIdentifier,
> even if it was called in the recent past. I am wondering why these
> tableOperation instances are not cached in the catalog. I suggest the
> following changes in the newTableOps method in the GlueCatalog (and other
> catalog impls) and would really appreciate the community's feedback on this.
>
> protected TableOperations newTableOps(TableIdentifier tableIdentifier) {
>   // tableCache is a Cache with key=tableIdentifier and value=GlueTableOperations object
>   if (tableCache.containsKey(tableIdentifier)) {
>     return tableCache.get(tableIdentifier);
>   } else {
>     return new GlueTableOperations();
>   }
> }
>
> If you like the approach, I am happy to contribute to open source. Let me
> know.
>
> Thank you
> Chetas
>
>


-- 
Ryan Blue
Tabular


Meeting Minutes 2024-03-06

2024-03-06 Thread Brian Olsen
Hey Iceberg Nation,

Here are the meeting minutes from today's meeting.

In today's sync, we welcome two committers, Bryan Keller and Renjie Liu!
Bryan has contributed a great deal of work to the project
(https://github.com/apache/iceberg/commits/main/?author=bryanck) over the
years, including his most recent contribution, the official Kafka Connect
integration (https://github.com/apache/iceberg/commits/main/kafka-connect).
Renjie has been the leading contributor to the iceberg-rust implementation
(https://github.com/apache/iceberg-rust/commits/main/?author=liurenjie1024).
Congratulations to both of you!

The latest 1.5.0 release candidate is being voted on right now, and we dive
into some of the changes to expect and call out the contributions from
various individuals.

Iceberg Summit is on the horizon and the CFP is now open! Please consider
submitting a talk (https://sessionize.com/iceberg-summit-2024/).

The second half of the sync covers a thorough discussion of heavily debated
implementation details of materialized views. It aligned many of the
discussion points raised on the mailing list but did not reach a final
consensus. Jack Ye will provide a summary and facilitate further discussion
to ensure we're talking about the same concepts and converging the
different viewpoints.

Transcription/Recording: https://youtu.be/d4dEgAa1vKk

* Highlights
  * New committer, Bryan Keller! ([Congrats!](https://lists.apache.org/thread/361wozk0rpos8tmgfp2t17ygskm83m87))
  * New committer, Renjie Liu!
  * Virtual Iceberg Summit 2024. ([Announcement](https://lists.apache.org/thread/9w47vqzfz6byzjpx90nhvrg366c58y1m), [CFP](https://sessionize.com/iceberg-summit-2024/)) (Thanks, JB!)
  * Added DataFile/DeleteFiles to REST spec (Thanks, Drew!)
  * Added pagination to the REST spec (Thanks, Rahil!)
  * Added view support to JDBC catalog (Thanks, JB!)
  * Added EncryptingFileIO (Thanks, Gidon!)
  * Fixed snapshot log with REPLACE TABLE (Thanks, Eduard!)
* Releases
  * Iceberg 1.5.0
    * Voting on the next release candidate for 1.5.0 ([Vote thread](https://lists.apache.org/thread/syp2hwp53rhromt4711w709dfq4cmvcb))
    * SHOW TABLES behavior
    * InMemoryCatalog list behavior
    * Capabilities in REST spec
* Discussion
  * Materialized views discussion


Re: [ANNOUNCE] Iceberg Summit Call for Proposals

2024-03-06 Thread Ajantha Bhat
From the website,

> Talks will be 30 minutes in length, inclusive of Q&A time.


Are there any plans for lightning talks, brief presentations lasting 5
minutes, during the event?
If so, do these talks need to be submitted in advance during the CFP (Call
for Proposals) process?

- Ajantha

On Tue, Mar 5, 2024 at 10:38 PM Ryan Blue  wrote:

> Hi everyone,
>
> I am excited to announce that we have approval to hold the Iceberg Summit!
> Iceberg Summit 2024 will be a 2 day virtual event and will be May 14 - 15.
>
> The call for presentations is open here –
> https://sessionize.com/iceberg-summit-2024/ – until *April 12*. It would
> be great to see a bunch of people from the community giving talks about
> their use cases and experiences using Iceberg, as well as helpful
> educational talks.
>
> Here are some ideas for the type of talks we’re looking for:
>
>    - Iceberg in production (case studies)
>    - Best practices
>    - Data architecture
>    - Iceberg education: technology and features
>
> Many thanks to JB for his help organizing the summit!
>
> Looking forward to seeing your proposals.
>
> You can also register to attend here: https://iceberg-summit.org/
>
> Ryan
> --
> Ryan Blue
> Tabular
>