Hey everyone,

Thanks to folks who attended. I added my notes from the last sync. Please feel free to add/correct if I missed anything.

Main points

Highlights
- StreamingOffset for Structured Streaming in Spark
- New Actions API
- Spark procedure for partial import of existing tables
- Subsurface talks are online
- Call for papers is open at ApacheCon and Subsurface

Releases
- 0.11.1
  - Waiting for the fix on handling situations when the metastore fails during commit (#2317).
- 0.12.0
  - Should include Spark 3.1 support
  - V2 format items should be included whenever possible but should not block the release
  - No new blockers
  - Ideally, end of March

Table corruption issue (#2317 <https://github.com/apache/iceberg/issues/2317>)
- We may corrupt tables if the metastore fails during commit and the commit state is unknown. Iceberg may delete files that were actually committed.
- A lot of folks have seen this issue.
- Parth has shared some thoughts from a discussion they had internally here <https://docs.google.com/document/d/1dN7gZwXmlI6Nl4RToAWgsMIsiJUCRSpfFfIL9Kr8s0k>.
- We can handle this issue in two phases:
  1. Don’t corrupt the table (Russell has a PR)
  2. Avoid duplicated results if operations are blindly retried (can be done in a follow-up PR)
- Seems worth including the first part in 0.11.1

V2 format
- Open points:
  - Primary key or row id for upserts
  - Propagating the sort order id for files on write
- Need more reviewers

Encryption
- Multiple people expressed interest in data encryption.
- Existing work by John here <https://github.com/apache/iceberg/pull/1918>.
- Ideally, should leverage as much as possible of modular encryption in Parquet 1.12 discussed here <https://github.com/apache/iceberg/issues/1413>.
- Agreed to start a thread on the dev list.

CachingCatalog issues (#2319 <https://github.com/apache/iceberg/issues/2319>)
- The current behavior leads to stale data if multiple sessions are used.
- No ideal solution due to Spark limitations.
- Agreed to discuss in the issue.

Multi-table transactions
- Jacques has proposed an API here <https://github.com/apache/iceberg/pull/1849> and is about to start working on an implementation.
- Agreed to collaborate on the dev list. More eyes would be great.

The link to the doc: https://docs.google.com/document/d/1YuGhUdukLP5gGiqCbk0A5_Wifqe2CZWgOd3TbhY3UQg

Thanks,
Anton