Hi Szehon,

Re: listing 'removed' snapshots

If I understand correctly, what you're saying is the following: the Iceberg table format requires users to first delete the metadata about files and only then delete the files themselves, and sometimes users want to order these events differently. We can solve this within a REST catalog, because a REST catalog is not limited by the Iceberg spec. In particular, it can make copies of metadata and apply other workarounds.

However, why wouldn't we choose to solve this within the Iceberg format? A naive person could think that it's conceptually trivial to mark a snapshot as 'expired' to allow data file removal without removing all the snapshot information yet.

Please help me understand the reasoning behind these tradeoffs.

Best
PF

On Thu, 4 Jul 2024 at 02:26, Szehon Ho <szehon.apa...@gmail.com> wrote:

> Yes, I was chatting with Yufei about this, in the first glance I agree
> this would be nice to have. I always thought that metadata tables are
> important enough to spec somewhere, and I think this is a nice place to do
> it. There seems to be some overlap with existing calls (ie, you can get
> snapshots from table. and files from proposed Plan API), but it does seem
> valuable to get it in one place.
>
> If we can solve the 'big metadata' issue for PrePlan/PlanTable API's, it
> sounds like we can re-use the solution for files metadata tables. I'd
> perhaps leave out position_deletes one though, as it's mostly used
> internally and seems a bit too 'big' even for this.
>
> I wonder if we can even add an optional endpoint for listing 'removed'
> snapshots. I know it sounds weird, but when looking at metadata tables,
> the one question that I got a lot but could not answer is how to find when
> a data file is added (or a partition is added). If the snapshot is expired
> then it is no longer possible to trace that history. Users often expire
> snapshots to claw back disk space, but may necessarily want to delete the
> snapshot history.
> But I believe the REST catalog seems to have an
> opportunity in removeSnapshot to preserve the metadata of the old snapshot
> (up to some configured time). So we can query the snapshot metadata even
> after it expires, which I feel will be valuable.
>
> Thanks
> Szehon
>
>
> On Wed, Jul 3, 2024 at 3:04 PM Jack Ye <yezhao...@gmail.com> wrote:
>
>> Hi Yufei,
>>
>> Interesting that we are thinking about similar things. I had this item as
>> a part of the roadmap discussion items in the catalog sync meeting, and
>> then I removed it before the meeting because I felt it's too early to
>> discuss.
>>
>> My main concern for having server-side metadata tables is how we solve
>> the "big metadata" issue. The partitions, manifests, files table can easily
>> itself become a big table, and the REST server becomes inefficient in
>> retrieving results. It's the same old "HMS is too slow in iterating through
>> the partitions" problem. Iceberg kind of solves it by having this
>> information in Avro and in storage that can be scanned distributedly, but
>> with server-side metadata tables, we are technically re-introducing the
>> problem.
>>
>> Maybe one potential approach is to run those potentially large metadata
>> table scans through the PreplanTable and PlanTable APIs. Just a quick
>> thought for now, I need to think a bit more about this.
>>
>> Best,
>> Jack Ye
>>
>>
>> On Wed, Jul 3, 2024 at 1:45 PM Yufei Gu <flyrain...@gmail.com> wrote:
>>
>>> Hi folks,
>>>
>>> I'd like to discuss a new proposal to support server-side metadata
>>> tables.
>>>
>>> One of Iceberg's most advantageous features is the ability to inspect a
>>> table using metadata tables. For instance, we can query snapshots just like
>>> we query data rows using the following command: SELECT * FROM
>>> prod.db.table.snapshots;
>>>
>>> With the REST catalog, we can simplify this process further by providing
>>> metadata directly from REST endpoints.
>>> Here are several benefits of this
>>> approach:
>>>
>>> - Engine Independence: The metadata tables do not rely on a specific
>>>   implementation of an engine. The REST server returns the results
>>>   directly.
>>>   For example, the Rust Iceberg does not need to implement its own logic to
>>>   query the snapshot table if it connects to a server with this capability.
>>>   This reduces the complexity and development effort required for different
>>>   clients and engines.
>>> - Enabled New Use Cases: A catalog UI or Lakehouse UI can present a
>>>   table's metadata (e.g., snapshot/partition list) without relying on an
>>>   engine like Trino. This opens up possibilities for lightweight UIs and
>>>   tools that can directly interact with the REST endpoints to retrieve and
>>>   display metadata.
>>> - Enhanced Performance: With server-side caching, the server-side
>>>   metadata tables will perform better. Caching reduces the need to
>>>   repeatedly
>>>   compute or retrieve metadata, leading to faster response times and
>>>   reduced
>>>   load on the underlying storage systems.
>>>
>>> Here is the proposal in google doc:
>>> https://docs.google.com/document/d/1MVLwyMQtZ-7jewsQ0PuTvtJbpfl4HCoVdbowMqFTmfc/edit?usp=sharing
>>>
>>> Estimated read time: 5 mins
>>>
>>> Would really appreciate any feedback on this topic and proposal!
>>>
>>>
>>> Yufei
>>>
>>