Hi Piotr,

Thanks for the reply. It’s a good point. I was thinking it would be convenient to do this in REST, since it could avoid the hassle of a spec change. But you are right that it probably belongs at a lower level if we support this feature generally (for example, as an additional boolean on the snapshot).

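To make the "additional boolean on snapshot" idea a bit more concrete, here is a rough Python sketch. It is only an illustration under stated assumptions: the `Snapshot` class, the `expired` flag, and the helpers `expire_older_than` and `snapshot_that_added` are all hypothetical names, and none of this is part of the Iceberg spec or any Iceberg library today. The point it tries to show is that expiring could flag a snapshot and release its data files while the snapshot record stays around for history lookups.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Snapshot:
    snapshot_id: int
    timestamp_ms: int
    added_files: frozenset   # files first written by this snapshot
    live_files: frozenset    # files readable as of this snapshot
    expired: bool = False    # hypothetical flag, NOT in the Iceberg spec today


def expire_older_than(snapshots, cutoff_ms):
    """Flag snapshots older than cutoff_ms; return files no live snapshot needs."""
    for snap in snapshots:
        if snap.timestamp_ms < cutoff_ms:
            snap.expired = True
    referenced = set()
    needed = set()
    for snap in snapshots:
        referenced |= snap.live_files
        if not snap.expired:
            needed |= snap.live_files
    # Only files reachable exclusively from expired snapshots can be removed.
    return referenced - needed


def snapshot_that_added(snapshots, path) -> Optional[Snapshot]:
    """Find the snapshot that first added `path`, expired snapshots included."""
    for snap in sorted(snapshots, key=lambda s: s.timestamp_ms):
        if path in snap.added_files:
            return snap
    return None


# Made-up history: a.parquet is dropped from the table by snapshot 3.
history = [
    Snapshot(1, 100, frozenset({"a.parquet"}), frozenset({"a.parquet"})),
    Snapshot(2, 200, frozenset({"b.parquet"}), frozenset({"a.parquet", "b.parquet"})),
    Snapshot(3, 300, frozenset({"c.parquet"}), frozenset({"b.parquet", "c.parquet"})),
]
deletable = expire_older_than(history, cutoff_ms=250)
origin = snapshot_that_added(history, "a.parquet")
```

In this toy history, expiring everything older than timestamp 250 frees only `a.parquet`, yet the lookup for `a.parquet` still resolves to snapshot 1 because the record was flagged rather than removed. How long such flagged records are retained would still need its own setting.
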
Sorry to hijack the thread from the main topic. I will start a proper thread on this when I get a chance.

Thanks,
Szehon

> On Jul 3, 2024, at 11:26 PM, Piotr Findeisen <piotr.findei...@gmail.com> wrote:
>
> Hi Szehon,
>
> re listing 'removed' snapshots
>
> If I understand correctly, what you're saying is the following: the Iceberg
> table format requires users to first delete metadata information about files
> and only then delete the files, and sometimes users want to order these
> events differently.
> We can solve this within a REST catalog, because a REST catalog is not limited
> by the Iceberg spec. In particular, it can do copies of metadata and other
> workarounds.
> However, why wouldn't we choose to solve this within the Iceberg format? A
> naive person could think that it's conceptually trivial to mark a snapshot as
> 'expired' to allow data file removal without removing all the snapshot
> information yet.
> Please help me understand the reasoning behind these tradeoffs.
>
> Best
> PF
>
> On Thu, 4 Jul 2024 at 02:26, Szehon Ho <szehon.apa...@gmail.com> wrote:
>> Yes, I was chatting with Yufei about this, and at first glance I agree this
>> would be nice to have. I always thought that metadata tables are important
>> enough to spec somewhere, and I think this is a nice place to do it. There
>> seems to be some overlap with existing calls (i.e., you can get snapshots
>> from the table, and files from the proposed Plan API), but it does seem
>> valuable to get it in one place.
>>
>> If we can solve the 'big metadata' issue for the PrePlan/PlanTable APIs, it
>> sounds like we can re-use the solution for the files metadata tables. I'd
>> perhaps leave out the position_deletes one though, as it's mostly used
>> internally and seems a bit too 'big' even for this.
>>
>> I wonder if we can even add an optional endpoint for listing 'removed'
>> snapshots.
>> I know it sounds weird, but when looking at metadata tables,
>> the one question that I got a lot but could not answer is how to find when a
>> data file was added (or a partition was added). If the snapshot is expired,
>> then it is no longer possible to trace that history. Users often expire
>> snapshots to claw back disk space, but may not necessarily want to delete
>> the snapshot history. I believe the REST catalog has an opportunity in
>> removeSnapshot to preserve the metadata of the old snapshot (up to some
>> configured time). Then we can query the snapshot metadata even after it
>> expires, which I feel will be valuable.
>>
>> Thanks
>> Szehon
>>
>> On Wed, Jul 3, 2024 at 3:04 PM Jack Ye <yezhao...@gmail.com> wrote:
>>> Hi Yufei,
>>>
>>> Interesting that we are thinking about similar things. I had this item as
>>> part of the roadmap discussion items in the catalog sync meeting, and then
>>> I removed it before the meeting because I felt it's too early to discuss.
>>>
>>> My main concern for having server-side metadata tables is how we solve the
>>> "big metadata" issue. The partitions, manifests, and files tables can each
>>> easily become a big table, and the REST server becomes inefficient in
>>> retrieving results. It's the same old "HMS is too slow in iterating through
>>> the partitions" problem. Iceberg kind of solves it by having this
>>> information in Avro and in storage that can be scanned in a distributed
>>> fashion, but with server-side metadata tables, we are technically
>>> re-introducing the problem.
>>>
>>> Maybe one potential approach is to run those potentially large metadata
>>> table scans through the PreplanTable and PlanTable APIs. Just a quick
>>> thought for now, I need to think a bit more about this.
>>>
>>> Best,
>>> Jack Ye
>>>
>>> On Wed, Jul 3, 2024 at 1:45 PM Yufei Gu <flyrain...@gmail.com> wrote:
>>>> Hi folks,
>>>>
>>>> I'd like to discuss a new proposal to support server-side metadata tables.
>>>>
>>>> One of Iceberg's most advantageous features is the ability to inspect a
>>>> table using metadata tables. For instance, we can query snapshots just
>>>> like we query data rows using the following command:
>>>>
>>>>     SELECT * FROM prod.db.table.snapshots;
>>>>
>>>> With the REST catalog, we can simplify this process further by providing
>>>> metadata directly from REST endpoints. Here are several benefits of this
>>>> approach:
>>>>
>>>> * Engine Independence: The metadata tables do not rely on a specific
>>>>   implementation of an engine. The REST server returns the results
>>>>   directly. For example, the Rust Iceberg does not need to implement its
>>>>   own logic to query the snapshot table if it connects to a server with
>>>>   this capability. This reduces the complexity and development effort
>>>>   required for different clients and engines.
>>>> * Enabled New Use Cases: A catalog UI or Lakehouse UI can present a
>>>>   table's metadata (e.g., snapshot/partition list) without relying on an
>>>>   engine like Trino. This opens up possibilities for lightweight UIs and
>>>>   tools that can directly interact with the REST endpoints to retrieve
>>>>   and display metadata.
>>>> * Enhanced Performance: With server-side caching, the server-side
>>>>   metadata tables will perform better. Caching reduces the need to
>>>>   repeatedly compute or retrieve metadata, leading to faster response
>>>>   times and reduced load on the underlying storage systems.
>>>>
>>>> Here is the proposal in a Google Doc:
>>>> https://docs.google.com/document/d/1MVLwyMQtZ-7jewsQ0PuTvtJbpfl4HCoVdbowMqFTmfc/edit?usp=sharing
>>>>
>>>> Estimated read time: 5 mins
>>>>
>>>> Would really appreciate any feedback on this topic and proposal!
>>>>
>>>> Yufei