Hi Piotr

Thanks for the reply.  It’s a good point.  I was thinking it would be convenient 
in REST and could avoid the hassle of a spec change.  But you are right that it 
probably belongs at a lower level if we support this feature generally (like an 
additional boolean on the snapshot).

Sorry to hijack the thread from the main topic; I will start a proper thread on 
this when I get a chance.

Thanks
Szehon

> On Jul 3, 2024, at 11:26 PM, Piotr Findeisen <piotr.findei...@gmail.com> 
> wrote:
> 
> Hi Szehon,
> 
> re listing 'removed' snapshots
> 
> If I understand correctly, what you're saying is the following: the Iceberg 
> table format requires users to first delete metadata information about files 
> and only then delete the files, and sometimes users want to order these events 
> differently.
> We can solve this within a REST catalog, because REST catalog is not limited 
> by the Iceberg spec. In particular, it can do copies of metadata and other 
> workarounds.
> However, why wouldn't we choose to solve this within Iceberg format? A naive 
> person could think that it's conceptually trivial to mark a snapshot as 
> 'expired' to allow data file removal without removing all the snapshot 
> information yet.
> Please help me understand the reasoning behind these tradeoffs.
> 
> Best
> PF
> 
> 
> 
> 
> On Thu, 4 Jul 2024 at 02:26, Szehon Ho <szehon.apa...@gmail.com> wrote:
>> Yes, I was chatting with Yufei about this, and at first glance I agree this 
>> would be nice to have.  I always thought that metadata tables are important 
>> enough to spec somewhere, and I think this is a nice place to do it.  There 
>> seems to be some overlap with existing calls (i.e., you can get snapshots from 
>> the table, and files from the proposed Plan API), but it does seem valuable to 
>> get it all in one place.
>> 
>> If we can solve the 'big metadata' issue for the PrePlan/PlanTable APIs, it 
>> sounds like we can re-use the solution for the files metadata tables.  I'd 
>> perhaps leave out the position_deletes one though, as it's mostly used 
>> internally and seems a bit too 'big' even for this.
>> 
>> I wonder if we can even add an optional endpoint for listing 'removed' 
>> snapshots.  I know it sounds weird, but when looking at metadata tables, 
>> the one question that I got a lot but could not answer is how to find when a 
>> data file was added (or a partition was added).  If the snapshot is expired, 
>> then it is no longer possible to trace that history.  Users often expire 
>> snapshots to claw back disk space, but may not necessarily want to delete the 
>> snapshot history.  I believe the REST catalog has an opportunity in 
>> removeSnapshot to preserve the metadata of the old snapshot (up to some 
>> configured time), so we could query the snapshot metadata even after it 
>> expires, which I feel would be valuable.
>> 
>> Thanks
>> Szehon
>> 
>> 
>> On Wed, Jul 3, 2024 at 3:04 PM Jack Ye <yezhao...@gmail.com> wrote:
>>> Hi Yufei,
>>> 
>>> Interesting that we are thinking about similar things. I had this item as a 
>>> part of the roadmap discussion items in the catalog sync meeting, and then 
>>> I removed it before the meeting because I felt it's too early to discuss.
>>> 
>>> My main concern for having server-side metadata tables is how we solve the 
>>> "big metadata" issue. The partitions, manifests, and files tables can each 
>>> easily become a big table, and the REST server becomes inefficient in 
>>> retrieving results. It's the same old "HMS is too slow in iterating through 
>>> the partitions" problem. Iceberg kind of solves it by having this 
>>> information in Avro and in storage that can be scanned in a distributed 
>>> fashion, but with server-side metadata tables, we are technically 
>>> re-introducing the problem.
>>> 
>>> Maybe one potential approach is to run those potentially large metadata 
>>> table scans through the PreplanTable and PlanTable APIs. Just a quick 
>>> thought for now, I need to think a bit more about this.
>>> 
>>> Best,
>>> Jack Ye
>>> 
>>> 
>>> 
>>> 
>>> 
>>> On Wed, Jul 3, 2024 at 1:45 PM Yufei Gu <flyrain...@gmail.com> wrote:
>>>> Hi folks,
>>>> 
>>>> I'd like to discuss a new proposal to support server-side metadata tables.
>>>> 
>>>> One of Iceberg's most advantageous features is the ability to inspect a 
>>>> table using metadata tables. For instance, we can query snapshots just 
>>>> like we query data rows, using the following command:
>>>> 
>>>>   SELECT * FROM prod.db.table.snapshots;
>>>> 
>>>> With the REST catalog, we can simplify this process further by providing 
>>>> metadata directly from REST endpoints. Here are several benefits of this 
>>>> approach:
>>>> 
>>>> - Engine Independence: The metadata tables do not rely on a specific 
>>>>   implementation of an engine. The REST server returns the results 
>>>>   directly. For example, the Rust Iceberg does not need to implement its 
>>>>   own logic to query the snapshot table if it connects to a server with 
>>>>   this capability. This reduces the complexity and development effort 
>>>>   required for different clients and engines.
>>>> - Enables New Use Cases: A catalog UI or Lakehouse UI can present a 
>>>>   table's metadata (e.g., snapshot/partition list) without relying on an 
>>>>   engine like Trino. This opens up possibilities for lightweight UIs and 
>>>>   tools that can directly interact with the REST endpoints to retrieve and 
>>>>   display metadata.
>>>> - Enhanced Performance: With server-side caching, the server-side metadata 
>>>>   tables will perform better. Caching reduces the need to repeatedly 
>>>>   compute or retrieve metadata, leading to faster response times and 
>>>>   reduced load on the underlying storage systems.
>>>> 
>>>> Here is the proposal in a Google Doc: 
>>>> https://docs.google.com/document/d/1MVLwyMQtZ-7jewsQ0PuTvtJbpfl4HCoVdbowMqFTmfc/edit?usp=sharing
>>>> 
>>>> Estimated read time: 5 mins
>>>> 
>>>> Would really appreciate any feedback on this topic and proposal!
>>>> 
>>>> 
>>>> Yufei
