0:32 (UTC+09:00)
Subject: Re: [DISCUSS] REST: Way to query if metadata pointer is the latest
Hi,
Thanks for offering help, JB! I think the REST spec related part of the
proposal is quite simple, but since this is the first time I touch the spec,
let me reach out if I have any questions.
fc9110.html#field.etag . Using the metadata
>> >> location is likely the simplest option. For reference, based on the
>> >> grammar, ETag values cannot include spaces. Therefore, if the metadata
>> >> location contains spaces, it may need to be encoded. The same goes for
ss objects should not be shared, as per the current design.
Thank you for considering my suggestions.
Best regards,
Taeyun
-Original Message-
From: "Yufei Gu"
To: ;
Cc:
Sent: 2025-01-04 (Sat) 10:21:17 (UTC+09:00)
Subject: Re: [DISCUSS] REST: Way to query if metadata pointer
values cannot include spaces. Therefore, if the metadata
>> location contains spaces, it may need to be encoded. The same goes for
>> double quotation marks. (I just found this out after looking it up.)
>> >> Anyway, in my opinion, the client must ignore any semantic meani
marks. (I just found this out after looking it up.)
> >> Anyway, in my opinion, the client must ignore any semantic meaning
> associated with the value.
> >>
> >> Thank you.
> >>
> >> -Original Message-
> >> From: "
ed. The same goes for double
>> quotation marks. (I just found this out after looking it up.)
>> Anyway, in my opinion, the client must ignore any semantic meaning
>> associated with the value.
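
The two points above (no spaces or double quotes inside an ETag, and clients treating the value as opaque) can be sketched as follows. This is a minimal illustration, assuming the server chooses to derive the ETag from the metadata location; `etag_from_metadata_location` and `is_fresh` are hypothetical names, not part of the REST spec or any Iceberg client.

```python
from typing import Optional
from urllib.parse import quote


def etag_from_metadata_location(location: str) -> str:
    """Server side (assumed design): build a strong ETag from a metadata location.

    Percent-encoding removes characters that the ETag grammar does not
    allow inside the quoted value, notably spaces and double quotes.
    """
    # Keep common URI punctuation readable; everything else is percent-encoded.
    encoded = quote(location, safe="/:@!$&'()*+,;=-._~")
    return f'"{encoded}"'


def is_fresh(cached_etag: Optional[str], server_etag: str) -> bool:
    """Client side: compare ETags as opaque strings, never parse them."""
    return cached_etag is not None and cached_etag == server_etag


if __name__ == "__main__":
    tag = etag_from_metadata_location('s3://bucket/my table/v1 "a".metadata.json')
    print(tag)
    print(is_fresh(tag, tag))
```

The client only ever tests equality, so the server remains free to switch to a hash or a version counter later without breaking callers.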
>>
>> Thank you.
>>
>> -----Original Message-----
>> F
nal Message-
> From: "Zoltán Borók-Nagy"
> To: ;
> Cc:
> Sent: 2024-11-22 (Fri) 19:57:08 (UTC+09:00)
> Subject: Re: [DISCUSS] REST: Way to query if metadata pointer is the latest
>
> Hi,
>
> Separate version information forces the clients to manage a Table
4-11-22 (Fri) 19:57:08 (UTC+09:00)
Subject: Re: [DISCUSS] REST: Way to query if metadata pointer is the latest
Hi,
Separate version information forces the clients to manage a Table ->
VersionIdentifier mapping which adds unnecessary complexity and can be
error-prone.
If the VersionIdentifier i
to an older REST catalog server that doesn’t support the new
>> freshness checking specification. The client may not have the authority to
>> upgrade the server. BTW, the RESTClient can determine that the server
>> doesn’t support freshness checks based on the absence of these HTTP
> - On ETag Content:
>
> The server is free to assign any value to the ETag. This means the client
> should not attempt to interpret the content of the ETag.
> As I mentioned before, if the REST catalog API uses ETags, it’s essential
> that no semantic meaning is attributed to their values
> > On the other hand, the proposed new function signature doesn’t seem to
> > provide a way for the caller to supply ETags (or equivalent identifiers
> > representing specific table versions for other catalog types). Is such
> > information intended to be embedded within the Table stru
> On the Proposal:
> > >
> > > I agree that the current function (loadTable(TableIdentifier)) cannot
> be freshness-aware. This is expected, as the caller doesn’t provide the
> version it holds, leaving the callee with no basis for comparison.
> > >
ther hand, the proposed new function signature doesn’t seem to
> > provide a way for the caller to supply ETags (or equivalent identifiers
> > representing specific table versions for other catalog types). Is such
> > information intended to be embedded within the Table structure?
To me, it seems clearer to explicitly provide such information (like ETags)
> rather than embedding it in the Table structure. That said, I might be
> misunderstanding the intention here.
>
> Thank you.
>
>
> -Original Message-
> From: "Gabor Kaszab"
> To: ;
>
I might be
misunderstanding the intention here.
Thank you.
-Original Message-
From: "Gabor Kaszab"
To: ;
Cc:
Sent: 2024-11-19 (Tue) 21:26:01 (UTC+09:00)
Subject: Re: [DISCUSS] REST: Way to query if metadata pointer is the latest
Hi,
Thanks for sharing your view, Taeyun! I think ther
turns None (= not
> updated). If the tag is None or does not match the latest table tag, the
> API returns a new (Table, tag) pair.
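
The tag-based contract described above can be sketched roughly as below. This is only an illustration under stated assumptions: `Table`, `InMemoryCatalog`, `commit`, and the `load_table(name, tag)` signature are hypothetical stand-ins, not the actual Iceberg catalog API.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class Table:
    """Hypothetical stand-in for a loaded table."""
    name: str
    metadata_location: str


class InMemoryCatalog:
    """Toy catalog that tracks one opaque tag per table."""

    def __init__(self) -> None:
        self._tables: Dict[str, Tuple[Table, str]] = {}

    def commit(self, table: Table, tag: str) -> None:
        # Each commit replaces the table and its opaque version tag.
        self._tables[table.name] = (table, tag)

    def load_table(
        self, name: str, tag: Optional[str] = None
    ) -> Optional[Tuple[Table, str]]:
        """Return None if the caller's tag is current, else a new (Table, tag) pair."""
        table, latest_tag = self._tables[name]
        if tag is not None and tag == latest_tag:
            return None  # not updated
        return table, latest_tag


if __name__ == "__main__":
    catalog = InMemoryCatalog()
    catalog.commit(Table("db.t", "v1.metadata.json"), "tag-1")
    table, tag = catalog.load_table("db.t")   # first load, no tag yet
    print(catalog.load_table("db.t", tag))    # unchanged
    catalog.commit(Table("db.t", "v2.metadata.json"), "tag-2")
    print(catalog.load_table("db.t", tag))    # stale tag, new pair returned
```

Because the caller passes the tag explicitly, it does not need to be embedded in the Table structure.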
>
> Thank you.
>
>
> -Original Message-
> From: "Zoltán Borók-Nagy"
> To: ;
> Cc:
> Sent: 2024-11-19 (Tue) 03:16:05
Cache-Control values in the examples above are intended to ensure that
> the client validates freshness with the server on every request. Writing
> the header in this extended format is primarily to accommodate outdated
> HTTP/1.1 implementations. However, under the HTTP/1.1 specificatio
PI returns a new (Table,
tag) pair.
Thank you.
-Original Message-
From: "Zoltán Borók-Nagy"
To: ;
Cc:
Sent: 2024-11-19 (Tue) 03:16:05 (UTC+09:00)
Subject: Re: [DISCUSS] REST: Way to query if metadata pointer is the latest
Hey Everyone,
Thanks Gábor, I think the proposed int
ce the Iceberg REST catalog server is effectively a type of HTTP
>> server, at least in theory, it may be expected to handle HTTP cache and
>> validation-related processes. The header approach can be seen as leveraging
>> this mechanism appropriately.
>> - The header approach doe
e} endpoint. It could also
> be applied to all GET-based endpoints, though this might broaden the scope
> significantly.
>
> Thank you.
>
>
>
> -Original Message-
> From: "Shani Elharrar"
> To: ;
> Cc: ;
> Sent: 2024-11-18 (Mon) 16:21:16 (UTC+
etching updated data only if there are modifications.
>
> What do you think about defining the spec in this direction?
>
> Thank you.
>
>
>
>
> -Original Message-
> From: "Yufei Gu" <flyrain...@gmail.com>
> To: <dev@iceberg.apache.org>;
> Cc:
> S
-Original Message-
From: "Yufei Gu" <flyrain...@gmail.com>
To: <dev@iceberg.apache.org>;
Cc:
Sent: 2024-11-16 (Sat) 02:51:05 (UTC+09:00)
Subject: Re: [DISCUSS] REST: Way to query if metadata pointer is the latest
How does HTTP caching handle desynchronized clocks between client
TTP includes the If-Modified-Since header and the
> 304 Not Modified status code. Using this approach, we could achieve data
> freshness with a single round-trip, fetching updated data only if there are
> modifications.
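
The single round-trip described above can be sketched as a server-side decision function. This is a minimal sketch, not taken from the REST spec: `conditional_get` and its parameters are hypothetical, and only the `If-Modified-Since` / `Last-Modified` / `304 Not Modified` semantics come from HTTP itself.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime
from typing import Dict, Optional, Tuple


def conditional_get(
    last_modified: datetime, body: bytes, if_modified_since: Optional[str]
) -> Tuple[int, Dict[str, str], bytes]:
    """Return (status, headers, body) for a GET with an optional If-Modified-Since.

    `last_modified` must be a timezone-aware UTC datetime.
    """
    # HTTP dates have one-second resolution, so drop sub-second precision.
    last_modified = last_modified.replace(microsecond=0)
    headers = {"Last-Modified": format_datetime(last_modified, usegmt=True)}

    if if_modified_since is not None:
        try:
            since = parsedate_to_datetime(if_modified_since)
        except (TypeError, ValueError):
            since = None  # an unparseable date is ignored, as if absent
        if since is not None:
            if since.tzinfo is None:
                since = since.replace(tzinfo=timezone.utc)
            if last_modified <= since:
                return 304, headers, b""  # client copy is still current

    return 200, headers, body  # send the full, updated representation


if __name__ == "__main__":
    lm = datetime(2024, 11, 12, 18, 43, 24, tzinfo=timezone.utc)
    status, headers, _ = conditional_get(lm, b"metadata", None)
    print(status, headers["Last-Modified"])
    print(conditional_get(lm, b"metadata", headers["Last-Modified"])[0])
```

The client echoes back the `Last-Modified` value it received, so the comparison uses only server-generated timestamps.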
>
> What do you think about defining the spec in this direction?
is function seems to serve a
>> different purpose.
>> >
>> > Here is my suggestion:
>> >
>> > Since HTTP has built-in caching features (
>> https://developer.mozilla.org/en-US/docs/Web/HTTP/Caching), and REST
>> catalogs operate over HTTP, it seems
ed data only if there are
> modifications.
> >
> > What do you think about defining the spec in this direction?
> >
> > Thank you.
> >
> >
> >
> >
> > -Original Message-
> > From: "Yufei Gu"
> > To: ;
> > Cc:
is direction?
>
> Thank you.
>
>
>
>
> -----Original Message-----
> From: "Yufei Gu"
> To: ;
> Cc:
> Sent: 2024-11-13 (Wed) 03:43:24 (UTC+09:00)
> Subject: Re: [DISCUSS] REST: Way to query if metadata pointer is the latest
>
>
>
> Hi Gamber,
>
le round-trip, fetching updated data only if there are
> modifications.
>
> What do you think about defining the spec in this direction?
>
> Thank you.
>
>
>
>
> -Original Message-
> From: "Yufei Gu"
> To: ;
> Cc:
> Sent: 2024-11-13 (Wed) 03:43:24 (
3 (Wed) 03:43:24 (UTC+09:00)
Subject: Re: [DISCUSS] REST: Way to query if metadata pointer is the latest
Hi Gamber,
Thanks for the proposal! Impala isn’t unique in needing this—I've seen similar
requirements from other engines.
As others pointed out, using the “tableExists” endpoint see
Hi Gamber,
Thanks for the proposal! Impala isn’t unique in needing this—I've seen
similar requirements from other engines.
As others pointed out, using the “tableExists” endpoint seems like a
workaround. I don't consider it a permanent way forward. We could address
this by either modifying the cu
Hi Fokko
I like the idea, but I think it's more a workaround and could be
confusing for users :)
Regards
JB
On Tue, Nov 12, 2024 at 2:53 PM Fokko Driesprong wrote:
>
> Hey Gabor,
>
> Thanks for raising this. While reading this, my first thought is to leverage
> the `tableExists` operation:
> h
Hi Gabor,
I'm going to propose something that does not quite align with your idea,
so please bear with me.
Both options in your email (A and B) assume that the engine is going to make
freshness decisions based on the location of metadata. I see some
conceptual rough edges here.
A change in metad
Thanks for the answers so far!
Fokko, I think your suggestion makes sense, however, I feel that a
'tableExists' call returning the metadata path is kind of a side effect of
an operation and not something users would expect. Having an 'isLatest' or
'metadataLocation' operation seems cleaner and mor
Hey Gabor,
Thanks for raising this. While reading this, my first thought is to
leverage the `tableExists` operation:
https://github.com/apache/iceberg/blob/e3f39972863f891481ad9f5a559ffef093976bd7/open-api/rest-catalog-open-api.yaml#L1129-L1160
This doesn't return anything today, but we could ret
I recommend option (b), provided there is no partial metadata loading. We
implemented option (b) internally to facilitate partial metadata loading, as we
have tables with hundreds of thousands of snapshots. This results in metadata
that occupies approximately 500 MB in memory (excluding the Json
Hi Gabor,
I think it's a bit related to the discussion about "partial metadata
retrieval" we have (as you said).
We don't yet have a consensus about this discussion and it's a pretty
large proposal.
I have a preference for isLatest() as it doesn't overlap with
filtering table metadata (that we ca