Re: [DISCUSS] REST: Way to query if metadata pointer is the latest

2025-01-07 Thread Taeyun Kim
0:32 (UTC+09:00) Subject: Re: [DISCUSS] REST: Way to query if metadata pointer is the latest Hi, Thanks for offering help, JB! I think the REST-spec-related part of the proposal is quite simple, but since this is the first time I have touched the spec, I will reach out if I have any questions.

Re: [DISCUSS] REST: Way to query if metadata pointer is the latest

2025-01-06 Thread Jean-Baptiste Onofré
fc9110.html#field.etag . Using the metadata >> >> location is likely the simplest option. For reference, based on the >> >> grammar, ETag values cannot include spaces. Therefore, if the metadata >> >> location contains spaces, it may need to be encoded. The same goes for

Re: [DISCUSS] REST: Way to query if metadata pointer is the latest

2025-01-05 Thread Taeyun Kim
ss objects should not be shared, as per the current design. Thank you for considering my suggestions. Best regards, Taeyun -----Original Message----- From: "Yufei Gu" To: ; Cc: Sent: 2025-01-04 (Sat) 10:21:17 (UTC+09:00) Subject: Re: [DISCUSS] REST: Way to query if metadata pointer

Re: [DISCUSS] REST: Way to query if metadata pointer is the latest

2025-01-03 Thread Yufei Gu
values cannot include spaces. Therefore, if the metadata >> location contains spaces, it may need to be encoded. The same goes for >> double quotation marks. (I just found this out after looking it up.) >> >> Anyway, in my opinion, the client must ignore any semantic meani
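
As an illustration of the encoding concern quoted in this message, here is a minimal Python sketch that turns a metadata location into an ETag value by percent-encoding the characters the ETag grammar excludes (spaces and double quotes among them). The function name `etag_from_metadata_location` and the sample location are hypothetical and are not taken from the REST catalog spec; a server could just as well use a hash or any other opaque value.

```python
from urllib.parse import quote


def etag_from_metadata_location(location: str) -> str:
    """Return a quoted, opaque ETag value for a metadata location.

    Hypothetical helper: percent-encoding removes spaces and double
    quotes, which are not allowed inside the quoted ETag value.
    """
    # "/" and ":" stay readable; every other character outside the
    # unreserved set (including ' ' and '"') becomes %XX.
    encoded = quote(location, safe="/:")
    return f'"{encoded}"'


# Hypothetical metadata location containing a space and double quotes.
location = 's3://bucket/my table/metadata/v2 "final".metadata.json'
etag = etag_from_metadata_location(location)
print(etag)
```

A client would store this value and send it back unchanged, comparing it only for equality and never parsing it, which matches the view in the thread that no semantic meaning should be attached to the value.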

Re: [DISCUSS] REST: Way to query if metadata pointer is the latest

2024-12-19 Thread Gabor Kaszab
marks. (I just found this out after looking it up.) > >> Anyway, in my opinion, the client must ignore any semantic meaning > associated with the value. > >> > >> Thank you. > >> > >> -----Original Message----- > >> From: "

Re: [DISCUSS] REST: Way to query if metadata pointer is the latest

2024-12-12 Thread Jean-Baptiste Onofré
ed. The same goes for double >> quotation marks. (I just found this out after looking it up.) >> Anyway, in my opinion, the client must ignore any semantic meaning >> associated with the value. >> >> Thank you. >> >> -----Original Message----- >> F

Re: [DISCUSS] REST: Way to query if metadata pointer is the latest

2024-12-12 Thread Gabor Kaszab
nal Message- > From: "Zoltán Borók-Nagy" > To: ; > Cc: > Sent: 2024-11-22 (Fri) 19:57:08 (UTC+09:00) > Subject: Re: [DISCUSS] REST: Way to query if metadata pointer is the latest > > Hi, > > Separate version information forces the clients to manage a Table

Re: [DISCUSS] REST: Way to query if metadata pointer is the latest

2024-11-22 Thread Taeyun Kim
4-11-22 (Fri) 19:57:08 (UTC+09:00) Subject: Re: [DISCUSS] REST: Way to query if metadata pointer is the latest Hi, Separate version information forces the clients to manage a Table -> VersionIdentifier mapping, which adds unnecessary complexity and can be error-prone. If the VersionIdentifier i

Re: [DISCUSS] REST: Way to query if metadata pointer is the latest

2024-11-22 Thread Zoltán Borók-Nagy
to an older REST catalog server that doesn’t support the new >> freshness checking specification. The client may not have the authority to >> upgrade the server. BTW, the RESTClient can determine that the server >> doesn’t support freshness checks based on the absence of these HTTP
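
The snippet above is cut off, so the following Python sketch rests on an assumption: that "the absence of these HTTP" refers to response headers such as `ETag`. Under that assumption, a client could detect an older server and fall back to a full reload. The function names and the strategy strings are hypothetical.

```python
from typing import Mapping


def server_supports_freshness_checks(response_headers: Mapping[str, str]) -> bool:
    """Guess support from the presence of an ETag response header (assumption)."""
    # HTTP header names are case-insensitive, so compare in lower case.
    return any(name.lower() == "etag" for name in response_headers)


def choose_refresh_strategy(response_headers: Mapping[str, str]) -> str:
    """Pick a hypothetical refresh strategy for later requests."""
    if server_supports_freshness_checks(response_headers):
        return "conditional-request"
    # Older server: no validator available, so always reload the table.
    return "full-reload"


print(choose_refresh_strategy({"Content-Type": "application/json", "etag": '"abc"'}))
print(choose_refresh_strategy({"Content-Type": "application/json"}))
```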

Re: [DISCUSS] REST: Way to query if metadata pointer is the latest

2024-11-22 Thread Gabor Kaszab
- On ETag Content: > > The server is free to assign any value to the ETag. This means the client > should not attempt to interpret the content of the ETag. > As I mentioned before, if the REST catalog API uses ETags, it’s essential > that no semantic meaning is attributed to their values

Re: [DISCUSS] REST: Way to query if metadata pointer is the latest

2024-11-21 Thread Taeyun Kim
> On the other hand, the proposed new function signature doesn’t seem to > > provide a way for the caller to supply ETags (or equivalent identifiers > > representing specific table versions for other catalog types). Is such > > information intended to be embedded within the Table stru

Re: [DISCUSS] REST: Way to query if metadata pointer is the latest

2024-11-21 Thread Gabor Kaszab
> On the Proposal: > > > > > > I agree that the current function (loadTable(TableIdentifier)) cannot > be freshness-aware. This is expected, as the caller doesn’t provide the > version it holds, leaving the callee with no basis for comparison. > > >

Re: [DISCUSS] REST: Way to query if metadata pointer is the latest

2024-11-21 Thread Zoltán Borók-Nagy
ther hand, the proposed new function signature doesn’t seem to > > provide a way for the caller to supply ETags (or equivalent identifiers > > representing specific table versions for other catalog types). Is such > > information intended to be embedded within the Table structure?

Re: [DISCUSS] REST: Way to query if metadata pointer is the latest

2024-11-21 Thread Zoltán Borók-Nagy
To me, it seems clearer to explicitly provide such information (like ETags) > rather than embedding it in the Table structure. That said, I might be > misunderstanding the intention here. > > Thank you. > > > -----Original Message----- > From: "Gabor Kaszab" > To: ; >

Re: [DISCUSS] REST: Way to query if metadata pointer is the latest

2024-11-20 Thread Taeyun Kim
I might be misunderstanding the intention here. Thank you. -----Original Message----- From: "Gabor Kaszab" To: ; Cc: Sent: 2024-11-19 (Tue) 21:26:01 (UTC+09:00) Subject: Re: [DISCUSS] REST: Way to query if metadata pointer is the latest Hi, Thanks for sharing your view, Taeyun! I think ther

Re: [DISCUSS] REST: Way to query if metadata pointer is the latest

2024-11-19 Thread Gabor Kaszab
turns None (= not > updated). If the tag is None or does not match the latest table tag, the > API returns a new (Table, tag) pair. > > Thank you. > > > -----Original Message----- > From: "Zoltán Borók-Nagy" > To: ; > Cc: > Sent: 2024-11-19 (Tue) 03:16:05
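
The behaviour quoted at the start of this snippet can be sketched in Python as follows. This is only an illustration of the described contract (return `None` when the caller's tag is current, otherwise a new `(Table, tag)` pair); the class `InMemoryCatalog`, the `Table` stand-in, and the method name `load_table_if_changed` are all hypothetical and are not part of any Iceberg API.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class Table:
    """Hypothetical stand-in for a loaded table."""
    name: str
    metadata_location: str


class InMemoryCatalog:
    """Toy catalog that keeps the latest (Table, tag) pair per table name."""

    def __init__(self) -> None:
        self._tables: Dict[str, Tuple[Table, str]] = {}

    def commit(self, table: Table, tag: str) -> None:
        self._tables[table.name] = (table, tag)

    def load_table_if_changed(
        self, name: str, tag: Optional[str]
    ) -> Optional[Tuple[Table, str]]:
        latest_table, latest_tag = self._tables[name]
        if tag is not None and tag == latest_tag:
            return None  # not updated: the caller's copy is still current
        # No tag supplied, or the tag is stale: return a fresh pair.
        return latest_table, latest_tag


catalog = InMemoryCatalog()
catalog.commit(Table("db.events", "s3://bucket/db/events/v1.metadata.json"), tag="v1")

first = catalog.load_table_if_changed("db.events", tag=None)
unchanged = catalog.load_table_if_changed("db.events", tag="v1")

catalog.commit(Table("db.events", "s3://bucket/db/events/v2.metadata.json"), tag="v2")
refreshed = catalog.load_table_if_changed("db.events", tag="v1")
```

The tag is compared only for equality, so the caller never needs to know how the catalog produced it.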

Re: [DISCUSS] REST: Way to query if metadata pointer is the latest

2024-11-18 Thread Yufei Gu
Cache-Control values in the examples above are intended to ensure that > the client validates freshness with the server on every request. Writing > the header in this extended format is primarily to accommodate outdated > HTTP/1.1 implementations. However, under the HTTP/1.1 specificatio

Re: [DISCUSS] REST: Way to query if metadata pointer is the latest

2024-11-18 Thread Taeyun Kim
PI returns a new (Table, tag) pair. Thank you. -----Original Message----- From: "Zoltán Borók-Nagy" To: ; Cc: Sent: 2024-11-19 (Tue) 03:16:05 (UTC+09:00) Subject: Re: [DISCUSS] REST: Way to query if metadata pointer is the latest Hey Everyone, Thanks Gábor, I think the proposed int

Re: [DISCUSS] REST: Way to query if metadata pointer is the latest

2024-11-18 Thread Zoltán Borók-Nagy
ce the Iceberg REST catalog server is effectively a type of HTTP >> server, at least in theory, it may be expected to handle HTTP cache and >> validation-related processes. The header approach can be seen as leveraging >> this mechanism appropriately. >> - The header approach doe

Re: [DISCUSS] REST: Way to query if metadata pointer is the latest

2024-11-18 Thread Gabor Kaszab
e} endpoint. It could also > be applied to all GET-based endpoints, though this might broaden the scope > significantly. > > Thank you. > > > > -----Original Message----- > From: "Shani Elharrar" > To: ; > Cc: ; > Sent: 2024-11-18 (Mon) 16:21:16 (UTC+

Re: [DISCUSS] REST: Way to query if metadata pointer is the latest

2024-11-17 Thread Shani Elharrar
etching updated data only if there are modifications. > > What do you think about defining the spec in this direction? > > Thank you. > > > > > -----Original Message----- > From: "Yufei Gu" <flyrain...@gmail.com> > To: <dev@iceberg.apache.org>; > Cc: > S

Re: [DISCUSS] REST: Way to query if metadata pointer is the latest

2024-11-17 Thread Taeyun Kim
do you think about defining the spec in this direction? > > Thank you. > > > > > -----Original Message----- > From: "Yufei Gu" <flyrain...@gmail.com> > To: <dev@iceberg.apache.org>; > Cc: >

Re: [DISCUSS] REST: Way to query if metadata pointer is the latest

2024-11-17 Thread Shani Elharrar
-----Original Message----- From: "Yufei Gu" <flyrain...@gmail.com> To: <dev@iceberg.apache.org>; Cc: Sent: 2024-11-16 (Sat) 02:51:05 (UTC+09:00) Subject: Re: [DISCUSS] REST: Way to query if metadata pointer is the latest How does HTTP caching handle desynchronized clocks between client

Re: [DISCUSS] REST: Way to query if metadata pointer is the latest

2024-11-17 Thread Taeyun Kim
TTP includes the If-Modified-Since header and the > 304 Not Modified status code. Using this approach, we could achieve data > freshness with a single round-trip, fetching updated data only if there are > modifications. > > What do you think about defining the spec in this direction?
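
The single round-trip described here can be sketched as a server-side check in Python. This is a simplified model, not an actual REST catalog handler: the function `handle_get`, its parameters, and the sample body are hypothetical. It returns 304 with no body when the resource has not changed since the time the client supplies, and 200 with the body otherwise. The same flow works with an `ETag` validator and the `If-None-Match` request header.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime
from typing import Dict, Optional, Tuple


def handle_get(
    request_headers: Dict[str, str], body: str, last_modified: datetime
) -> Tuple[int, Dict[str, str], Optional[str]]:
    """Return (status, response headers, body) for a conditional GET."""
    # HTTP dates have one-second resolution and are expressed in GMT.
    last_modified = last_modified.astimezone(timezone.utc).replace(microsecond=0)
    headers = {"Last-Modified": format_datetime(last_modified, usegmt=True)}

    since_raw = request_headers.get("If-Modified-Since")
    if since_raw is not None:
        try:
            since = parsedate_to_datetime(since_raw)
        except (TypeError, ValueError):
            since = None  # an unparseable date is ignored
        if since is not None:
            if since.tzinfo is None:
                since = since.replace(tzinfo=timezone.utc)
            if last_modified <= since:
                return 304, headers, None  # Not Modified: no body is sent
    return 200, headers, body


modified_at = datetime(2024, 11, 17, 12, 0, 0, tzinfo=timezone.utc)
status1, headers1, body1 = handle_get({}, "table-metadata", modified_at)

# The client replays the Last-Modified value it received.
conditional = {"If-Modified-Since": headers1["Last-Modified"]}
status2, _, body2 = handle_get(conditional, "table-metadata", modified_at)

# After a later commit the same conditional request gets the new body.
later = datetime(2024, 11, 17, 12, 5, 0, tzinfo=timezone.utc)
status3, _, body3 = handle_get(conditional, "new-table-metadata", later)
```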

Re: [DISCUSS] REST: Way to query if metadata pointer is the latest

2024-11-15 Thread Yufei Gu
is function seems to serve a >> different purpose. >> > >> > Here is my suggestion: >> > >> > Since HTTP has built-in caching features ( >> https://developer.mozilla.org/en-US/docs/Web/HTTP/Caching), and REST >> catalogs operate over HTTP, it seems

Re: [DISCUSS] REST: Way to query if metadata pointer is the latest

2024-11-15 Thread Gabor Kaszab
ed data only if there are > modifications. > > > > What do you think about defining the spec in this direction? > > > > Thank you. > > > > > > > > > > -----Original Message----- > > From: "Yufei Gu" > > To: ; > > Cc:

Re: [DISCUSS] REST: Way to query if metadata pointer is the latest

2024-11-15 Thread Jean-Baptiste Onofré
is direction? > > Thank you. > > > > > -----Original Message----- > From: "Yufei Gu" > To: ; > Cc: > Sent: 2024-11-13 (Wed) 03:43:24 (UTC+09:00) > Subject: Re: [DISCUSS] REST: Way to query if metadata pointer is the latest > > > > Hi Gamber, >

Re: [DISCUSS] REST: Way to query if metadata pointer is the latest

2024-11-15 Thread Fokko Driesprong
le round-trip, fetching updated data only if there are > modifications. > > What do you think about defining the spec in this direction? > > Thank you. > > > > > -----Original Message----- > From: "Yufei Gu" > To: ; > Cc: > Sent: 2024-11-13 (Wed) 03:43:24 (

Re: [DISCUSS] REST: Way to query if metadata pointer is the latest

2024-11-14 Thread Taeyun Kim
3 (Wed) 03:43:24 (UTC+09:00) Subject: Re: [DISCUSS] REST: Way to query if metadata pointer is the latest Hi Gamber, Thanks for the proposal! Impala isn’t unique in needing this; I've seen similar requirements from other engines. As others pointed out, using the “tableExists” endpoint see

Re: [DISCUSS] REST: Way to query if metadata pointer is the latest

2024-11-12 Thread Yufei Gu
Hi Gamber, Thanks for the proposal! Impala isn’t unique in needing this; I've seen similar requirements from other engines. As others pointed out, using the “tableExists” endpoint seems like a workaround. I don't consider it a permanent way forward. We could address this by either modifying the cu

Re: [DISCUSS] REST: Way to query if metadata pointer is the latest

2024-11-12 Thread Jean-Baptiste Onofré
Hi Fokko, I like the idea, but I think it's more of a workaround and could be confusing for users :) Regards JB On Tue, Nov 12, 2024 at 2:53 PM Fokko Driesprong wrote: > > Hey Gabor, > > Thanks for raising this. While reading this, my first thought is to leverage > the `tableExists` operation: > h

Re: [DISCUSS] REST: Way to query if metadata pointer is the latest

2024-11-12 Thread Dmitri Bourlatchkov
Hi Gabor, I'm going to propose something that does not quite align with your idea, so please bear with me. Both options in your email (A and B) assume that the engine is going to make freshness decisions based on the location of metadata. I see some conceptual rough edges here. A change in metad

Re: [DISCUSS] REST: Way to query if metadata pointer is the latest

2024-11-12 Thread Gabor Kaszab
Thanks for the answers so far! Fokko, I think your suggestion makes sense; however, I feel that a 'tableExists' call returning the metadata path is kind of a side effect of an operation and not something users would expect. Having an 'isLatest' or 'metadataLocation' operation seems cleaner and mor

Re: [DISCUSS] REST: Way to query if metadata pointer is the latest

2024-11-12 Thread Fokko Driesprong
Hey Gabor, Thanks for raising this. While reading this, my first thought is to leverage the `tableExists` operation: https://github.com/apache/iceberg/blob/e3f39972863f891481ad9f5a559ffef093976bd7/open-api/rest-catalog-open-api.yaml#L1129-L1160 This doesn't return anything today, but we could ret

Re: [DISCUSS] REST: Way to query if metadata pointer is the latest

2024-11-12 Thread Shani Elharrar
I recommend option (b), provided there is no partial metadata loading. We implemented option (b) internally to facilitate partial metadata loading, as we have tables with hundreds of thousands of snapshots. This results in metadata that occupies approximately 500 MB in memory (excluding the Json

Re: [DISCUSS] REST: Way to query if metadata pointer is the latest

2024-11-12 Thread Jean-Baptiste Onofré
Hi Gabor, I think it's a bit related to the discussion about "partial metadata retrieval" we have (as you said). We don't yet have a consensus about this discussion and it's a pretty large proposal. I have a preference for isLatest() as it doesn't overlap with filtering table metadata (that we ca