Hi All,

We had a productive meeting today regarding the Relative Paths proposal.
We've reached general agreement on the approach. The changes will involve
explicitly defining path terminology (such as "absolute location") and should
be well contained within a new section of the Table Spec. The next step is to
open a PR with the proposed changes, which may include knock-on effects for
the REST specification, such as updates to the register table and load table
requests.

Meeting notes:
https://docs.google.com/document/d/1t0RxrK-nsCT83zXeD66kmGx_TMU2X8_xfN1A_k6dCV0/edit?usp=sharing

Recording:
https://drive.google.com/file/d/11q65achM_3vCfaEVYsxmfAdbKQJb2drA/view?usp=sharing

Thanks, everyone.
Talat

On Fri, Aug 1, 2025 at 10:50 AM Wing Yew Poon <[email protected]> wrote:

> Dan,
> Thanks for the clarifications.
> Looking forward to the sync.
> - Wing Yew
>
> On Fri, Aug 1, 2025 at 8:43 AM Daniel Weeks <[email protected]> wrote:
>
>> Hey Wing Yew,
>>
>>> I see that you have been updating the Google doc containing the proposal.
>>
>> That's correct. I've been working with Talat to update the doc based on
>> feedback from the comments and the first round of discussion we had on
>> this topic.
>>
>>> Looking through it now, as far as I can tell, the basic idea (from the
>>> original proposal) of inferring the table location from the path to the
>>> current metadata.json has not changed. Is my reading correct?
>>
>> So far, nothing has changed about table location inference, but we will
>> probably be revisiting this with respect to other updates/clarifications.
>> There are still a couple of open comments related to this point, but it
>> is one of the main goals.
>>
>>> You have added clarification around how the path to the metadata is
>>> constructed from table location (from which the table location is thus
>>> reverse engineered) and around path relativization, but the original idea
>>> does not appear to have changed.
>>> In that case, the use case of having a single copy of metadata but more
>>> than one copy of data (two or more locations) is not supported by the
>>> proposal. This was the sticking point in the last sync to discuss the
>>> proposal.
>>
>> I don't believe this was the sticking point from the original discussion.
>> Having multiple copies/locations of the same data files under a single
>> table's management is explicitly a non-goal. It was discussed in the
>> comments of the doc for caching/fallback use cases, but I think that's
>> better handled by specific engine/fileio implementations.
>>
>> The main sticking points were confusion around the complexity of how
>> paths are constructed/persisted and the interplay between
>> table/metadata/data locations depending on how those values are set in
>> the table metadata. Based on that feedback, we're suggesting some changes,
>> which primarily consist of: 1) defining path construction, resolution,
>> and relativization separately, 2) making all paths relative to the table
>> location (which simplifies resolution/relativization), and 3) addressing
>> confusing/complex issues like path separators and expectations around
>> separators.
>>
>> We're still in the process of updating the document, but we will schedule
>> another sync to discuss these updates in detail and address a few points
>> that are still outstanding.
>>
>> Thanks,
>> Dan
>>
>> On Thu, Jul 31, 2025 at 5:47 PM Wing Yew Poon <[email protected]> wrote:
>>
>>> Hi Daniel Weeks,
>>> I see that you have been updating the Google doc containing the proposal.
>>> Looking through it now, as far as I can tell, the basic idea (from the
>>> original proposal) of inferring the table location from the path to the
>>> current metadata.json has not changed. Is my reading correct?
>>> You have added clarification around how the path to the metadata is
>>> constructed from table location (from which the table location is thus
>>> reverse engineered) and around path relativization, but the original idea
>>> does not appear to have changed. In that case, the use case of having a
>>> single copy of metadata but more than one copy of data (two or more
>>> locations) is not supported by the proposal. This was the sticking point
>>> in the last sync to discuss the proposal.
>>> Do you intend to have another sync to continue the discussion?
>>> Thanks,
>>> Wing Yew
>>>
>>> On Thu, Jul 10, 2025 at 4:41 PM Anurag Mantripragada <[email protected]> wrote:
>>>
>>>> Thanks Kevin, yes, I see the recording link too but don’t have access.
>>>> I have requested access.
>>>>
>>>> ~ Anurag Mantripragada
>>>>
>>>> On Jul 10, 2025, at 2:43 PM, Kevin Liu <[email protected]> wrote:
>>>>
>>>> Yes, it was recorded. Dan or Talat should have the recording. I see
>>>> there's already a link for the recording associated with the gcal event,
>>>> but I don't have access to it.
>>>>
>>>> Best,
>>>> Kevin Liu
>>>>
>>>> On Thu, Jul 10, 2025 at 12:37 PM Anurag Mantripragada <[email protected]> wrote:
>>>>
>>>>> Hey folks, was the sync recorded? I missed it due to calendar sync
>>>>> issues :(
>>>>>
>>>>> ~ Anurag Mantripragada
>>>>>
>>>>> On Jul 7, 2025, at 6:27 PM, ally heev <[email protected]> wrote:
>>>>>
>>>>> Thanks. I can see it now.
>>>>>
>>>>> On Tue, Jul 8, 2025 at 12:37 AM Kevin Liu <[email protected]> wrote:
>>>>>
>>>>>> I can see the new event on the dev calendar.
>>>>>> [image: Screenshot 2025-07-07 at 12.04.08 PM.png]
>>>>>>
>>>>>> Subscribe to the "Iceberg Dev Events" calendar here:
>>>>>> https://iceberg.apache.org/community/#iceberg-community-events
>>>>>>
>>>>>> Best,
>>>>>> Kevin Liu
>>>>>>
>>>>>> On Mon, Jul 7, 2025 at 11:38 AM Daniel Weeks <[email protected]> wrote:
>>>>>>
>>>>>>> Hey Ally (and everyone else),
>>>>>>>
>>>>>>> We hadn't scheduled the discussion for relative paths, but I just
>>>>>>> added an event to the dev calendar for Thursday at 9am (PT).
>>>>>>>
>>>>>>> Let me know if you still don't see it on the calendar.
>>>>>>>
>>>>>>> -Dan
>>>>>>>
>>>>>>> On Sat, Jul 5, 2025 at 9:37 PM Jean-Baptiste Onofré <[email protected]> wrote:
>>>>>>>
>>>>>>>> Hi Talat,
>>>>>>>>
>>>>>>>> Thanks for the update. I will do a new pass on the doc.
>>>>>>>>
>>>>>>>> Regards
>>>>>>>> JB
>>>>>>>>
>>>>>>>> On Wed, May 28, 2025 at 12:13 AM Talat Uyarer <[email protected]> wrote:
>>>>>>>> >
>>>>>>>> > Hi, Iceberg Community,
>>>>>>>> >
>>>>>>>> > As mentioned at the last sync, Dan and I have been working on a
>>>>>>>> > proposal to add support for relative paths, which has been a
>>>>>>>> > long-requested feature. There have been a number of
>>>>>>>> > discussions/proposals over the years, but we'd like to scope down
>>>>>>>> > and refocus effort to make some meaningful progress on this issue.
>>>>>>>> >
>>>>>>>> > Please take a look at the linked doc and provide feedback. We'd
>>>>>>>> > love to open up discussion on this topic at the next community
>>>>>>>> > sync, and we can hold one-off syncs on the topic if there's a lot
>>>>>>>> > of interest.
>>>>>>>> >
>>>>>>>> > You can access Iceberg's First V4 Spec change from here :)
>>>>>>>> >
>>>>>>>> > Proposal Issue: https://github.com/apache/iceberg/issues/13141
>>>>>>>> > Doc: https://s.apache.org/iceberg-spec-relative-path
>>>>>>>> >
>>>>>>>> > Talat
