Hi All,

I want to summarize the thread here:

For generic tables, we will add a `location` key to help cross-engine sharing and future support for credential vending. Here is a description of the `location` key and the corresponding restrictions and responsibilities:

- `location` (OPTIONAL): the table root location in URI format, for example s3://<my-bucket>/path/to/table.
- The table root location is a location that includes all files for the table.
- Clients (engines) are responsible for making sure all files are written under the configured location.
- A table with multiple root locations (i.e. containing files that are outside the configured root location) is not compliant with the current generic table support in Polaris.
- No two tables can have the same or overlapping locations; otherwise, a ForbiddenException will be thrown on creation.
- If no location is provided, clients or users are responsible for managing the location and location-related concerns such as path conflict checks.
- The location configuration cannot be updated once the table is created.

This description will be added to the spec. To help non-API users discover the information easily, we will also add a site page describing the support for Generic Tables and their key fields.

Best Regards,
Yun

On Mon, May 19, 2025 at 11:16 PM yun zou <yunzou.colost...@gmail.com> wrote:

> Hi Dmitri,
>
> "I do not think those doc comments provide enough visibility to ensure that the key information is received by users, unless they are dealing directly with the API"
> -- Yeah, I agree that information may not be visible enough for users who don't work directly with the APIs.
> However, I think having one page just for "location" might be a little bit overkill. Given that generic table API support is a new catalog capability that Polaris added and is not part of IRC, I think it might be worth having a more general page to describe the Polaris Generic Table support and some of the critical fields like *location*.
> I think we should have the description in the spec also, so that things are clear for API users.
>
> Please let me know what you think.
>
> Best Regards,
> Yun
>
> On Mon, May 19, 2025 at 4:22 PM Dmitri Bourlatchkov <di...@apache.org> wrote:
>
>> I believe the Open API spec and the definition of "location" are slightly different concerns.
>>
>> The former is about the API used to obtain information about Generic Tables.
>>
>> The latter is about the interpretation of that information. One can think of the location value being handled / transferred beyond the immediate Polaris client, in which case it loses its connection to the API, but does not lose its meaning as the location of a Generic Table.
>>
>> Also, I think that Open API doc comments are too low-level and too obscure for people who will work with processing actual Generic Table files. I do not think those doc comments provide enough visibility to ensure that the key information is received by users, unless they are dealing directly with the API.
>>
>> That said, if you prefer to keep the finer points about Generic Table locations in the Open API spec, I'd be fine with that.
>>
>> Cheers,
>> Dmitri.
>>
>> On Mon, May 19, 2025 at 6:46 PM yun zou <yunzou.colost...@gmail.com> wrote:
>>
>> > Hi Dmitri,
>> >
>> > Thanks for the detailed explanation. I definitely agree we need to call out those restrictions and compliance rules in our spec.
>> >
>> > As for the documentation, Polaris today already publishes the API spec: if you go to https://polaris.apache.org/in-dev/unreleased/ and click on the Catalog API Spec, it will lead you to the published spec, which contains all the descriptions from the spec.
>> > That basically means we have both the published doc and the spec code, and the single source of truth is the description in the doc.
>> > Or do you think we should have an extra page for the Generic Table API spec?
>> >
>> > Best Regards,
>> > Yun
>> >
>> > On Mon, May 19, 2025 at 3:20 PM Yufei Gu <flyrain...@gmail.com> wrote:
>> >
>> > > > * Clients (engines) are responsible for writing files only under the specified location.
>> > >
>> > > It's nice to have a doc like that. But the Open API spec is *the* place to define the behavior of client and server, and how they interact with each other. As we said before, a spec change is recommended to have an ML discussion.
>> > >
>> > > > * A table, whose files exist outside the declared location, is not compliant with the Polaris' definition for a Generic Table.
>> > >
>> > > I'm not sure we should go that far. "location" is an optional field. It's just that some features like credential vending don't work if "location" is missing.
>> > >
>> > > Yufei
>> > >
>> > > On Mon, May 19, 2025 at 2:59 PM Dmitri Bourlatchkov <di...@apache.org> wrote:
>> > >
>> > > > As I commented in my other recent email, I think by introducing a "location" property Polaris enters the realm of table format specs.
>> > > >
>> > > > This is fine, from my POV. However, since Polaris is the defining project behind that property, I believe Polaris should provide a more definitive description of the meaning and intended processing of that property.
>> > > >
>> > > > To repeat myself, I think the Open API spec defines only the API for obtaining the location. We need a place to define what this location means. I do not insist on calling this a "spec" for Generic Tables, but I think it deserves a separate page in Polaris docs, where it would be defined with more rigor.
>> > > >
>> > > > Specifically, I think we need to call out that:
>> > > > * The location is a base URI (essentially prefix) for all files in a generic table.
>> > > > * Clients (engines) are responsible for writing files only under the specified location.
>> > > > * A table, whose files exist outside the declared location, is not compliant with the Polaris' definition for a Generic Table.
>> > > >
>> > > > By extension, I think we ought to describe other existing properties too.
>> > > >
>> > > > WDYT?
>> > > >
>> > > > Thanks,
>> > > > Dmitri.
>> > > >
>> > > > On Mon, May 19, 2025 at 5:39 PM yun zou <yunzou.colost...@gmail.com> wrote:
>> > > >
>> > > > > Hi Dmitri,
>> > > > >
>> > > > > I think for Iceberg, we all agreed that there can be multiple locations, and I definitely agree with Russell that the extension should be done with the IRC endpoints. The Generic Table APIs are designed for non-Iceberg table usage today, and we still want Iceberg table usage to go through the IRC endpoint to have full IRC support.
>> > > > >
>> > > > > As for the following point:
>> > > > > "a more strict spec for that (define where files should and should not go)"
>> > > > > Do you mean that Polaris needs to generate a location for the table to use? If that is the case, I don't think engines respect that today. The table locations are either generated by the engine or specified by the user.
>> > > > > Or do you mean that, like Iceberg, we should have an allowed location and validate that the table location is under the allowed location? Would you mind elaborating more on this point?
>> > > > >
>> > > > > Best Regards,
>> > > > > Yun
>> > > > >
>> > > > > On Mon, May 19, 2025 at 1:45 PM Russell Spitzer <russell.spit...@gmail.com> wrote:
>> > > > >
>> > > > > > Yeah, I think Iceberg and Hive are the only ones trying to make life difficult. I think we should cover that as well, but in changes to the Iceberg spec. Hive can just stay how it is ...
>> > > > > >
>> > > > > > On Mon, May 19, 2025 at 2:59 PM Dmitri Bourlatchkov <di...@apache.org> wrote:
>> > > > > >
>> > > > > > > For context: my location concerns are rooted in Nessie's experience, where we often get problem reports related to files being outside the declared Iceberg metadata location.
>> > > > > > >
>> > > > > > > Example:
>> > > > > > > https://github.com/projectnessie/nessie/issues/10817#issuecomment-2887329227
>> > > > > > >
>> > > > > > > I'm ok going with a single location for generic tables, but I think Polaris needs to have a more strict spec for that (define where files should and should not go) because Polaris owns this spec. Polaris ought to define what complies with the spec and what does not. Having a proper spec is essential to ensure a mutual understanding among all parties dealing with Generic Tables.
>> > > > > > >
>> > > > > > > Open API yaml comments are not sufficient, IMHO. I'd prefer to have a dedicated doc page to define expectations and compliance.
>> > > > > > >
>> > > > > > > Thanks,
>> > > > > > > Dmitri.
>> > > > > > >
>> > > > > > > On Mon, May 19, 2025 at 2:17 PM Russell Spitzer <russell.spit...@gmail.com> wrote:
>> > > > > > >
>> > > > > > > > The only multiple-location table formats I'm currently aware of are Hive (partitions can live wherever) and Iceberg.
>> > > > > > > >
>> > > > > > > > I think for Delta, Hudi, LanceDB, Paimon and file-based tables, they all have to live in the root location. I'm not sure of any other "file" based tables where this would be an issue, but I'd love to know if someone else has ideas. I think with the rise in credential vending, splitting things amongst multiple prefixes is becoming less common. I don't oppose doing an array of locations, but it may be enough to just leave this as an extension for later. (Support location or locations)
>> > > > > > > >
>> > > > > > > > On Wed, May 7, 2025 at 8:52 PM yun zou <yunzou.colost...@gmail.com> wrote:
>> > > > > > > >
>> > > > > > > > > Hi Dmitri,
>> > > > > > > > >
>> > > > > > > > > If it's not "all", it is not strong enough for a spec, IMHO. If some tables have multiple base locations how is Polaris going to deal with them?
>> > > > > > > > >
>> > > > > > > > > Sorry, when I say most of them, it was because I haven't tested all of them (I only tested Delta and CSV before).
>> > > > > > > > > However, if Unity Catalog is only taking one location, I think that is strong enough proof that one location is enough today.
>> > > > > > > > >
>> > > > > > > > > It is also more natural to start with one location, and if there are use cases that require support for multiple locations later, we can move on to a V2 spec to support multiple table locations.
>> > > > > > > > >
>> > > > > > > > > We're making a specification for Polaris. I do not think it is sufficient to say we'll do the same as other (unspecified ATM) catalogs.
>> > > > > > > > > If we want to migrate users from other Catalog services to Polaris (through federation), then Polaris will need to provide corresponding capabilities. For example, the Unity Catalog storage location is a URI representation; when entities are federated from Unity Catalog, we will need to be able to handle the URI location.
>> > > > > > > > > If URI representation is a common standard that has been accepted by other Catalog services like Unity Catalog and Gravitino, Polaris should be compatible with that; otherwise it might cause problems for users when they are migrating from one to another.
>> > > > > > > > >
>> > > > > > > > > What will Polaris Server do with this location?
>> > > > > > > > > For generic tables, Polaris will provide credential vending for this location in the near future. I don't see us providing anything else in the short or mid term, since we still want to promote native support for Iceberg.
>> > > > > > > > > Or do you have anything special in mind that you think we should support?
>> > > > > > > > >
>> > > > > > > > > If Polaris has to define it in a spec, it will be hard to change in the future.
>> > > > > > > > > Regardless of whether it is explicitly in the spec definition or a reserved property key, as long as it is explicitly documented, it will be hard to change in the future. From that perspective, those two approaches seem the same to me.
>> > > > > > > > >
>> > > > > > > > > Table location is critical information that is required on the engine side to read and write the tables, and it should be explicitly defined to provide better sharing across engines. For example, the Delta table location is passed in the table properties with a property key of either "location" or "path", depending on how the table is created. Now, if another engine wants to read the Delta table, it will need to understand those keys, which are controlled by Spark today. If Spark changes them one day, all sharing will stop working.
>> > > > > > > > >
>> > > > > > > > > As to whether we want to make it an explicit field or a reserved key, I think for a field common to various table formats, it makes more sense to have it as an explicit field. For properties that are specific to a particular table format, it is more appropriate to just have a reserved key.
>> > > > > > > > >
>> > > > > > > > > If Polaris takes control of the location, I think we have to be more careful and at least try to make it future-proof.
>> > > > > > > > >
>> > > > > > > > > I don't think Polaris is taking control of the location; like table names, the location is still controlled by the engine and users today.
>> > > > > > > > > Polaris is a Catalog service: it records the generic table entity and returns the information back to the user on query.
>> > > > > > > > > It might be able to do some validation on the location (like checking for special characters), but it doesn't decide which location the table will use. I personally don't think it is a bad idea to let the Catalog service also take control of generating the table location, but I think that will require a lot of work.
>> > > > > > > > >
>> > > > > > > > > Best Regards,
>> > > > > > > > > Yun
>> > > > > > > > >
>> > > > > > > > > On Wed, May 7, 2025 at 5:22 PM Dmitri Bourlatchkov <di...@apache.org> wrote:
>> > > > > > > > >
>> > > > > > > > > > No worries about the name.
>> > > > > > > > > > It is a possible alternative spelling :)
>> > > > > > > > > >
>> > > > > > > > > > On Wed, May 7, 2025 at 8:04 PM yun zou <yunzou.colost...@gmail.com> wrote:
>> > > > > > > > > >
>> > > > > > > > > > > Hi Dmitri,
>> > > > > > > > > > >
>> > > > > > > > > > > Sorry, I accidentally typed your name wrong in the previous reply! Apologies for this!
>> > > > > > > > > > >
>> > > > > > > > > > > For the S3 issue, I think we will need to deal with that regardless. Especially with the federation work going on, we will eventually need to handle all those entities coming from different Catalogs, and the URI format seems to be the standard format used by various Catalog services.
>> > > > > > > > > > >
>> > > > > > > > > > > Best Regards,
>> > > > > > > > > > > Yun
>> > > > > > > > > > >
>> > > > > > > > > > > On Wed, May 7, 2025 at 4:55 PM yun zou <yunzou.colost...@gmail.com> wrote:
>> > > > > > > > > > >
>> > > > > > > > > > > > Hi Dimitri and Eric,
>> > > > > > > > > > > >
>> > > > > > > > > > > > Thanks a lot for the feedback!
>> > > > > > > > > > > >
>> > > > > > > > > > > > For the questions:
>> > > > > > > > > > > > - Is it one value or many?
>> > > > > > > > > > > > It will be one value, similar to the location in Iceberg and the storage_location in Unity Catalog.
>> > > > > > > > > > > >
>> > > > > > > > > > > > Regarding the point about having new data in new locations and keeping old data in old locations, do we support that for Iceberg today?
>> > > > > > > > > > > > For most of the Spark tables, there seems to be only one location. Also, I think it is better to start restricted, and then extend it to allow multiple locations when the use case arises.
>> > > > > > > > > > > >
>> > > > > > > > > > > > Ref:
>> > > > > > > > > > > > Iceberg location:
>> > > > > > > > > > > > https://github.com/apache/iceberg/blob/main/open-api/rest-catalog-open-api.yaml#L3451
>> > > > > > > > > > > > Storage location in Unity Catalog:
>> > > > > > > > > > > > https://github.com/apache/iceberg/blob/main/open-api/rest-catalog-open-api.yaml#L3451
>> > > > > > > > > > > >
>> > > > > > > > > > > > - Is it a URI?
>> > > > > > > > > > > > Yes, it will be a URI, which seems to be the standard catalog implementation.
>> > > > > > > > > > > > Regarding the point about s3 vs. s3a, I assume that is a common problem that every catalog implementation needs to address, and we will stay the same on this part. At least from the load table point of view, the Spark engine knows how to deal with such cases.
>> > > > > > > > > > > >
>> > > > > > > > > > > > - Does it point to any particular file?
>> > > > > > > > > > > > No, it doesn't point to a particular file. It is the base table location.
>> > > > > > > > > > > >
>> > > > > > > > > > > > - Is it a common prefix of all files within a table?
>> > > > > > > > > > > > It is supposed to be the base table location, which theoretically should be the common prefix of all files within a table, I believe.
>> > > > > > > > > > > >
>> > > > > > > > > > > > - What happens when a value does not match these expectations?
>> > > > > > > > > > > > Whether it is one value or many is already restricted by the spec.
>> > > > > > > > > > > > For the URI format, I think we can do a format check and fail if it does not match.
>> > > > > > > > > > > > Other than that, we will not do any other special check, and we rely on the client to put the correct value; otherwise, other engines will not be able to successfully read the table.
>> > > > > > > > > > > >
>> > > > > > > > > > > > For the location keyword, as Eric has pointed out, we could potentially have a reserved key in the properties. However, location is a common enough key among various table formats that it is worth a dedicated key to help store and load the information in a more straightforward way. For things that are specific to one or two formats, I think it makes more sense to use a reserved property key.
>> > > > > > > > > > > >
>> > > > > > > > > > > > As a reference, in Iceberg, the CreateTable request and TableMetadata do have an explicit location key in the spec. write.data.path and write.metadata.path are passed as properties today.
>> > > > > > > > > > > >
>> > > > > > > > > > > > Best Regards,
>> > > > > > > > > > > > Yun
>> > > > > > > > > > > >
>> > > > > > > > > > > > On Wed, May 7, 2025 at 3:54 PM Dmitri Bourlatchkov <di...@apache.org> wrote:
>> > > > > > > > > > > >
>> > > > > > > > > > > >> Another point: I'm pretty sure sooner or later users will want to move their data to some other location. As an option, users may want to write new files into another location but keep old files in place.
>> > > > > > > > > > > >>
>> > > > > > > > > > > >> Also: if the location is a URI, how do we deal with s3 vs. s3a, for example?
>> > > > > > > > > > > >>
>> > > > > > > > > > > >> In Iceberg it is quite common for different engines to use different access tools, which often leads to different URI schemes.
>> > > > > > > > > > > >>
>> > > > > > > > > > > >> Cheers,
>> > > > > > > > > > > >> Dmitri.
>> > > > > > > > > > > >>
>> > > > > > > > > > > >> On Wed, May 7, 2025 at 6:46 PM Eric Maynard <eric.w.mayn...@gmail.com> wrote:
>> > > > > > > > > > > >>
>> > > > > > > > > > > >> > All good questions, Dmitri. I'm especially interested in the first one, as from what I understand Iceberg tables can have metadata and data at two different paths that we need to vend credentials for.
>> > > > > > > > > > > >> >
>> > > > > > > > > > > >> > For Iceberg tables, we just use special properties to track these locations. I wonder if we couldn't do the same for generic tables.
>> > > > > > > > > > > >> >
>> > > > > > > > > > > >> > On Wed, May 7, 2025 at 3:42 PM Dmitri Bourlatchkov <di...@apache.org> wrote:
>> > > > > > > > > > > >> >
>> > > > > > > > > > > >> > > Hi Yun,
>> > > > > > > > > > > >> > >
>> > > > > > > > > > > >> > > Please clarify the meaning of the value of the new location attribute.
>> > > > > > > > > > > >> > >
>> > > > > > > > > > > >> > > - Is it one value or many?
>> > > > > > > > > > > >> > > - Is it a URI?
>> > > > > > > > > > > >> > > - Does it point to any particular file?
>> > > > > > > > > > > >> > > - Is it a common prefix of all files within a table?
>> > > > > > > > > > > >> > > - What happens when a value does not match these expectations?
>> > > > > > > > > > > >> > >
>> > > > > > > > > > > >> > > Thanks,
>> > > > > > > > > > > >> > > Dmitri.
>> > > > > > > > > > > >> > >
>> > > > > > > > > > > >> > > On 2025/05/07 21:50:19 yun zou wrote:
>> > > > > > > > > > > >> > > > Hi folks,
>> > > > > > > > > > > >> > > >
>> > > > > > > > > > > >> > > > I would like to propose adding an optional `location` field to the CreateGenericTable request and LoadGenericTable response.
>> > > > > > > > > > > >> > > >
>> > > > > > > > > > > >> > > > The `location` is the location for the table, which is common to most table formats including Iceberg, Delta, Hudi, csv, parquet etc. The location information is critical for loading the table on the engine side; having a dedicated keyword could help improve the robustness of cross-engine sharing, instead of relying on the properties passed by the client side.
>> > > > > > > > > > > >> > > >
>> > > > > > > > > > > >> > > > Furthermore, this information is also required to provide credential vending capabilities later.
>> > > > > > > > > > > >> > > >
>> > > > > > > > > > > >> > > > Here is the PR for adding the spec:
>> > > > > > > > > > > >> > > > https://github.com/apache/polaris/pull/1543
>> > > > > > > > > > > >> > > >
>> > > > > > > > > > > >> > > > Looking forward to your reply and feedback!
>> > > > > > > > > > > >> > > >
>> > > > > > > > > > > >> > > > Best Regards,
>> > > > > > > > > > > >> > > > Yun
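
For illustration only, here is a minimal Python sketch of the `location` rules from the summary at the top of this thread: the URI format check, the "no same or overlapping location" check at creation time, and the "all files live under the root" expectation. It is not how Polaris implements any of this. The function names and the `ForbiddenError` class are hypothetical stand-ins, and the sketch compares URIs as plain strings, so it does not address the s3 vs. s3a question raised earlier.

```python
from urllib.parse import urlparse


class ForbiddenError(Exception):
    """Hypothetical stand-in for the ForbiddenException named in the summary."""


def normalize(location: str) -> str:
    """Check that `location` looks like a URI and return it with one trailing slash."""
    parsed = urlparse(location)
    if not parsed.scheme or not (parsed.netloc or parsed.path):
        raise ValueError(f"location is not a URI: {location!r}")
    # The trailing slash keeps s3://bucket/t1 from matching s3://bucket/t10.
    return location.rstrip("/") + "/"


def overlaps(a: str, b: str) -> bool:
    """True if the two locations are equal or one is nested inside the other."""
    a, b = normalize(a), normalize(b)
    return a.startswith(b) or b.startswith(a)


def check_create(new_location, existing_locations):
    """Reject a new table whose location equals or overlaps an existing one."""
    if new_location is None:
        # `location` is optional; without it the client manages path conflicts.
        return
    for existing in existing_locations:
        if overlaps(new_location, existing):
            raise ForbiddenError(f"{new_location} overlaps with {existing}")


def is_under_root(file_uri: str, root: str) -> bool:
    """True if a file URI sits under the table root location."""
    return file_uri.startswith(normalize(root))
```

With these helpers, `check_create("s3://bucket/t1/nested", ["s3://bucket/t1"])` raises `ForbiddenError`, while a sibling such as `s3://bucket/t10` is accepted because the comparison is done on slash-terminated prefixes.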