Re: [DISCUSS] Adding endpointInternal to AwsStorageConfigInfo

2025-08-04 Thread Yufei Gu
> If the same endpoint works for both the engine and the Polaris Server, it is only necessary to set one "endpoint" parameter.
That's right, we will only need one endpoint in that case. However, if no single endpoint works for both the engines and the Polaris server, the extra new endpoint makes

Re: [DISCUSS] Adding endpointInternal to AwsStorageConfigInfo

2025-08-02 Thread Alexandre Dutra
Hi all, I agree with Dmitri: having this feature in Polaris will be very helpful. We know of many users that deploy engines and catalog in different networks, and thus must access the storage layer through different addresses. This feature is easy to implement, enables new use cases, and thus inc

Re: [DISCUSS] Adding endpointInternal to AwsStorageConfigInfo

2025-08-01 Thread Eric Maynard
> It is only relevant for on-prem S3-compatible storage. I imagine, "endpointInternal" will never be needed for AWS storage.
That's not true, is it? If we are imagining scenarios where the client and the server are on totally different networks, the regular endpoint could indeed need to be address

Re: [DISCUSS] Adding endpointInternal to AwsStorageConfigInfo

2025-08-01 Thread Dmitri Bourlatchkov
Hi Yufei,
> I think Polaris server will only need the internal endpoint in that case,
> while engines could use the public endpoint. Do we need to configure both
> for the Polaris server
Polaris puts "s3.endpoint" into loadTable responses when credential vending is enabled. So, yes, both settings
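
The split described in this message can be illustrated with a minimal sketch. It is not Polaris code: the class and function names are hypothetical, and the fallback to "endpoint" when "endpointInternal" is unset is an assumption based on the statement elsewhere in the thread that a single "endpoint" suffices when it works for both sides.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class AwsStorageConfig:
    """Hypothetical stand-in for AwsStorageConfigInfo; only the two fields discussed here."""
    endpoint: Optional[str] = None           # address that query engines use
    endpoint_internal: Optional[str] = None  # address that the catalog server uses


def server_side_endpoint(cfg: AwsStorageConfig) -> Optional[str]:
    # Assumption: the server prefers the internal address and falls back
    # to the regular endpoint when no internal one is configured.
    return cfg.endpoint_internal or cfg.endpoint


def client_facing_properties(cfg: AwsStorageConfig) -> Dict[str, str]:
    # The thread says "s3.endpoint" is returned in loadTable responses;
    # engines should only ever see the regular (client-facing) endpoint.
    return {"s3.endpoint": cfg.endpoint} if cfg.endpoint else {}


if __name__ == "__main__":
    cfg = AwsStorageConfig(
        endpoint="https://s3.example.com",               # example value
        endpoint_internal="http://minio.internal:9000",  # example value
    )
    print(server_side_endpoint(cfg))
    print(client_facing_properties(cfg))
```

With both values set, the server talks to storage through the internal address while engines receive only the regular one, which is why both settings are kept.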

Re: [DISCUSS] Adding endpointInternal to AwsStorageConfigInfo

2025-08-01 Thread Dmitri Bourlatchkov
I'm adding the feature based on some prior private experience, so from my POV it is not contrived, although I cannot really go into the details of that experience :) It is only relevant for on-prem S3-compatible storage. I imagine, "endpointInternal" will never be needed for AWS storage. Cheers,

Re: [DISCUSS] Adding endpointInternal to AwsStorageConfigInfo

2025-08-01 Thread Yufei Gu
> > 1: I do not really know. This is a question about a specific deployment
> environment.
> If the endpoint used by engines could be also used by the Polaris server, we should just use it, instead of configuring another one.
> 2: I'm not sure I understand your question. Two endpoints are necessa

Re: [DISCUSS] Adding endpointInternal to AwsStorageConfigInfo

2025-08-01 Thread Eric Maynard
> Two endpoints are necessary in cases when the server's view of the network is different from the engine's view.
That could be true -- but at present it seems like a bit of a contrived scenario. It's also not exclusive to any particular cloud, right? On Fri, Aug 1, 2025 at 7:47 AM Dmitri Bourlatc

Re: [DISCUSS] Adding endpointInternal to AwsStorageConfigInfo

2025-07-31 Thread Dmitri Bourlatchkov
Hi Yufei, 1: I do not really know. This is a question about a specific deployment environment. 2: I'm not sure I understand your question. Two endpoints are necessary in cases when the server's view of the network is different from the engine's view. Cheers, Dmitri. On Thu, Jul 31, 2025 at 5:

Re: [DISCUSS] Adding endpointInternal to AwsStorageConfigInfo

2025-07-31 Thread Yufei Gu
Thanks for the explanation. Two questions: 1. Should the public endpoint used by engines still work with Polaris even if Polaris is co-located with the MinIO server? 2. Can we set the Polaris endpoint directly to the internal address in that case? Another way to ask this question: why do we need to keep bot

Re: [DISCUSS] Adding endpointInternal to AwsStorageConfigInfo

2025-07-31 Thread Dmitri Bourlatchkov
Hi Yufei, The "how" in your question depends on the deployment environment, I guess. There are a lot of variants. If you wonder whether such a situation is possible in practice, I believe it is. An example would be self-hosting non-AWS S3 storage and Polaris in a way that Polaris connections go t

Re: [DISCUSS] Adding endpointInternal to AwsStorageConfigInfo

2025-07-31 Thread Yufei Gu
Hi Dmitri, That generally makes sense to me. For awareness, could you elaborate a bit on how the Polaris server and query engines (like Spark, Trino, etc.) might access the same object storage (e.g., MinIO) via different DNS endpoints? Yufei On Thu, Jul 31, 2025 at 4:36 AM Alexandre Dutra wrot

Re: [DISCUSS] Adding endpointInternal to AwsStorageConfigInfo

2025-07-31 Thread Alexandre Dutra
Hi Dmitri, I think your suggestion makes sense. We added something similar in Nessie long ago, and it is definitely useful. I left some comments in the PR. Thanks, Alex On Thu, Jul 31, 2025 at 4:12 AM Dmitri Bourlatchkov wrote: > > Hi All, > > I propose to add an `endpointInternal` optional pa

[DISCUSS] Adding endpointInternal to AwsStorageConfigInfo

2025-07-30 Thread Dmitri Bourlatchkov
Hi All, I propose to add an `endpointInternal` optional parameter to AwsStorageConfigInfo in PR [2213]. The main idea is to support deployment edge cases where Polaris Servers may 'see' storage under a different DNS name than query engines. This use case applies mostly to non-AWS S3 storage (e.g.