Dear GDAL Developers,
I hope this message finds you well. I am experiencing an issue with GDAL’s VSICURL virtual file system: the Authorization header is passed on to redirected pre-signed S3 URLs, which leads to errors when accessing AWS S3.

*Background:*
In my use case, I have a proxy URL that, when called with an Authorization header, validates the token, verifies authorization, and then generates a pre-signed URL to an S3 object. The proxy then redirects the client to this pre-signed URL.

When using GDAL with this reverse proxy server and setting the configuration option --config CPL_VSIL_CURL_USE_S3_REDIRECT NO, everything works correctly. However, GDAL then makes too many round trips to the proxy, and each one forces me to generate a new pre-signed URL. Ideally, I would like GDAL to send the initial request and then reuse the pre-signed URL for subsequent requests.

This desired behavior is achieved with --config CPL_VSIL_CURL_USE_S3_REDIRECT YES. The problem now is that GDAL, on subsequent requests to the pre-signed URL, also passes the Authorization header. I am using --config GDAL_HTTP_HEADERS "Authorization: Bearer xxxx" to set the initial header, so the Authorization header is sent with every request. When the Authorization header is sent to AWS S3 along with the pre-signed URL, AWS returns the following error:
```
Only one auth mechanism allowed; only the X-Amz-Algorithm query parameter, Signature query string parameter or the Authorization header should be specified.
```

*Observed Behavior:*
• *First Request (301 redirect, then 200 OK):* GDAL sends a request to the proxy URL with the Authorization header. The proxy validates the token and redirects to the pre-signed S3 URL. GDAL follows the redirect, and since the pre-signed URL is accessed without the Authorization header, AWS S3 responds with a *200 OK*. This behavior is as expected.
• *Subsequent Requests (400 Bad Request):* GDAL reuses the pre-signed URL but includes the Authorization header in the request. AWS S3, seeing both the pre-signed URL’s query parameters and the Authorization header, returns a *400 Bad Request* error, stating that only one authentication mechanism is allowed.

*Expected Behavior:*
I expect that GDAL’s VSICURL should send the initial request with the Authorization header to the proxy URL. Upon receiving the redirect to the pre-signed URL, it should not include the Authorization header in subsequent requests to AWS S3. This would allow AWS S3 to accept the pre-signed URL without conflicts.

*Questions:*
• Is there a way to configure GDAL so that it does not pass the Authorization header to the redirected pre-signed URLs while retaining it for the initial request?
• If this feature is not currently available, would it be feasible to implement such functionality?
• Are there any plans to address the handling of HTTP redirect response codes 301 and 307 in future GDAL releases to better support this use case?

*GDAL CLI Command:*
```
gdalinfo --debug on --config CPL_CURL_VERBOSE YES --config GDAL_DISABLE_READDIR_ON_OPEN EMPTY_DIR --config CPL_VSIL_CURL_USE_S3_REDIRECT YES --config GDAL_HTTP_HEADERS="Authorization: Bearer xxxx" /vsicurl/https://example.com/stac/collections/items/assets?path=s3://your-bucket/red.tif
```

*Example GDAL Logs:*
Below are the GDAL logs illustrating the issue (sensitive information has been redacted):

*First Request (200 OK):*
```
HTTP: libcurl/8.7.1 (SecureTransport) LibreSSL/3.3.6 zlib/1.2.12 nghttp2/1.61.0
HTTP: GDAL was built against curl 8.4.0, but is running against 8.7.1.
CURL_INFO_TEXT: [HTTP/2] [1] OPENED stream for https://example.com/stac/collections/items/assets?path=s3://your-bucket/red.tif
CURL_INFO_TEXT: [HTTP/2] [1] [:method: HEAD]
CURL_INFO_TEXT: [HTTP/2] [1] [:scheme: https]
CURL_INFO_TEXT: [HTTP/2] [1] [:authority: example.com]
CURL_INFO_TEXT: [HTTP/2] [1] [:path: /stac/collections/items/assets/foo?path=s3://your-bucket/red.tif]
CURL_INFO_TEXT: [HTTP/2] [1] [user-agent: GDAL/3.9.2]
CURL_INFO_TEXT: [HTTP/2] [1] [accept: */*]
CURL_INFO_TEXT: [HTTP/2] [1] [authorization: Bearer [REDACTED]]
CURL_INFO_HEADER_OUT: HEAD /stac/collections/items/assets/foo?path=s3://your-bucket/red.tif HTTP/2
Host: example.com
User-Agent: GDAL/3.9.2
Accept: */*
Authorization: Bearer [REDACTED]
CURL_INFO_TEXT: Request completely sent off
CURL_INFO_HEADER_IN: HTTP/2 301
CURL_INFO_HEADER_IN: date: Sun, 22 Sep 2024 04:44:28 GMT
CURL_INFO_HEADER_IN: content-type: text/plain; charset=utf-8
CURL_INFO_HEADER_IN: content-length: 43
CURL_INFO_HEADER_IN: location: https://s3-bucket-placeholder/red.tif?response-content-type=image%2Ftiff&AWSAccessKeyId=[REDACTED]&Signature=[REDACTED]&x-amz-security-token=[REDACTED]&Expires=1726983868
CURL_INFO_HEADER_IN: x-content-length: 176995703
CURL_INFO_HEADER_IN: apigw-requestid: [REDACTED]
CURL_INFO_HEADER_IN:
CURL_INFO_TEXT: Ignoring the response-body
CURL_INFO_TEXT: Connection #0 to host example.com left intact
CURL_INFO_TEXT: Issue another request to this URL: 'https://s3-bucket-placeholder/red.tif?response-content-type=image%2Ftiff&AWSAccessKeyId=[REDACTED]&Signature=[REDACTED]&x-amz-security-token=[REDACTED]&Expires=1726983868'
CURL_INFO_TEXT: Couldn't find host s3-bucket-placeholder in the .netrc file; using defaults
CURL_INFO_TEXT: Host s3-bucket-placeholder:443 was resolved.
CURL_INFO_TEXT: Trying [REDACTED]...
CURL_INFO_TEXT: Connected to s3-bucket-placeholder ([REDACTED]) port 443
CURL_INFO_TEXT: SSL connection using TLSv1.3 / [REDACTED]
CURL_INFO_TEXT: Server certificate:
CURL_INFO_TEXT: subject: CN=*.s3.amazonaws.com
CURL_INFO_TEXT: start date: Apr 22 00:00:00 2024 GMT
CURL_INFO_TEXT: expire date: Apr 7 23:59:59 2025 GMT
CURL_INFO_TEXT: issuer: C=US; O=Amazon; CN=Amazon RSA 2048 M01
CURL_INFO_TEXT: SSL certificate verify ok.
CURL_INFO_HEADER_OUT: HEAD /red.tif?response-content-type=image%2Ftiff&AWSAccessKeyId=[REDACTED]&Signature=[REDACTED]&x-amz-security-token=[REDACTED]&Expires=1726983868 HTTP/1.1
Host: s3-bucket-placeholder
User-Agent: GDAL/3.9.2
Accept: */*
CURL_INFO_TEXT: Request completely sent off
CURL_INFO_HEADER_IN: HTTP/1.1 200 OK
CURL_INFO_HEADER_IN: x-amz-id-2: [REDACTED]
CURL_INFO_HEADER_IN: x-amz-request-id: [REDACTED]
CURL_INFO_HEADER_IN: Date: Sun, 22 Sep 2024 04:44:29 GMT
CURL_INFO_HEADER_IN: Last-Modified: Tue, 17 Sep 2024 17:44:40 GMT
CURL_INFO_HEADER_IN: ETag: "[REDACTED]"
CURL_INFO_HEADER_IN: x-amz-server-side-encryption: AES256
CURL_INFO_HEADER_IN: Accept-Ranges: bytes
CURL_INFO_HEADER_IN: Content-Type: image/tiff
CURL_INFO_HEADER_IN: Server: AmazonS3
CURL_INFO_HEADER_IN: Content-Length: 176995703
CURL_INFO_HEADER_IN:
CURL_INFO_TEXT: Connection #1 to host s3-bucket-placeholder left intact
VSICURL: Effective URL: https://s3-bucket-placeholder/red.tif?response-content-type=image%2Ftiff&AWSAccessKeyId=[REDACTED]&Signature=[REDACTED]&x-amz-security-token=[REDACTED]&Expires=1726983868
VSICURL: Will use redirect URL for the next 3599 seconds
VSICURL: GetFileSize(https://example.com/stac/collections/items/assets/foo?path=s3://your-bucket/red.tif)=176995703 response_code=200
```

*Subsequent Request (400 Bad Request):*
```
VSICURL: Using redirect URL as it looks to be still valid (3599 seconds left)
VSICURL: Downloading 0-16383 (https://s3-bucket-placeholder/red.tif?response-content-type=image%2Ftiff&AWSAccessKeyId=[REDACTED]&Signature=[REDACTED]&x-amz-security-token=[REDACTED]&Expires=1726983868)...
CURL_INFO_TEXT: Couldn't find host s3-bucket-placeholder in the .netrc file; using defaults
CURL_INFO_TEXT: Found bundle for host: 0x600001aa8270 [serially]
CURL_INFO_TEXT: Can not multiplex, even if we wanted to
CURL_INFO_TEXT: Re-using existing connection with host s3-bucket-placeholder
CURL_INFO_HEADER_OUT: GET /red.tif?response-content-type=image%2Ftiff&AWSAccessKeyId=[REDACTED]&Signature=[REDACTED]&x-amz-security-token=[REDACTED]&Expires=1726983868 HTTP/1.1
Host: s3-bucket-placeholder
User-Agent: GDAL/3.9.2
Accept: */*
Authorization: Bearer [REDACTED]
Range: bytes=0-16383
CURL_INFO_TEXT: Request completely sent off
CURL_INFO_HEADER_IN: HTTP/1.1 400 Bad Request
CURL_INFO_HEADER_IN: x-amz-request-id: [REDACTED]
CURL_INFO_HEADER_IN: x-amz-id-2: [REDACTED]
CURL_INFO_HEADER_IN: Content-Type: application/xml
CURL_INFO_HEADER_IN: Transfer-Encoding: chunked
CURL_INFO_HEADER_IN: Date: Sun, 22 Sep 2024 04:44:28 GMT
CURL_INFO_HEADER_IN: Server: AmazonS3
CURL_INFO_HEADER_IN: Connection: close
CURL_INFO_HEADER_IN:
CURL_INFO_TEXT: Closing connection
VSICURL: Got response_code=400
ERROR 4: `/vsicurl/https://example.com/stac/collections/items/assets/foo?path=s3://your-bucket/red.tif' not recognized as being in a supported file format.
gdalinfo failed - unable to open '/vsicurl/https://example.com/stac/collections/items/assets/foo?path=s3://your-bucket/red.tif'.
```

*Additional Information:*
If you require any further details or clarification, please let me know. I would be happy to provide more information. Additionally, if necessary, I can create an issue on the GDAL GitHub repository to track this problem.

Thank you very much for your time and assistance. I appreciate any guidance or suggestions you can provide to help resolve this issue.

Best Regards,
Pradeep Gulla
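
P.S. To make the expected behavior concrete, here is a minimal Python sketch of the header rule I have in mind. It is not GDAL code: the function name headers_for_redirect and the SENSITIVE_HEADERS set are hypothetical, and comparing scheme, host, and port is only my assumption about where the line should be drawn.

```python
from urllib.parse import urlsplit

# Hypothetical list of headers that carry credentials meant for the
# original host only; a real implementation might make this configurable.
SENSITIVE_HEADERS = {"authorization"}


def headers_for_redirect(original_url, redirect_url, headers):
    """Return the headers that should be sent to redirect_url.

    Credentials are kept only when the redirect target has the same
    scheme, host, and port as the original request.
    """
    src = urlsplit(original_url)
    dst = urlsplit(redirect_url)
    same_origin = (src.scheme, src.hostname, src.port) == (
        dst.scheme,
        dst.hostname,
        dst.port,
    )
    if same_origin:
        return dict(headers)
    # Different origin: drop credential headers, keep everything else.
    return {
        name: value
        for name, value in headers.items()
        if name.lower() not in SENSITIVE_HEADERS
    }


proxy = "https://example.com/stac/collections/items/assets?path=s3://your-bucket/red.tif"
presigned = "https://s3-bucket-placeholder/red.tif?Signature=REDACTED&Expires=1726983868"
sent = {"Authorization": "Bearer xxxx", "Range": "bytes=0-16383"}

print(headers_for_redirect(proxy, presigned, sent))
# prints {'Range': 'bytes=0-16383'}
```

Applied to the cached redirect URL as well as to the first hop, a rule like this would keep the Bearer token on requests to the proxy while leaving it off the range requests that go to S3.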
_______________________________________________
gdal-dev mailing list
gdal-dev@lists.osgeo.org
https://lists.osgeo.org/mailman/listinfo/gdal-dev