Potentially useful further context:
the current C++ FileSystem documentation at
https://arrow.apache.org/docs/cpp/io.html#filesystems

On Wed, Apr 9, 2025 at 9:35 AM Benjamin Kietzman <bengil...@gmail.com>
wrote:

> I have been working on modularizing the C++ library by extending
> FileSystem construction from URIs. I recently merged a PR which prompted
> some discussion [1] of how the library should handle secrets.
>
> Some FileSystems cannot be constructed without one or more secrets. For
> example, an S3FileSystem might require a proxy's username and password in
> order to configure the client which the S3FileSystem wraps. A filesystem
> such as S3 is of very limited use if it can only be constructed with
> default credentials, so I think it's safe to say that any interface for
> construction of filesystems must accept secrets as parameters.
>
> In the C++ library and its bindings, FileSystems can be constructed from a
> URI. This modular interface means that libarrow can construct an
> S3FileSystem even without being compiled with/linked to the AWS SDK. Since
> URIs must be complete specifications of a filesystem, this necessitates
> inclusion of the secrets required by S3 in the URI. Since anyone with a URI
> has access to the filesystem to which it refers, these filesystem URIs are
> transitively secret.
>
> This can and should be better documented, but first we should discuss
> whether URIs-which-are-secrets is an acceptable interface. As a minimal
> example of an alternative design, we could extend the FileSystemFactory
> interface, allowing URIs to reference secrets registered by name elsewhere:
> "s3://{my-s3-key}:{my-s3-secret-key}@.../{my-secret-bucket}". (New
> secrets may be added like GetSecretRegistry()->AddSecret({.key =
> "my-s3-secret-key", .secret = "sw0rdf1sh"});)
>
> Is explicit out-of-URI secret management necessary, or is it sufficient to
> document that, since a filesystem URI grants access to its referent, it
> must be guarded accordingly?
>
> Ben Kietzman
> [1] https://github.com/apache/arrow/pull/41559#discussion_r1768836077
>
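
To make the named-secret alternative in the quoted message a bit more
concrete, here is a minimal sketch of how "{name}" placeholders in a URI
could be resolved against a registry. The names GetSecretRegistry and
AddSecret follow the proposal above; SecretRegistry::Find and
ResolveSecrets are hypothetical helpers invented for illustration. None of
this is existing Arrow API, and a real design would need more care around
thread safety and escaping.

```cpp
#include <cstddef>
#include <map>
#include <optional>
#include <string>
#include <utility>

// Hypothetical types sketching the proposal; not part of Arrow.
struct Secret {
  std::string key;     // name by which a URI refers to the secret
  std::string secret;  // the sensitive value itself
};

class SecretRegistry {
 public:
  // Registers a secret, overwriting any earlier secret with the same key.
  void AddSecret(Secret s) {
    secrets_[std::move(s.key)] = std::move(s.secret);
  }

  // Returns the secret registered under `key`, or nullopt if there is none.
  std::optional<std::string> Find(const std::string& key) const {
    auto it = secrets_.find(key);
    if (it == secrets_.end()) return std::nullopt;
    return it->second;
  }

 private:
  std::map<std::string, std::string> secrets_;
};

// Process-wide registry, mirroring the GetSecretRegistry() call in the
// proposal (assumed here to be a simple function-local static).
SecretRegistry* GetSecretRegistry() {
  static SecretRegistry registry;
  return &registry;
}

// Replaces each "{name}" in `uri_template` with the secret registered under
// `name`. Returns nullopt if a placeholder is unterminated or names an
// unknown secret, so a partially resolved URI is never produced.
std::optional<std::string> ResolveSecrets(const std::string& uri_template,
                                          const SecretRegistry& registry) {
  std::string out;
  std::size_t pos = 0;
  while (pos < uri_template.size()) {
    std::size_t open = uri_template.find('{', pos);
    if (open == std::string::npos) {
      out.append(uri_template, pos, std::string::npos);  // no more placeholders
      break;
    }
    std::size_t close = uri_template.find('}', open + 1);
    if (close == std::string::npos) return std::nullopt;  // unterminated "{"
    out.append(uri_template, pos, open - pos);            // literal text
    auto value =
        registry.Find(uri_template.substr(open + 1, close - open - 1));
    if (!value) return std::nullopt;                      // unknown secret name
    out += *value;
    pos = close + 1;
  }
  return out;
}
```

With this shape, the URI template can be logged or stored in configuration
without exposing anything, and only the resolved string needs guarding. The
sketch substitutes values verbatim; a secret containing characters reserved
in URIs would need percent-encoding before substitution.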
