Potentially useful further context: the current C++ FileSystem documentation at https://arrow.apache.org/docs/cpp/io.html#filesystems
On Wed, Apr 9, 2025 at 9:35 AM Benjamin Kietzman <bengil...@gmail.com> wrote:
> I have been working on modularizing the C++ library by extending
> FileSystem construction from URIs. I recently merged a PR which prompted
> some discussion [1] of how the library should handle secrets.
>
> Some FileSystems cannot be constructed without one or more secrets. For
> example, an S3FileSystem might require a proxy's username and password in
> order to configure the client which the S3FileSystem wraps. Since the
> usefulness of S3 and other filesystems which may only use default
> credentials is very limited, I think it's safe to say that any interface
> for construction of filesystems must accept secrets as parameters.
>
> In the C++ library and its bindings, FileSystems can be constructed from a
> URI. This modular interface means that libarrow can construct an
> S3FileSystem even without being compiled with/linked to the AWS SDK. Since
> URIs must be complete specifications of a filesystem, this necessitates
> inclusion of the secrets required by S3 in the URI. Since anyone with a URI
> has access to the filesystem to which it refers, these filesystem URIs are
> transitively secret.
>
> This can and should be better documented, but first we should discuss
> whether URIs-which-are-secrets is an acceptable interface. As a minimal
> example of an alternative design, we could extend the FileSystemFactory
> interface, allowing URIs to reference secrets registered by name elsewhere:
> "s3://{my-s3-key}:{my-s3-secret-key}@.../{my-secret-bucket}". (New
> secrets may be added like GetSecretRegistry()->AddSecret({.key =
> "my-s3-secret-key", .secret = "sw0rdf1sh"});)
>
> Is explicit out-of-URI secret management necessary, or is it sufficient to
> document that since filesystem URIs represent access to their referent they
> must be guarded accordingly?
>
> Ben Kietzman
> [1] https://github.com/apache/arrow/pull/41559#discussion_r1768836077
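
For concreteness, here is a rough sketch of how the named-secret alternative in the quoted message might behave. Nothing below is Arrow API: SecretRegistry, AddSecret, and Resolve are hypothetical names chosen for illustration, and the "{name}" placeholder syntax is only assumed from the example URI above. The sketch also skips percent-encoding of secret values, which a real implementation would likely need before splicing them into a URI.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical registry (not part of Arrow): maps secret names to values
// so that a URI can refer to a secret by name instead of embedding it.
class SecretRegistry {
 public:
  // Register (or overwrite) a secret under the given name.
  void AddSecret(std::string key, std::string secret) {
    assert(!key.empty());  // an empty name could never be referenced
    secrets_[std::move(key)] = std::move(secret);
  }

  // Return a copy of uri_template with every "{name}" replaced by the
  // secret registered under that name. Throws if a placeholder is
  // unterminated or names an unknown secret.
  std::string Resolve(const std::string& uri_template) const {
    std::string out;
    std::size_t pos = 0;
    while (pos < uri_template.size()) {
      const std::size_t open = uri_template.find('{', pos);
      if (open == std::string::npos) {
        // No more placeholders: copy the remainder verbatim.
        out.append(uri_template, pos, std::string::npos);
        break;
      }
      const std::size_t close = uri_template.find('}', open + 1);
      if (close == std::string::npos) {
        throw std::invalid_argument("unterminated placeholder in URI");
      }
      out.append(uri_template, pos, open - pos);
      const std::string name =
          uri_template.substr(open + 1, close - open - 1);
      const auto it = secrets_.find(name);
      if (it == secrets_.end()) {
        // Report the name only; never echo secret values in errors.
        throw std::out_of_range("unknown secret: " + name);
      }
      out += it->second;
      pos = close + 1;
    }
    return out;
  }

 private:
  std::map<std::string, std::string> secrets_;
};
```

With this shape, the template string (for example "s3://{my-s3-key}:{my-s3-secret-key}@example-host/bucket", where the host and bucket are made up) is safe to log or store in configuration, and only the string returned by Resolve needs to be guarded. Whether that separation is worth the extra interface is exactly the question raised above.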