[ https://issues.apache.org/jira/browse/FLINK-33694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17791670#comment-17791670 ]
Patrick Lucas commented on FLINK-33694:
---------------------------------------

[~martijnvisser] aye, have a PR open but Azure needs a re-run because I missed a license header. I'm not sure if flinkbot listens to me for commands.

> GCS filesystem does not respect gs.storage.root.url config option
> -----------------------------------------------------------------
>
>                 Key: FLINK-33694
>                 URL: https://issues.apache.org/jira/browse/FLINK-33694
>             Project: Flink
>          Issue Type: Bug
>          Components: FileSystems
>    Affects Versions: 1.18.0, 1.17.2
>            Reporter: Patrick Lucas
>            Priority: Major
>              Labels: gcs, pull-request-available
>
> The GCS FileSystem's RecoverableWriter implementation uses the GCS SDK
> directly rather than going through Hadoop. While support has been added to
> configure credentials correctly based on the standard Hadoop implementation
> configuration, no other options are passed through to the underlying client.
> Because this only affects the RecoverableWriter-related codepaths, it can
> result in very surprising differing behavior whether the FileSystem is being
> used as a source or a sink: while a {{gs://}}-URI FileSource may work
> fine, a {{gs://}}-URI FileSink may not work at all.
> We use [fake-gcs-server|https://github.com/fsouza/fake-gcs-server] in
> testing, and so we override the Hadoop GCS FileSystem config option
> {{gs.storage.root.url}}. However, because this option is not considered
> when creating the GCS client for the RecoverableWriter codepath, in a
> FileSink the GCS FileSystem attempts to write to the real GCS service rather
> than fake-gcs-server. At the same time, a FileSource works as expected,
> reading from fake-gcs-server.
> The fix should be fairly straightforward, reading the {{gs.storage.root.url}}
> config option from the Hadoop FileSystem config in
> [{{GSFileSystemOptions}}|https://github.com/apache/flink/blob/release-1.18.0/flink-filesystems/flink-gs-fs-hadoop/src/main/java/org/apache/flink/fs/gs/GSFileSystemOptions.java#L30]
> and, if set, passing it to {{storageOptionsBuilder}} in
> [{{GSFileSystemFactory}}|https://github.com/apache/flink/blob/release-1.18.0/flink-filesystems/flink-gs-fs-hadoop/src/main/java/org/apache/flink/fs/gs/GSFileSystemFactory.java].
> The only workaround for this is to build a custom flink-gs-fs-hadoop JAR with
> a patch and use it as a plugin.
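
For illustration only, the fix described in the issue (read the root URL option and, if it is set, hand it to the storage client builder) might look roughly like the sketch below. This is not the code from the PR. It uses only the JDK so that it runs on its own: ClientBuilder is a hypothetical stand-in for the GCS SDK's storage options builder (where the corresponding call is presumably setHost), a plain Map stands in for the Hadoop configuration, and the default endpoint and the localhost URL are assumptions.

```java
import java.util.Map;
import java.util.Optional;

public class GcsRootUrlSketch {

    // Option name exactly as written in the issue.
    public static final String ROOT_URL_KEY = "gs.storage.root.url";

    // Assumed default endpoint of the real GCS service.
    public static final String DEFAULT_HOST = "https://storage.googleapis.com";

    /** Hypothetical stand-in for the SDK's storage options builder. */
    public static final class ClientBuilder {
        private String host = DEFAULT_HOST;

        public ClientBuilder setHost(String host) {
            this.host = host;
            return this;
        }

        /** Returns the endpoint a client built from this builder would talk to. */
        public String build() {
            return host;
        }
    }

    /** Reads the root URL option, treating a missing or blank value as "not set". */
    public static Optional<String> storageRootUrl(Map<String, String> config) {
        return Optional.ofNullable(config.get(ROOT_URL_KEY))
                .map(String::trim)
                .filter(value -> !value.isEmpty());
    }

    /** Overrides the builder's host only when the option is present. */
    public static String resolveEndpoint(Map<String, String> config) {
        ClientBuilder builder = new ClientBuilder();
        storageRootUrl(config).ifPresent(builder::setHost);
        return builder.build();
    }

    public static void main(String[] args) {
        // No override configured: the client targets the real service.
        System.out.println(resolveEndpoint(Map.of()));
        // Override configured, e.g. a local fake-gcs-server (address is an assumption).
        System.out.println(resolveEndpoint(Map.of(ROOT_URL_KEY, "http://localhost:4443")));
    }
}
```

With the option unset, the sketch keeps the default endpoint, which mirrors the FileSink behavior reported above; with it set, source and sink codepaths would both point at the same server.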