GitHub user StephanEwen opened a pull request: https://github.com/apache/flink/pull/4780
[FLINK-7766] [FLINK-7767] [file system sink] Cleanups in the streaming FS sinks

**This builds on top of #4776 - only the last two commits are relevant.**

## What is the purpose of the change

This change gets rid of some legacy code and improves Hadoop configuration access.

## Brief change log

  - Drop obsolete reflective `hflush()` calls. This was done reflectively before for Hadoop 1 compatibility. Since Hadoop 1 is no longer supported, this is obsolete now.
  - Avoid loading Hadoop conf dynamically at runtime. That way we guarantee consistent FS config use.

## Verifying this change

This change is already covered by the existing tests for the `BucketingSink` and `RollingSink`.

## Does this pull request potentially affect one of the following parts:

  - Dependencies (does it add or upgrade a dependency): (yes / **no**)
  - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (yes / **no**)
  - The serializers: (yes / **no** / don't know)
  - The runtime per-record code paths (performance sensitive): (yes / **no** / don't know)
  - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: (yes / **no** / don't know)

## Documentation

  - Does this pull request introduce a new feature? (yes / **no**)
  - If yes, how is the feature documented? (**not applicable** / docs / JavaDocs / not documented)

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/StephanEwen/incubator-flink fs_cleanups

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/4780.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #4780

----
commit 3b786844dd9c0ce176eac98c8a05ebe50cb1ebe7
Author: Stephan Ewen <se...@apache.org>
Date:   2017-10-02T12:34:27Z

    [FLINK-7643] [core] Misc. cleanups in FileSystem

      - Simplify access to local file system
      - Use a fair lock for all FileSystem.get() operations
      - Robust fallback to local fs for default scheme (avoids URI parsing error on Windows)
      - Deprecate 'getDefaultBlockSize()'
      - Deprecate create(...) with block sizes and replication factor, which is not applicable to many FS

commit a5ef09bb601cdd77fcb94e9ce633fdf979031aaf
Author: Stephan Ewen <se...@apache.org>
Date:   2017-10-02T14:30:07Z

    [FLINK-7643] [core] Drop eager checks for file system support.

    Some places validate if the file URIs are resolvable on the client. This leads to
    problems when file systems are not accessible from the client, when the full libraries for
    the file systems are not present on the client (for example often the case in cloud setups),
    or when the configuration on the client is different from the nodes/containers that will
    execute the application.

commit 536675b03a5050fda9c3e1fd403818cb50dcc6ff
Author: Stephan Ewen <se...@apache.org>
Date:   2017-10-02T14:25:18Z

    [FLINK-7643] [core] Rework FileSystem loading to use factories

    This makes sure that configurations are loaded once and file system
    instances are properly reused by scheme and authority.

    This also factors out a lot of the special treatment of Hadoop file systems
    and simply makes the Hadoop File System factory the default fallback factory.

commit b72eddfcd37b5a59908ebd536cd92b7caf2ae16e
Author: Stephan Ewen <se...@apache.org>
Date:   2017-10-05T09:26:13Z

    [FLINK-7767] [file system sinks] Avoid loading Hadoop conf dynamically at runtime

commit 7843c2ffb44f99967dc71746ac1c79b04a74fe80
Author: Stephan Ewen <se...@apache.org>
Date:   2017-10-05T09:27:48Z

    [FLINK-7766] [file system sink] Drop obsolete reflective hflush calls

    This was done reflectively before for Hadoop 1 compatibility. Since Hadoop 1 is
    no longer supported, this is obsolete now.

----
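
For readers unfamiliar with the pattern that the FLINK-7766 commit removes, the following is a minimal sketch of the difference between a reflective `hflush()` lookup and a direct call. It is not the code from this pull request: `HflushSketch`, `SketchStream`, `flushReflectively`, and `flushDirectly` are hypothetical names, and `SketchStream` is only a stand-in for a Hadoop output stream so that the example runs without Hadoop on the classpath.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.lang.reflect.Method;

public class HflushSketch {

    /** Hypothetical stand-in for an output stream that exposes hflush(). */
    public static class SketchStream extends ByteArrayOutputStream {
        private int hflushCalls = 0;

        public void hflush() throws IOException {
            flush();
            hflushCalls++;
        }

        public int getHflushCalls() {
            return hflushCalls;
        }
    }

    /**
     * Legacy style: look up hflush() by name at runtime, because the method
     * may not exist on older stream classes. Falls back to flush().
     * Returns true if hflush() was found and invoked.
     */
    public static boolean flushReflectively(OutputStream out) throws IOException {
        try {
            Method hflush = out.getClass().getMethod("hflush");
            hflush.invoke(out);
            return true;
        } catch (NoSuchMethodException e) {
            // No hflush() on this class: use the plain flush() instead.
            out.flush();
            return false;
        } catch (ReflectiveOperationException e) {
            throw new IOException("Reflective hflush() call failed", e);
        }
    }

    /** Simplified style: the method is known at compile time, so call it directly. */
    public static void flushDirectly(SketchStream out) throws IOException {
        out.hflush();
    }

    public static void main(String[] args) throws IOException {
        SketchStream stream = new SketchStream();
        stream.write(42);

        boolean found = flushReflectively(stream);
        flushDirectly(stream);

        System.out.println("hflush() found reflectively: " + found);
        System.out.println("hflush() calls so far: " + stream.getHflushCalls());
    }
}
```

The direct call is checked by the compiler and avoids the method lookup and exception translation, which is why the reflective variant only makes sense while a version without the method still has to be supported.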