Hi Peter,

> would this cause issues for the users?

I think yes, it is going to cause trouble for users who want to use S3
without the HDFS client. Adding the HDFS client may happen, but enforcing it
is not a good direction.

As mentioned, I've realized that we have 6 different ways of loading the
Hadoop conf, but I'm not sure one can make a single generic one out of them.
Sometimes one needs HdfsConfiguration or YarnConfiguration instances, which
is hard to generalize.

What I can imagine is the following (but super time consuming):
* One creates a specific configuration instance in the connector
  (HdfsConfiguration, YarnConfiguration)
* Casting it to a Configuration instance
* Calling a generic loadConfiguration(Configuration conf, List<String>
  filesToLoad)
* Use the locations which are covered in HadoopUtils.getHadoopConfiguration
  (except the deprecated ones)
* Use this function in all the places around Flink

In filesToLoad one could specify core-site.xml, hdfs-site.xml etc.
I've never tried it out, but this idea has been in my head for quite some
time...

BR,
G

On Tue, Oct 25, 2022 at 11:43 AM Péter Váry <peter.vary.apa...@gmail.com>
wrote:

> Hi Team,
>
> I have recently faced the issue that the S3 FileSystem read my
> core-site.xml until it was on the classpath, but later when I tried to add
> it using the HADOOP_CONF_DIR then the configuration file was not loaded.
> Filed a jira [1] and created a PR [2] for fixing it.
>
> HadoopUtils.getHadoopConfiguration is the method which considers all the
> relevant configurations for accessing / loading the hadoop configuration
> files, so I used it to fix the issue. The downside is that in this method
> we instantiate the HdfsConfiguration object which requires me to add the
> hadoop-hdfs-client as a provided dependency.
>
> My question for the more experienced folks - would this cause issues for
> the users? Could we assume that if the hadoop-common is on the classpath
> then hadoop-hdfs-client is on the classpath as well? Do you see other
> possible drawbacks or issues with my approach?
>
> Thanks,
> Peter
>
> [1] https://issues.apache.org/jira/browse/FLINK-29754
> [2] https://github.com/apache/flink/pull/21148
>
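
For reference, the file-lookup half of the loadConfiguration(conf, filesToLoad) idea in the reply could be sketched roughly like this. It is only a sketch: the class HadoopConfFiles and its methods are hypothetical and exist in neither Flink nor Hadoop, and the candidate directories (HADOOP_HOME/conf, HADOOP_HOME/etc/hadoop, then HADOOP_CONF_DIR) and their order are an assumption about the non-deprecated locations. It uses only the JDK, so it needs neither hadoop-common nor hadoop-hdfs-client; each returned path would then be added to whatever Configuration instance the connector created, e.g. via Hadoop's Configuration.addResource(Path).

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical helper for illustration; not part of Flink or Hadoop. */
final class HadoopConfFiles {

    private HadoopConfFiles() {}

    /**
     * Builds the list of directories to search from the given environment map.
     * Assumed order: HADOOP_HOME/conf, HADOOP_HOME/etc/hadoop, HADOOP_CONF_DIR,
     * so that files found later can override earlier ones when loaded in order.
     */
    static List<Path> candidateDirs(Map<String, String> env) {
        List<Path> dirs = new ArrayList<>();
        String home = env.get("HADOOP_HOME");
        if (home != null && !home.isEmpty()) {
            dirs.add(Paths.get(home, "conf"));
            dirs.add(Paths.get(home, "etc", "hadoop"));
        }
        String confDir = env.get("HADOOP_CONF_DIR");
        if (confDir != null && !confDir.isEmpty()) {
            dirs.add(Paths.get(confDir));
        }
        return dirs;
    }

    /**
     * Returns every requested file (e.g. core-site.xml, hdfs-site.xml) that
     * exists as a regular file in one of the directories, in directory order.
     */
    static List<Path> resolve(List<Path> dirs, List<String> filesToLoad) {
        List<Path> found = new ArrayList<>();
        for (Path dir : dirs) {
            for (String name : filesToLoad) {
                Path candidate = dir.resolve(name);
                if (Files.isRegularFile(candidate)) {
                    found.add(candidate);
                }
            }
        }
        return found;
    }
}
```

A caller would pass System.getenv() to candidateDirs and a list such as core-site.xml, hdfs-site.xml to resolve. Keeping the lookup free of Hadoop types is one way to avoid forcing the HDFS client on S3-only users, which is the concern raised above.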