Fabian's proposal sounds good to me. It would be a good first step towards removing our dependency on Hadoop.
Thus, +1 for the changes.

Cheers,
Till

On Fri, Oct 14, 2016 at 11:29 AM, Fabian Hueske <fhue...@gmail.com> wrote:

> Hi everybody,
>
> I would like to propose deprecating the utility methods for reading data
> with Hadoop InputFormats from the (batch) ExecutionEnvironment.
>
> The motivation for deprecating these methods is to reduce Flink's
> dependency on Hadoop and instead make Hadoop an optional dependency for
> users who actually need it (HDFS, MapRed-Compat, ...). Eventually, we want
> to have a Flink distribution that does not have a hard Hadoop dependency.
>
> One step towards this is to remove the Hadoop dependency from flink-java
> (Flink's Java DataSet API), which is currently required because of the
> above utility methods (see FLINK-4315). We recently received a PR that
> addresses FLINK-4315 and removes the Hadoop methods from the
> ExecutionEnvironment. After some discussion, it was decided to defer the
> PR to Flink 2.0 because it breaks the API (these methods are declared
> @PublicEvolving).
>
> I propose to accept this PR for Flink 1.2, but to deprecate the methods
> instead of removing them. This would help users migrate old code and
> prevent new usage of these methods. In a later Flink release (1.3 or 2.0)
> we could remove these methods and the Hadoop dependency of flink-java.
>
> What do others think?
>
> Best,
> Fabian