Re: [DISCUSS] Deprecate Hadoop source method from (batch) ExecutionEnvironment

Till Rohrmann Fri, 14 Oct 2016 02:39:06 -0700

Fabian's proposal sounds good to me. It would be a good first step towards
removing our dependency on Hadoop.


Thus, +1 for the changes.

Cheers,
Till

On Fri, Oct 14, 2016 at 11:29 AM, Fabian Hueske <[email protected]> wrote:

> Hi everybody,
>
> I would like to propose to deprecate the utility methods to read data with
> Hadoop InputFormats from the (batch) ExecutionEnvironment.
>
> The motivation for deprecating these methods is to reduce Flink's
> dependency on Hadoop and instead make Hadoop an optional dependency for
> users that actually need it (HDFS, MapRed-Compat, ...). Eventually, we want
> to have a Flink distribution that does not have a hard Hadoop dependency.
>
> One step for this is to remove the Hadoop dependency from flink-java
> (Flink's Java DataSet API) which is currently required due to the above
> utility methods (see FLINK-4315). We recently received a PR that addresses
> FLINK-4315 and removes the Hadoop methods from the ExecutionEnvironment.
> After some discussion, it was decided to defer the PR to Flink 2.0 because
> it breaks the API (these methods are declared @PublicEvolving).
>
> I propose to accept this PR for Flink 1.2, but to deprecate the methods
> instead of removing them.
> This would help to migrate old code and prevent new usage of these methods.
> In a later Flink release (1.3 or 2.0) we could remove these methods and
> the Hadoop dependency of flink-java.
>
> What do others think?
>
> Best, Fabian
>
