LambdaTestUtils added the lambda-expression based intercept() in
HADOOP-13716, in October 2016. Five years ago. That was when we were still
on Java 7... we added it knowing what Java 8 would bring.

There is no way we could go back to not using intercept() in tests.
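
For anyone who hasn't used it, the idea behind an intercept()-style helper
is roughly the following. This is a minimal sketch and not the
LambdaTestUtils code: the class name InterceptSketch and the exact
signature are made up for illustration.

```java
import java.util.concurrent.Callable;

/** Hypothetical stand-in for an intercept()-style test helper. */
final class InterceptSketch {

  private InterceptSketch() {
  }

  /**
   * Evaluate a closure and expect it to throw an exception of the given class.
   * @return the caught exception, so the test can make further assertions.
   */
  static <T, E extends Throwable> E intercept(Class<E> clazz, Callable<T> eval)
      throws Exception {
    T result;
    try {
      result = eval.call();
    } catch (Throwable t) {
      if (clazz.isInstance(t)) {
        // the expected failure: hand it back to the caller
        return clazz.cast(t);
      }
      // anything else is rethrown so the test fails with the real cause
      if (t instanceof Exception) {
        throw (Exception) t;
      }
      if (t instanceof Error) {
        throw (Error) t;
      }
      throw new AssertionError("Unexpected throwable: " + t, t);
    }
    // the closure returned normally: that is a test failure
    throw new AssertionError("Expected " + clazz.getName()
        + " but the call returned: " + result);
  }
}
```

A test then becomes a one-liner along the lines of
intercept(FileNotFoundException.class, () -> fs.open(path)), where fs and
path are whatever the test has set up.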

Since then, other big lambda-expression work I've been involved in includes
org.apache.hadoop.fs.s3a.Invoker, which lets you execute a remote
operation with conversion of AWS SDK exceptions into IOEs, and a retry
policy based off those IOEs.

    final String region = invoker.retry("getBucketLocation()", bucketName,
        true,
        () -> s3.getBucketLocation(bucketName));

That was HADOOP-13786, S3A committers: 2017. Four years ago, which was
after branch-3 went Java 8 only.
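
The general shape of that pattern is roughly as follows. This is a minimal
sketch and not the Invoker code: RetrySketch, Operation and the retryable
predicate are made-up names, there is no sleep/backoff between attempts,
and any RuntimeException stands in for an SDK exception.

```java
import java.io.IOException;
import java.util.function.Predicate;

/** Hypothetical sketch of retry with exception translation. */
final class RetrySketch {

  /** A remote operation which may raise an IOE or an unchecked exception. */
  @FunctionalInterface
  interface Operation<T> {
    T execute() throws IOException;
  }

  private RetrySketch() {
  }

  /**
   * Invoke the operation, translating unchecked failures into IOEs and
   * retrying while the predicate says the IOE is worth retrying.
   */
  static <T> T retry(String action, int maxAttempts,
      Predicate<IOException> retryable, Operation<T> operation)
      throws IOException {
    for (int attempt = 1; ; attempt++) {
      try {
        return operation.execute();
      } catch (RuntimeException | IOException e) {
        // translate: IOEs pass through unchanged, everything else is wrapped
        IOException ioe = (e instanceof IOException)
            ? (IOException) e
            : new IOException(action + " failed: " + e, e);
        // the retry decision is made on the translated IOE only
        if (attempt >= maxAttempts || !retryable.test(ioe)) {
          throw ioe;
        }
      }
    }
  }
}
```

The point is that the caller only supplies the closure; translation and
the retry decision live in one place.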

More recently, if you look closely, the whole
org.apache.hadoop.util.functional package is designed to give us basic
Functional Programming around IOE-raising code, including our remote
iterators, giving us a minimal *and tested* set of transformations we can
do with our code.

  public RemoteIterator<S3ALocatedFileStatus> createLocatedFileStatusIterator(
      RemoteIterator<S3AFileStatus> statusIterator) {
    return RemoteIterators.mappingRemoteIterator(
        statusIterator,
        listingOperationCallbacks::toLocatedFileStatus);
  }
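
What a mapping iterator does underneath is small enough to sketch. This is
an illustration only, not the RemoteIterators code: IOIterator and
IOFunction are made-up stand-ins for an iterator and a function whose
methods can raise IOEs.

```java
import java.io.IOException;
import java.util.Iterator;

/** Hypothetical sketch of mapping over an IOE-raising iterator. */
final class MappingIteratorSketch {

  /** Iterator whose operations may raise an IOException. */
  interface IOIterator<T> {
    boolean hasNext() throws IOException;

    T next() throws IOException;
  }

  /** Function which may raise an IOException. */
  @FunctionalInterface
  interface IOFunction<S, T> {
    T apply(S source) throws IOException;
  }

  private MappingIteratorSketch() {
  }

  /** Lazily apply the mapper to each element as it is fetched. */
  static <S, T> IOIterator<T> mapping(IOIterator<S> source,
      IOFunction<? super S, ? extends T> mapper) {
    return new IOIterator<T>() {
      @Override
      public boolean hasNext() throws IOException {
        return source.hasNext();
      }

      @Override
      public T next() throws IOException {
        // IOEs from either the source or the mapper propagate to the caller
        return mapper.apply(source.next());
      }
    };
  }

  /** Wrap an in-memory collection, mainly for tests. */
  static <T> IOIterator<T> fromIterable(Iterable<T> values) {
    Iterator<T> it = values.iterator();
    return new IOIterator<T>() {
      @Override
      public boolean hasNext() {
        return it.hasNext();
      }

      @Override
      public T next() {
        return it.next();
      }
    };
  }
}
```

Nothing is evaluated until the caller pulls the next element, which is what
you want when each hasNext() may be a paged request to a remote store.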

The functional package ties in nicely with the duration tracking/IOStatistics
code it came in with (HADOOP-17450), so I can evaluate an operation and
collect min/mean/max durations of operations, and not just log them but
serialize them into the task/job summary files, and so get some details on
where the bottlenecks are in talking to cloud services.

    final RemoteIterator<FileStatus> listing =
        trackDuration(iostatistics, OP_DIRECTORY_SCAN, () ->
            operations.listStatusIterator(srcDir));
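
The mechanics of wrapping a closure with timing are roughly as below. Again
a minimal sketch, not the IOStatistics implementation:
DurationTrackingSketch, IOCallable and snapshot() are made-up names, and
this version records failed invocations alongside successful ones.

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.LongSummaryStatistics;
import java.util.Map;

/** Hypothetical sketch of duration tracking around an IOE-raising closure. */
final class DurationTrackingSketch {

  /** Callable which may raise an IOException. */
  @FunctionalInterface
  interface IOCallable<T> {
    T call() throws IOException;
  }

  /** Durations in nanoseconds, keyed by operation name. */
  private final Map<String, LongSummaryStatistics> durations = new HashMap<>();

  /** Invoke the operation and record how long it took, even if it failed. */
  <T> T trackDuration(String key, IOCallable<T> operation) throws IOException {
    final long start = System.nanoTime();
    try {
      return operation.call();
    } finally {
      record(key, System.nanoTime() - start);
    }
  }

  private synchronized void record(String key, long nanos) {
    durations.computeIfAbsent(key, k -> new LongSummaryStatistics())
        .accept(nanos);
  }

  /** Copy of the count/min/mean/max for a key; empty if never recorded. */
  synchronized LongSummaryStatistics snapshot(String key) {
    LongSummaryStatistics copy = new LongSummaryStatistics();
    LongSummaryStatistics current = durations.get(key);
    if (current != null) {
      copy.combine(current);
    }
    return copy;
  }
}
```

The snapshot gives getCount(), getMin(), getAverage() and getMax(), which
is the kind of data you would then log or serialize.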


So I'm afraid that I will be carrying on using lambda expressions, such as
in HADOOP-17511. But I don't expect any of the code there to be
backportable to Java 7 (*).

At the same time, I'd like to know what the performance impact of us using
lambda expressions is, in terms of cost of allocations of closures,
evaluation etc. There's also the *little* detail that stack trace data
doesn't get preserved that well. Together that argues against gratuitous
use of Java streams.


To summarise my PoV, then:

Java 8 lambda expressions are an incredible tool which can be used in
interesting and innovative ways. Adding retryability, stats gathering and
auditing of remote IO are the key things I've been using them for in the
Hadoop codebase, for 4-5 years.

I'm happy to let someone lay out a style guide on good/bad uses, a "no
gratuitous move to streams()" policy, and maybe a designated "No Lambdas
here" bit of code. (UGI?)

But a discussion about whether to have them in the code at all? Not only
is it too late, I don't see how that can be justified.

-Steve

(*) Having recently been backporting some ABFS code to a branch-3.1 fork,
I found that Mockito version downgrading is enough of a blocker on test
cases there that the language version is a detail... you won't get that far.
