Hello Team,

I am one of many people responsible for supporting the Hadoop products out
in the field.  Error handling and logging are crucial to my success.  I've
been reading over the code and I see many of the same mistakes again and
again.  I just wanted to bring some of these things to your attention so
that moving forward, we can make these products better.

The general best-practice is:

public class TestExceptionLogging
{
  private static final Logger LOG =
LoggerFactory.getLogger(TestExceptionLogging.class);

  public static void main(String[] args) {
    try {
      processData();
    } catch (Exception e) {
      LOG.error("Application failed", e);
    }
  }

  public static void processData() throws Exception {
    try {
      readData();
    } catch (Exception e) {
      throw new Exception("Failed to process data", e);
    }
  }

  public static byte[] readData() throws Exception {
    throw new IOException("Failed to read device");
  }
}

Produces:

[main] ERROR TestExceptionLogging - Application failed
java.lang.Exception: Failed to process data
at TestExceptionLogging.processData(TestExceptionLogging.java:22)
at TestExceptionLogging.main(TestExceptionLogging.java:12)
Caused by: java.io.IOException: Failed to read device
at TestExceptionLogging.readData(TestExceptionLogging.java:27)
at TestExceptionLogging.processData(TestExceptionLogging.java:20)
... 1 more



Please notice that when an exception is thrown, and caught, it is wrapped
at each level and each level adds some more context to describe what was
happening when the error occurred.  It also produces a complete stack trace
that pinpoints the issue.  For Hive folks, it is rarely the case that a
method consuming a HMS API call should itself throw a MetaException.  The
MetaException has no way of wrapping an underlying Exception and helpful
data is often loss.  A method may chooses to wrap a MetaException, but it
should not be throwing them around as the default behavior.

Also important to note is that there is exactly one place that is doing the
logging.  There does not need to be any logging at the lower levels.  A
catch block should throw or log, not both.  This is an anti-pattern and
annoying as the end user: having to deal with multiple stack traces at
different log levels for the same error condition.  The log message should
be at the highest level only.

https://community.oracle.com/docs/DOC-983543#logAndThrow

Both projects use SLF4J as the logging framework (facade anyway).  Please
familiarize yourself with how to correctly log an Exception.  There is no
need to log a thread name, a time stamp, a class name, or a stack trace.
The logging framework will do that all for you.

http://www.slf4j.org/faq.html#paramException

Again, there is no need to 'stringify' an exception. For example, do not
use this:

https://github.com/apache/hadoop/blob/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/StringUtils.java#L86


If you do however want to dump a stack trace for debugging (or trace)
purposes, consider performing the following:

if (LOG.isDebugEnabled()) {
  LOG.debug("Dump Thread Stack", new Exception("Thread Stack Trace (Not an
Error)"));
}

Finally, I've seen it a couple of times in Apache project that enabling
debug-level logging causes the application to emit logs at other levels,
for example:

LOG.warn("Some error occurred: {}", e.getMessage());
if (LOG.isDebugEnabled()) {
  LOG. warn("Dump Warning Thread Stack", e);
}

Please refrain from doing this.  The inner log statement should be at DEBUG
level to match the check.  Otherwise, when I enable DEBUG logging in the
application, the expectation that I have is that I will have the exact
logging as the INFO level, but I will also have additional DEBUG details as
well.  I am going to be using 'grep' to find DEBUG and I will miss this
additional logging.  I will then be peeved when I say "ah ha! It's a WARN
log message, I don't need DEBUG enabled to get this logging when the issue
happens again!"... but then I restart the application and discover I do in
fact need DEBUG logging enabled.

Finally, finally, do not pass server-generated stack traces to a client
application.  When a client dumps a stack trace, it is not always trivial
to decide if the stack trace is in regards to the client or the server.
This can also be a security issue as it may hint to the client which
libraries and (based on line numbers) which versions of the libraries are
being used.  If the server is going to generate an error in response to a
client API call, the server should log the exception (right before
returning the response to the client) and it should only pass a helpful
error message to the client. The operator/administrator can look in the
server logs for the details.

Thanks!

Reply via email to