Daniel Templeton created HDFS-13913:
---------------------------------------

             Summary: LazyPersistFileScrubber.run() error handling is poor
                 Key: HDFS-13913
                 URL: https://issues.apache.org/jira/browse/HDFS-13913
             Project: Hadoop HDFS
          Issue Type: Improvement
          Components: namenode
    Affects Versions: 3.1.0
            Reporter: Daniel Templeton
            Assignee: Daniel Green


In {{LazyPersistFileScrubber.run()}} we have:

{code}
        try {
          clearCorruptLazyPersistFiles();
        } catch (Exception e) {
          FSNamesystem.LOG.error(
              "Ignoring exception in LazyPersistFileScrubber:", e);
        }
{code}

The first problem is that catching {{Exception}} is too broad: it also swallows 
unchecked exceptions that may indicate bugs.  It should instead be a multicatch 
for the exceptions actually thrown or, better, a set of separate catch 
statements that react appropriately to each type of exception.
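
As a rough illustration of the separate-catch idea, here is a minimal sketch.  
It assumes the scrub call can throw {{IOException}} as well as unchecked 
exceptions.  The {{ScrubberCatchSketch}} class, the {{ScrubTask}} interface, and 
the {{Outcome}} enum are hypothetical stand-ins, not HDFS code, and 
{{java.util.logging}} is used only to keep the example self-contained.

```java
import java.io.IOException;
import java.util.logging.Level;
import java.util.logging.Logger;

/** Hypothetical stand-in for the scrubber loop body; not the actual HDFS code. */
class ScrubberCatchSketch {
  private static final Logger LOG = Logger.getLogger("ScrubberCatchSketch");

  /** Stand-in for clearCorruptLazyPersistFiles(); assumed to throw IOException. */
  interface ScrubTask {
    void run() throws IOException;
  }

  /** Hypothetical result type so callers (and tests) can see which path ran. */
  enum Outcome { OK, IO_FAILURE, UNEXPECTED_FAILURE }

  static Outcome scrubOnce(ScrubTask task) {
    try {
      task.run();
      return Outcome.OK;
    } catch (IOException e) {
      // Assumed to be the expected, transient failure: warn and let the next run retry.
      LOG.log(Level.WARNING, "Scrub pass failed with an I/O error; will retry next run", e);
      return Outcome.IO_FAILURE;
    } catch (RuntimeException e) {
      // Unchecked exceptions are handled separately so they stand out in the log.
      LOG.log(Level.WARNING, "Scrub pass failed unexpectedly; will retry next run", e);
      return Outcome.UNEXPECTED_FAILURE;
    }
  }
}
```

Both branches keep the thread alive, but each one can choose its own log level 
and wording instead of sharing a single catch-all handler.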

The second problem is that it's misleading to log an ERROR that's not 
actionable and that can be safely ignored.  The message should be logged at 
WARN or INFO level instead.

Third, the log message is useless.  If it's going to be a WARN or ERROR, a log 
message should be actionable; otherwise it belongs at INFO.  In either case, it 
should contain enough information for an admin to understand what it means.
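
One possible shape for a more informative message is sketched below.  The 
{{ScrubberLogMessageSketch}} class, the {{describeFailure}} helper, and the 
{{retryIntervalSec}} parameter are hypothetical names chosen for illustration, 
and the wording is only a suggestion.

```java
/** Hypothetical helper that builds a self-explanatory log message; not HDFS code. */
class ScrubberLogMessageSketch {

  /**
   * Says what failed, why, and what happens next, so an admin can judge
   * whether any action is needed.
   */
  static String describeFailure(Exception cause, int retryIntervalSec) {
    return String.format(
        "LazyPersistFileScrubber could not clear corrupt lazy-persist files (%s: %s). "
            + "The scan will be retried in %d seconds; "
            + "no action is needed unless this message repeats.",
        cause.getClass().getSimpleName(), cause.getMessage(), retryIntervalSec);
  }
}
```

A message along these lines could be logged at WARN, since it tells the reader 
both that the failure is tolerated and when it would become worth investigating.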

In the end, I think the right thing here is to leave the high-level behavior 
unchanged: log a message and ignore the error, hoping that the next run will go 
better.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

