Daniel Templeton created HDFS-13913:
---------------------------------------
             Summary: LazyPersistFileScrubber.run() error handling is poor
                 Key: HDFS-13913
                 URL: https://issues.apache.org/jira/browse/HDFS-13913
             Project: Hadoop HDFS
          Issue Type: Improvement
          Components: namenode
    Affects Versions: 3.1.0
            Reporter: Daniel Templeton
            Assignee: Daniel Green


In {{LazyPersistFileScrubber.run()}} we have:

{code}
try {
  clearCorruptLazyPersistFiles();
} catch (Exception e) {
  FSNamesystem.LOG.error(
      "Ignoring exception in LazyPersistFileScrubber:", e);
}
{code}

The first problem is that catching {{Exception}} is sloppy. It should instead be a multicatch for the actual exceptions thrown or, better, a set of separate catch statements that react appropriately to the type of exception.

The second problem is that it's bad to log an ERROR that's not actionable and that can be safely ignored. The log message should be logged at WARN or INFO level.

Third, the log message is useless. If it's going to be a WARN or ERROR, a log message should be actionable. Otherwise it's an INFO. A log message should contain enough information for an admin to understand what it means.

In the end, I think the right thing here is to leave the high-level behavior unchanged: log a message and ignore the error, hoping that the next run will go better.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
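
The direction described in this issue (separate catch blocks, a lower log level, and a message an admin can act on) could look roughly like the sketch below. It is illustrative only and is not the actual patch: {{ScrubberSketch}}, {{ScrubTask}} and {{runOnce}} are hypothetical names, the assumption that the scrub step throws {{IOException}} is not confirmed by the issue text, and {{java.util.logging}} stands in for the NameNode's real logger so that the example is self-contained.

```java
import java.io.IOException;
import java.util.logging.Level;
import java.util.logging.Logger;

class ScrubberSketch {
    /** Hypothetical stand-in for clearCorruptLazyPersistFiles(). */
    interface ScrubTask {
        void run() throws IOException; // assumed checked exception type
    }

    // java.util.logging names its levels WARNING and SEVERE; they play the
    // role of WARN and ERROR in this sketch.
    static final Logger LOG = Logger.getLogger("ScrubberSketch");

    /** Runs one scrub pass; any failure is logged and otherwise ignored. */
    static void runOnce(ScrubTask task) {
        try {
            task.run();
        } catch (IOException e) {
            // Expected, recoverable failure: say what happened and what
            // happens next, at a level that does not demand action.
            LOG.log(Level.WARNING,
                "LazyPersistFileScrubber could not clear corrupt lazy-persist "
                + "files on this pass; it will retry on the next run.", e);
        } catch (RuntimeException e) {
            // Assumption: an unchecked exception likely indicates a bug, so
            // it is logged more loudly, but the scrubber still keeps going.
            LOG.log(Level.SEVERE,
                "LazyPersistFileScrubber hit an unexpected error; this pass "
                + "was skipped and the next run will try again.", e);
        }
    }

    public static void main(String[] args) {
        runOnce(() -> { throw new IOException("simulated failure"); });
        runOnce(() -> { /* a pass that succeeds logs nothing */ });
    }
}
```

The high-level behavior is unchanged from the original snippet: every failure is logged and swallowed so the next run can try again. Only the exception types, the log level and the wording of the message differ.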