Bharat Viswanadham created HDDS-1476:
----------------------------------------

Summary: Fix logIfNeeded logic in EndPointStateMachine
Key: HDDS-1476
URL: https://issues.apache.org/jira/browse/HDDS-1476
Project: Hadoop Distributed Data Store
Issue Type: Bug
Reporter: Bharat Viswanadham

{code:java}
public void logIfNeeded(Exception ex) {
  LOG.trace("Incrementing the Missed count. Ex : {}", ex);
  this.incMissed();
  if (this.getMissedCount() % getLogWarnInterval(conf) == 0) {
    LOG.error(
        "Unable to communicate to SCM server at {} for past {} seconds.",
        this.getAddress().getHostString() + ":" + this.getAddress().getPort(),
        TimeUnit.MILLISECONDS.toSeconds(
            this.getMissedCount() * getScmHeartbeatInterval(this.conf)),
        ex);
  }
}
{code}

This method is called whenever an exception occurs in the state machine, in order to log it. To avoid logging too aggressively, the ozone.scm.heartbeat.log.warn.interval.count property controls how often the error is logged.

There is a small issue here: the exception is not logged the first time it occurs. The missed count is incremented before the interval check, so the first check sees a count of 1, which is not a multiple of the interval unless the interval is 1. We need to log on the first occurrence and only then increment the missed count.

The fix is to move this.incMissed() to the end of the method, so that the first exception is logged and, after that, every log.warn.interval.count-th exception.
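
As a rough illustration of the ordering change described above, here is a minimal, self-contained Java sketch. It does not touch the real EndpointStateMachine: the class MissedCountLogSketch, its incrementFirst flag and the loggedAt list are hypothetical names introduced only for this example, and the interval of 10 is an assumed value rather than a confirmed default. Instead of calling LOG.error, it records which exception numbers would have been logged.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Hypothetical stand-in for the missed-count bookkeeping; NOT the real
 * EndpointStateMachine. It only models the order of "increment" and "check".
 */
public class MissedCountLogSketch {
  // Plays the role of ozone.scm.heartbeat.log.warn.interval.count (assumed value).
  private final int logWarnInterval;
  // true  = ordering shown in the issue (increment, then check)
  // false = proposed ordering (check, then increment)
  private final boolean incrementFirst;
  private long missedCount = 0;
  private long exceptionsSeen = 0;
  // 1-based numbers of the exceptions that would have been logged.
  private final List<Long> loggedAt = new ArrayList<>();

  public MissedCountLogSketch(int logWarnInterval, boolean incrementFirst) {
    this.logWarnInterval = logWarnInterval;
    this.incrementFirst = incrementFirst;
  }

  public void logIfNeeded(Exception ex) {
    exceptionsSeen++;
    if (incrementFirst) {
      missedCount++;
    }
    if (missedCount % logWarnInterval == 0) {
      // The real method would call LOG.error(...) here.
      loggedAt.add(exceptionsSeen);
    }
    if (!incrementFirst) {
      missedCount++;
    }
  }

  public List<Long> getLoggedAt() {
    return loggedAt;
  }

  public static void main(String[] args) {
    MissedCountLogSketch current = new MissedCountLogSketch(10, true);
    MissedCountLogSketch proposed = new MissedCountLogSketch(10, false);
    for (int i = 1; i <= 25; i++) {
      Exception ex = new RuntimeException("simulated failure " + i);
      current.logIfNeeded(ex);
      proposed.logIfNeeded(ex);
    }
    System.out.println("increment first: " + current.getLoggedAt());  // [10, 20]
    System.out.println("check first:     " + proposed.getLoggedAt()); // [1, 11, 21]
  }
}
```

With an interval of 10 and 25 consecutive failures, the increment-first ordering would log only on the 10th and 20th exceptions, while the check-first ordering logs on the 1st, 11th and 21st. One side effect of checking first is that the elapsed time computed from the missed count is zero for the first logged message.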