[ https://issues.apache.org/jira/browse/HDFS-3894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Todd Lipcon resolved HDFS-3894.
-------------------------------
       Resolution: Fixed
    Fix Version/s: QuorumJournalManager (HDFS-3077)
     Hadoop Flags: Reviewed

Committed to branch, thx for review

> QJM: testRecoverAfterDoubleFailures can be flaky due to IPC client caching
> --------------------------------------------------------------------------
>
>                 Key: HDFS-3894
>                 URL: https://issues.apache.org/jira/browse/HDFS-3894
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: test
>    Affects Versions: QuorumJournalManager (HDFS-3077)
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>             Fix For: QuorumJournalManager (HDFS-3077)
>
>         Attachments: hdfs-3894.txt
>
>
> TestQJMWithFaults.testRecoverAfterDoubleFailures fails really occasionally.
> Looking into it, the issue seems to be that it's possible by random chance
> for an IPC server port to be reused between two different iterations of the
> test loop. The client will then pick up and re-use the existing IPC
> connection to the old server. However, the old server was shut down and
> restarted, so the old IPC connection is stale (ie disconnected). This causes
> the new client to get an EOF when it sends the "format()" call.
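
The failure mode described above can be reproduced in miniature. The following is a toy sketch only: ToyServer, ToyConnection, ToyClientCache and port 9000 are hypothetical names invented for illustration, not Hadoop IPC classes, and clearing the cache is just one way to show the stale entry is the culprit, not necessarily what the committed patch does. It assumes a client cache keyed by port alone, so a restarted server on the same port inherits the dead connection.

```java
import java.io.EOFException;
import java.util.HashMap;
import java.util.Map;

// Toy model of the flaky test; none of these types are Hadoop classes.
class StaleIpcCacheDemo {

    /** A stand-in server bound to a port; shutting it down kills its connections. */
    static final class ToyServer {
        final int port;
        private boolean running = true;
        ToyServer(int port) { this.port = port; }
        void shutdown() { running = false; }
        boolean isRunning() { return running; }
    }

    /** A stand-in connection tied to the server instance it was opened against. */
    static final class ToyConnection {
        private final ToyServer server;
        ToyConnection(ToyServer server) { this.server = server; }
        String call(String method) throws EOFException {
            if (!server.isRunning()) {
                throw new EOFException("stale connection to port " + server.port);
            }
            return method + " ok";
        }
    }

    /** Client-side cache keyed only by port (the assumption that causes the bug). */
    static final class ToyClientCache {
        private final Map<Integer, ToyConnection> cache = new HashMap<>();
        ToyConnection get(ToyServer server) {
            // Reuses any existing entry for this port, even if it points at a dead server.
            return cache.computeIfAbsent(server.port, p -> new ToyConnection(server));
        }
        void clear() { cache.clear(); }
    }

    /** Sends a "format" call through the cache and reports "EOF" on a stale connection. */
    static String tryFormat(ToyClientCache cache, ToyServer server) {
        try {
            return cache.get(server).call("format");
        } catch (EOFException e) {
            return "EOF";
        }
    }

    public static void main(String[] args) {
        ToyClientCache cache = new ToyClientCache();

        ToyServer first = new ToyServer(9000);
        System.out.println(tryFormat(cache, first));   // fresh connection succeeds

        first.shutdown();
        ToyServer second = new ToyServer(9000);        // next iteration reuses the port
        System.out.println(tryFormat(cache, second));  // cached connection is stale

        cache.clear();                                 // drop the stale entry
        System.out.println(tryFormat(cache, second));  // new connection succeeds
    }
}
```

In this model the second call fails only because the port happens to repeat; with a different port the cache would miss and a fresh connection would be opened, which matches the "by random chance" flakiness in the report.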