Rushabh S Shah created HDFS-11804:
-------------------------------------

             Summary: KMS client needs retry logic
                 Key: HDFS-11804
                 URL: https://issues.apache.org/jira/browse/HDFS-11804
             Project: Hadoop HDFS
          Issue Type: Bug
    Affects Versions: 2.6.0
            Reporter: Rushabh S Shah
            Assignee: Rushabh S Shah

The KMS client appears to have no retry logic at all. It is completely decoupled from the IPC retry logic. This has a major impact whenever the KMS is unreachable for any reason, including but not limited to network connection issues, timeouts, and the +restart during an upgrade+.

The main ramifications are:
# Jobs may fail to submit, although Oozie resubmit logic should mask it.
# Non-Oozie launchers may experience higher failure rates if they do not already have retry logic.
# Tasks reading EZ (encryption zone) files will fail, though this will probably be masked by framework reattempts.
# EZ file creation fails after creating a 0-length file: the client receives the EDEK in the create response, then fails when decrypting the EDEK.
# Bulk hadoop fs copies, and maybe distcp, will fail prematurely.
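
For illustration only, the kind of client-side retry the report asks for could look roughly like the sketch below. It is not the KMS client's code and not a proposed patch: the class name KmsRetrySketch, the method callWithRetry, and its parameters are all hypothetical, and the simulated failure in main stands in for an unreachable KMS. It retries an operation on IOException with exponential backoff and gives up after a fixed number of attempts.

```java
import java.io.IOException;
import java.net.ConnectException;
import java.util.concurrent.Callable;

// Hypothetical sketch: none of these names come from Hadoop.
public class KmsRetrySketch {

    /**
     * Runs op, retrying on IOException with exponential backoff.
     * Rethrows the last IOException once maxAttempts is exhausted.
     * Exceptions other than IOException are not retried.
     */
    public static <T> T callWithRetry(Callable<T> op, int maxAttempts,
                                      long initialBackoffMs) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        long backoffMs = initialBackoffMs;
        IOException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return op.call();
            } catch (IOException e) {
                last = e;
                if (attempt == maxAttempts) {
                    break; // out of attempts, fall through and rethrow
                }
                Thread.sleep(backoffMs);
                backoffMs *= 2; // double the wait before the next attempt
            }
        }
        throw last;
    }

    public static void main(String[] args) throws Exception {
        int[] calls = {0};
        // Simulated operation: fails twice as if the KMS were unreachable,
        // then succeeds on the third call.
        String result = callWithRetry(() -> {
            if (++calls[0] < 3) {
                throw new ConnectException("KMS unreachable (simulated)");
            }
            return "decrypted-key";
        }, 5, 10);
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```

A bounded attempt count with backoff is one simple choice here; it would ride out a short outage such as a restart during an upgrade without retrying forever when the KMS is down for good. Whether retries belong in the KMS client itself or should reuse the existing IPC retry machinery is the open question this issue raises.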