Thanks for starting the discussion, Steve. This is a prickly issue and unfortunately we are hostages to past decisions. Thanks a lot for attacking the problem in the first place and sticking with it.
In my experience we have found a lot of places where AWS secrets were logged for everyone to see. I'm not sure allowing people to do that is the right thing to do in the long term. We have to bite the bullet sometime. Perhaps we should do that in trunk (3.0.0)?

To unbreak clients of Hadoop-2.x we can go with Vinayakumar's proposal, but only in branch-2. Of course, technically hadoop-2.8.0 is already out with this change, but I agree we can put the fix in 2.8.2.

$0.02
Ravi

On Wed, Aug 2, 2017 at 5:52 AM, Steve Loughran <ste...@hortonworks.com> wrote:

> HADOOP-3733<https://issues.apache.org/jira/browse/HADOOP-3733> stripped
> out the user:password secret from the s3, s3a, s3n URLs on security
> grounds: everything logged Path entries without ever considering that
> they contained secret credentials.
>
> But that turns out to break things, as noted in HADOOP-14439: you can
> no longer go Path -> String -> Path without authentication details
> being lost, and of course, guess how paths are often marshalled around?
> As strings (after all, they weren't serializable until recently).
>
> Vinayakumar has proposed a patch that goes back to retaining the
> secrets, at least enough for distcp:
>
> https://issues.apache.org/jira/browse/HADOOP-3733?focusedCommentId=16110297&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16110297
>
> I think I'm going to go with this, once I get the tests & testing to go
> with it, and if it's enough to work with Spark too, targeting 2.8.2 if
> it's not too late.
>
> If there's a risk, it's that if someone puts secrets into s3 URIs, the
> secrets are more likely to be logged. But even with the current code,
> there's no way to guarantee that the secrets will never be logged. The
> danger comes from having id:secret credentials in the URI, something
> people will be told off for doing.
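
For anyone following along, here is a rough sketch of the Path -> String -> Path problem from the quoted message. It uses plain java.net.URI rather than Hadoop's Path, so it only imitates the behaviour and is not the actual Hadoop code. The stripSecrets helper, the credentials and the bucket name are all made up for illustration.

```java
import java.net.URI;
import java.net.URISyntaxException;

public class SecretRoundTrip {

    // Hypothetical helper: rebuild the URI without its user-info part,
    // imitating the kind of secret stripping discussed in this thread.
    static URI stripSecrets(URI uri) throws URISyntaxException {
        return new URI(uri.getScheme(), null, uri.getHost(), uri.getPort(),
                uri.getPath(), uri.getQuery(), uri.getFragment());
    }

    public static void main(String[] args) throws URISyntaxException {
        // Made-up credentials and bucket name, for illustration only.
        URI original = new URI("s3a://AKIDEXAMPLE:secretvalue@bucket/data/file.txt");

        // Marshal the location as a string with the secrets stripped,
        // then parse it back, as a client passing paths around might do.
        String marshalled = stripSecrets(original).toString();
        URI restored = new URI(marshalled);

        System.out.println("original user info: " + original.getUserInfo());
        System.out.println("marshalled string:  " + marshalled);
        System.out.println("restored user info: " + restored.getUserInfo());
    }
}
```

Once the location has been through a string, the restored URI carries no user info, so whatever consumes it has to find credentials some other way. That is the breakage, and it is also why keeping the secrets in the string makes them easier to leak into logs.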