[ https://issues.apache.org/jira/browse/HIVE-16686?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16013179#comment-16013179 ]
Sushanth Sowmyan commented on HIVE-16686:
-----------------------------------------

This introduces parsing of additional parameters that are not used directly by Hive but are passed on to distcp when Hive invokes it. Users can now issue "set" commands through the hive command to pass CLI arguments along to distcp. Any parameter set as "set distcp.options.blah=''" results in an extra "-blah" argument to distcp, and any parameter set as "set distcp.options.foo='bar'" results in an extra "-foo bar" argument to distcp.

Currently, we always pass "-update" and "-skipcrccheck" to distcp. These are retained as defaults if no distcp.options.* parameters are found. If any are found, these options are not added by default, letting the user provide an explicit list instead.

In addition, one new special parameter, "distcp.option.privilegedUser", is being added. It is not passed along to distcp. Instead, if this parameter is specified and the user it names is different from the current user, Hive will run distcp inside an impersonation context as that user. This requires that the invoking user have impersonation proxy privileges (something an HS2 instance typically has, but a regular end user does not).

Note that all of these properties affect how distcp runs when it is launched by Hive, but they are not Hive settings as such. Hive simply allows setting them through the "set" command.
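To make the rules above concrete, here is a minimal Java sketch of the mapping from "distcp.options.*" settings to distcp arguments, plus the check for when impersonation would apply. It is not the code from the patch: the class name DistcpOptionsSketch and the methods buildArgs and shouldImpersonate are hypothetical, and it assumes the values arrive already unquoted and that the settings are held in an insertion-ordered map.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: hypothetical names, not the actual Hive implementation.
public class DistcpOptionsSketch {
    // Prefix quoted in the comment above.
    static final String PREFIX = "distcp.options.";

    // Turns "distcp.options.*" settings into distcp command-line arguments.
    // An empty value yields a bare flag ("-blah"); a non-empty value yields
    // a flag followed by its value ("-foo", "bar").
    static List<String> buildArgs(Map<String, String> conf) {
        List<String> args = new ArrayList<>();
        for (Map.Entry<String, String> e : conf.entrySet()) {
            if (!e.getKey().startsWith(PREFIX)) {
                continue; // not a distcp pass-through option
            }
            args.add("-" + e.getKey().substring(PREFIX.length()));
            String value = e.getValue();
            if (value != null && !value.isEmpty()) {
                args.add(value);
            }
        }
        // The defaults apply only when the user supplied no options at all.
        if (args.isEmpty()) {
            args.add("-update");
            args.add("-skipcrccheck");
        }
        return args;
    }

    // Impersonation applies only if a privileged user is set and it differs
    // from the user currently running the command.
    static boolean shouldImpersonate(String privilegedUser, String currentUser) {
        return privilegedUser != null
                && !privilegedUser.isEmpty()
                && !privilegedUser.equals(currentUser);
    }

    public static void main(String[] args) {
        Map<String, String> conf = new LinkedHashMap<>();
        conf.put("distcp.options.blah", "");
        conf.put("distcp.options.foo", "bar");
        System.out.println(buildArgs(conf));                   // [-blah, -foo, bar]
        System.out.println(buildArgs(new LinkedHashMap<>()));  // [-update, -skipcrccheck]
        System.out.println(shouldImpersonate("hdfs", "hive")); // true
    }
}
```

Note that supplying even a single option drops both defaults, so a user who still wants "-update" has to set it explicitly alongside the other options.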
> repli invocations of distcp needs additional handling
> -----------------------------------------------------
>
> Key: HIVE-16686
> URL: https://issues.apache.org/jira/browse/HIVE-16686
> Project: Hive
> Issue Type: Sub-task
> Components: repl
> Reporter: Sushanth Sowmyan
> Assignee: Sushanth Sowmyan
> Labels: TODOC3.0
>
> When REPL LOAD invokes distcp, there needs to be a way for the user invoking
> REPL LOAD to pass on arguments to distcp. In addition, there is sometimes a
> need for distcp to be invoked from within an impersonated context, such as
> running as user "hdfs", asking distcp to preserve ownerships of individual
> files.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)