Hi, thanks for your reply. Regarding your statement:
> If you aren't using Hive Server 2, the user acquires tokens before the
> query gets submitted to Yarn.

So is it right to say that Beeline doesn't support this pattern, i.e.
collecting HDFS delegation tokens before submitting the job? Do you know
which other clients/services can support this? Also, do you know whether
HDFS delegation tokens can be obtained for Hive when using Tez instead of
Yarn?

Thank you,

Julien

On Sat, Sep 21, 2019 at 12:52 AM Owen O'Malley <owen.omal...@gmail.com> wrote:

> If you are using Hive Server 2 through jdbc:
>
>    - The most common way is to have the data only accessible to the
>    'hive' user. Since the users don't have access to the underlying HDFS
>    files, Hive can enforce column/row permissions.
>    - The other option is to use doAs and run as the user. That requires
>    giving the 'hive' user proxy privileges.
>
> If you aren't using Hive Server 2, the user acquires tokens before the
> query gets submitted to Yarn.
>
> There are trade-offs in each of the models.
>
> .. Owen
>
> On Fri, Sep 20, 2019 at 9:37 AM Julien Phalip <jpha...@gmail.com> wrote:
>
>> Hi,
>>
>> My understanding is that the most common (perhaps the only?) way to let
>> users run Hive queries on datasets stored in HDFS is to configure Hive
>> as a proxy user in the namenode's config.
>>
>> I'm wondering if, instead of using proxy-user privileges, a Hive client
>> could be configured to first collect HDFS delegation tokens for the
>> user and then pass those tokens to the Hive server. That way, the Hive
>> server would use the tokens to authenticate with HDFS on behalf of the
>> user.
>>
>> Spark offers something similar with the
>> spark.yarn.access.hadoopFileSystems
>> <https://spark.apache.org/docs/latest/running-on-yarn.html#kerberos>
>> property. By chance, is there a way to achieve the same thing for Hive
>> when using a client like Beeline?
>>
>> Thank you,
>>
>> Julien
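For context, the "collect tokens before submitting" pattern discussed in this thread can be sketched with Hadoop's client API. This is a hypothetical illustration, not something Beeline is known to do; the renewer principal ("yarn") is an assumption, and running it requires a Kerberos-secured Hadoop cluster and the Hadoop client jars:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.security.Credentials;
import org.apache.hadoop.security.token.Token;

// Hypothetical sketch: the client collects HDFS delegation tokens up
// front, then ships them with the job so the server side can act as
// this user without needing proxy-user privileges.
Configuration conf = new Configuration();  // picks up core-site.xml etc.
Credentials creds = new Credentials();
FileSystem fs = FileSystem.get(conf);
// "yarn" is an assumed renewer principal for illustration.
Token<?>[] tokens = fs.addDelegationTokens("yarn", creds);
// creds would then be serialized into the job-submission context
// (this last step is what a client like Beeline would have to do).
```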
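For reference, the proxy-user privileges Owen mentions are granted in the namenode's core-site.xml along these lines (the host and group values below are placeholders, not recommendations):

```xml
<!-- Allow the 'hive' user to impersonate other users (example values). -->
<property>
  <name>hadoop.proxyuser.hive.hosts</name>
  <value>hiveserver2-host.example.com</value>
</property>
<property>
  <name>hadoop.proxyuser.hive.groups</name>
  <value>*</value>
</property>
```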
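The Spark property referenced in the original message is set in spark-defaults.conf (or via --conf on spark-submit); the namenode URI here is a placeholder:

```
spark.yarn.access.hadoopFileSystems  hdfs://namenode.example.com:8020
```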