Running MySQL as the metastore doesn't change the behavior of the HDFS operations or, more importantly, which user (the UGI) they are executed as.
Does anyone have any thoughts as to why Hive HDFS operations are run as different users?

Many thoughts,
Alex

On Tue, Nov 29, 2011 at 2:47 AM, Alexander C.H. Lorenz <wget.n...@googlemail.com> wrote:
> Derby depends on a local filestore, for more flexibility and security I
> suggest mysql as a metastore.
> - Alex
>
> On Tue, Nov 29, 2011 at 3:06 AM, Alex Holmes <grep.a...@gmail.com> wrote:
>>
>> Hi,
>>
>> I'm running Hive 0.7.1 with a remote metastore (Derby) on Hadoop 0.20.2.
>>
>> Is there a reason that CREATE and DROP commands when translated into
>> HDFS operations are run as the remote Hive metastore user, but a LOAD
>> is translated into HDFS operations that are executed as the Hive
>> client user? If my understanding is correct, doesn't this mean that:
>>
>> 1. The Hive remote metastore must always be run as a superuser, which
>> is arguably a security risk. If I run the Hive remote metastore as a
>> non-superuser different from the Hive client user, then a LOAD DATA
>> LOCAL (with the HDFS umask default of 022) creates a directory chmod'd
>> 755, which doesn't give the Hive metastore user permissions to remove
>> the directory in a subsequent DROP.
>>
>> 2. The Hive client must have write permissions on the initial table
>> directory created by the CREATE command executed as the Hive remote
>> metastore user. This would only work in cases where both the remote
>> Hive metastore user and the client Hive user were the same user, or if
>> the Hive client were a superuser. In my own testing the only way I
>> could get this to work when they were different users (and not
>> superusers) was in the application of a locally written patch which
>> addresses HIVE-2504.
>>
>> Maybe I'm over-simplifying, but couldn't all the Hive remote metastore
>> HDFS operations be run as the Hive client's user/group?
>>
>> Thanks,
>> Alex
>
>
>
> --
> Alexander Lorenz
> http://mapredit.blogspot.com
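
For reference, the permission arithmetic in point 1 of the quoted message can be sketched as follows. This is a simplified POSIX-style model, not Hive or HDFS code: the user names `client_user` and `metastore_user`, the group `hadoop`, and both helper functions are hypothetical and exist only for illustration.

```python
def mode_after_umask(requested: int, umask: int) -> int:
    """Return the permission bits left after the umask clears its bits."""
    return requested & ~umask & 0o777


def can_write(mode: int, owner: str, group: str,
              user: str, user_groups: set, superuser: bool = False) -> bool:
    """Simplified owner/group/other write check (assumed model)."""
    if superuser:
        return True                    # superusers bypass permission checks
    if user == owner:
        return bool(mode & 0o200)      # owner write bit
    if group in user_groups:
        return bool(mode & 0o020)      # group write bit
    return bool(mode & 0o002)          # other write bit


# A directory requested as 777 under umask 022 ends up as 755.
dir_mode = mode_after_umask(0o777, 0o022)
print(oct(dir_mode))

# Directory created by the client user during LOAD; a different,
# non-superuser metastore user then lacks write access for the DROP.
print(can_write(dir_mode, "client_user", "hadoop",
                "metastore_user", {"hadoop"}))
```

Under these assumptions, the metastore user gets write access only if it owns the directory, if the group write bit is set (for example with a umask of 002), or if it runs as a superuser.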