Derby depends on a local filestore. For more flexibility and security, I suggest MySQL as the metastore.
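
For reference, here is a minimal sketch of the hive-site.xml settings that point the metastore at MySQL. The host name, database name, user, and password are placeholders, not values from this thread, and it assumes the MySQL JDBC driver jar is already on Hive's classpath.

```xml
<!-- hive-site.xml on the metastore host: sketch of a MySQL-backed metastore.
     Host, database, user, and password below are placeholders (assumptions). -->
<configuration>
  <property>
    <!-- JDBC URL of the MySQL database holding the metastore tables -->
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://metastore-db.example.com:3306/hive_metastore?createDatabaseIfNotExist=true</value>
  </property>
  <property>
    <!-- MySQL Connector/J driver class -->
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>hive</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>CHANGE_ME</value>
  </property>
</configuration>
```

Note that this only moves the metastore's metadata out of Derby. It does not by itself change which user the HDFS operations run as.
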
- Alex

On Tue, Nov 29, 2011 at 3:06 AM, Alex Holmes <grep.a...@gmail.com> wrote:
> Hi,
>
> I'm running Hive 0.7.1 with a remote metastore (Derby) on Hadoop 0.20.2.
>
> Is there a reason that CREATE and DROP commands when translated into
> HDFS operations are run as the remote Hive metastore user, but a LOAD
> is translated into HDFS operations that are executed as the Hive
> client user? If my understanding is correct, doesn't this mean that:
>
> 1. The Hive remote metastore must always be run as a superuser, which
> is arguably a security risk. If I run the Hive remote metastore as a
> non-superuser different from the Hive client user, then a LOAD DATA
> LOCAL (with the HDFS umask default of 022) creates a directory chmod'd
> 755, which doesn't give the Hive metastore user permissions to remove
> the directory in a subsequent DROP.
>
> 2. The Hive client must have write permissions on the initial table
> directory created by the CREATE command executed as the Hive remote
> metastore user. This would only work in cases where both the remote
> Hive metastore user and the client Hive user were the same user, or if
> the Hive client were a superuser. In my own testing the only way I
> could get this to work when they were different users (and not
> superusers) was in the application of a locally written patch which
> addresses HIVE-2504.
>
> Maybe I'm over-simplifying, but couldn't all the Hive remote metastore
> HDFS operations be run as the Hive client's user/group?
>
> Thanks,
> Alex
>

--
Alexander Lorenz
http://mapredit.blogspot.com