I think it's important to keep supporting "hive --service metastore" for backwards compatibility. I also think it would be great to have some separation in lib/bin, if possible, so that someone who downloads Hive but only wants to deploy a standalone metastore can easily identify which jars are needed to run the metastore.

On Thu, Jan 25, 2018 at 8:13 AM, Alexander Kolbasov <ak...@cloudera.com> wrote:

> Alan,
>
> While continuing shipping HMS with Hive makes sense (at least for a while),
> what do you think about somehow separating lib/bin directories created in
> the distro so Hive and metastore have a separate set of bin/lib dirs?
>
> - Alex
>
> On Wed, Jan 24, 2018 at 12:16 PM, Alan Gates <alanfga...@gmail.com> wrote:
>
> > In HIVE-17983 I have been working on packaging and start/stop scripts for
> > the standalone metastore. One question this brings up is how Hive will be
> > released now, with or without the metastore. I can see two options:
> >
> > 1) We continue to ship the metastore with Hive. Not only does this mean
> > the metastore code is in the Hive source code release and the metastore
> > jars are in the Hive binary distribution, but scripts like metastore.sh are
> > still included in Hive's bin directory, so that Hive admins can still do
> > 'hive --service metastore' to start the metastore. I see the following
> > advantages of this:
> > a) it is completely backwards compatible;
> > b) it is what users would expect (I have installed many databases and never
> > been asked to first install a separate package for its data catalog or any
> > other essential piece);
> > c) this will still be the metastore's most frequent use case for at least
> > the near future.
> >
> > The disadvantage is it is error prone when Hive is set up to connect to a
> > separate metastore. An operator could easily start the metastore in the
> > Hive package, not realizing Hive is configured to connect to a different
> > one.
> >
> > 2) We remove the metastore from the packaging completely like we do Hadoop
> > and require the user to install it separately. The advantages and
> > disadvantages of this exactly mirror those of option 1.
> >
> > Based on both the 80/20 rule (most metastore users will still be single
> > system Hive users) and the law of least astonishment (people expect a
> > database to have a data catalog) I vote for option 1.
> >
> > Anyone strongly feel we should do 2 instead?
> >
> > Any other options I haven't considered?
> >
> > Alan.