But don't the clients always pick the first URI when multiple instances are listed in the "hive.metastore.uris" config, and fall back to the others only if the first is unreachable? If so, we would still have a bottleneck, right? Can you give a little more information on your setup and how you enable load balancing? I think I am missing something here.
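
To make the concern concrete, here is a minimal sketch of the behaviour I am assuming. It is a toy simulation, not the Hive client code: the host names, the port and the function names are all made up for illustration, and whether the real client walks the list in order may depend on the Hive version. It compares "try the URIs in configured order" with "shuffle the list per client, then try in order".

```python
import random
from collections import Counter

# Hypothetical metastore endpoints; hosts and port are placeholders.
URIS = ["thrift://ms1:9083", "thrift://ms2:9083", "thrift://ms3:9083"]


def pick_sequential(uris, reachable):
    """Return the first reachable URI, trying them in the given order."""
    for uri in uris:
        if uri in reachable:
            return uri
    raise ConnectionError("no metastore reachable")


def pick_shuffled(uris, reachable, rng):
    """Same fallback loop, but over a per-client shuffled copy of the list."""
    order = list(uris)
    rng.shuffle(order)
    return pick_sequential(order, reachable)


def simulate(n_clients, reachable, seed=0):
    """Count how many simulated clients land on each instance."""
    rng = random.Random(seed)
    seq = Counter(pick_sequential(URIS, reachable) for _ in range(n_clients))
    shuf = Counter(pick_shuffled(URIS, reachable, rng) for _ in range(n_clients))
    return seq, shuf


if __name__ == "__main__":
    seq, shuf = simulate(1000, set(URIS))
    print("sequential:", dict(seq))
    print("shuffled:  ", dict(shuf))
```

With every instance reachable, the sequential strategy sends all simulated clients to the first URI, which is the bottleneck I am worried about. The shuffled variant spreads them across the instances, and a load balancer in front of the metastores would presumably have a similar effect.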

Thanks,
Udit

On Wed, Mar 30, 2016 at 3:20 PM, Gautam <gautamkows...@gmail.com> wrote:

> The metastore service is a Java process that is a Thrift server, so you
> can run multiple such Hive metastore instances with
> "javax.jdo.option.ConnectionURL" pointing to the same MySQL db.
>
> On Wed, Mar 30, 2016 at 3:11 PM, Mich Talebzadeh <
> mich.talebza...@gmail.com> wrote:
>
>> Can you clarify this please
>>
>> "Have you tried putting multiple metastores behind a load balancer"
>>
>> Are you implying that metastore and backend DB are different entities
>> here?
>>
>> As far as I know, $HIVE_HOME/bin/hive --service metastore & starts Hive
>> threads to the backend database/metastore, and Hive server2 acts as a
>> gateway for remote access to Hive metastore through beeline or other
>> clients.
>>
>> There is only one metastore here, namely MySQL/Oracle or others.
>>
>> Thanks
>>
>> Dr Mich Talebzadeh
>>
>> LinkedIn
>> https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>>
>> http://talebzadehmich.wordpress.com
>>
>> On 30 March 2016 at 22:53, Gautam <gautamkows...@gmail.com> wrote:
>>
>>> Can you elaborate on where you see the bottleneck? A general overview
>>> of your access path would be useful. For instance, if you're accessing
>>> Hive metastore via HiveServer2 or from webhcat using embedded cli or
>>> something else.
>>>
>>> Have you tried putting multiple metastores behind a load balancer? It's
>>> just a thrift service over mysql so can have multiple instances
>>> pointing to same backend db.
>>>
>>> On Wed, Mar 30, 2016 at 2:28 PM, Udit Mehta <ume...@groupon.com> wrote:
>>>
>>>> Hi all,
>>>>
>>>> We are currently running Hive in production and staging with the
>>>> metastore connecting to a MySQL database in the backend. The traffic in
>>>> production accessing the metastore is more than staging, which is
>>>> expected.
>>>> We have had a sudden increase in traffic which has led to the metastore
>>>> operations taking a lot longer than before. The same query on staging
>>>> takes a lot less due to the lesser traffic on the staging cluster.
>>>>
>>>> We tried increasing the heap space for the metastore process as well as
>>>> bumped up the memory for the MySQL database. Both these changes did not
>>>> seem to help much and we still see delays. Is there any other config we
>>>> can increase to counter this increased traffic? I am looking at config
>>>> for max threads as well but I'm not sure if this is the right path
>>>> ahead.
>>>>
>>>> I'm wondering if the metastore is a bottleneck here or I'm missing
>>>> something.
>>>>
>>>> Looking forward to your reply,
>>>> Udit
>>>
>>> --
>>> "If you really want something in this life, you have to work for it.
>>> Now, quiet! They're about to announce the lottery numbers..."
>
> --
> "If you really want something in this life, you have to work for it. Now,
> quiet! They're about to announce the lottery numbers..."