Yikes: I personally found that the most problematic thing was hiveserver +
zk locking; if you do not need that, turn it off. Other than that, we just
wrote a good Nagios check. It runs a query (one that does not invoke a
MapReduce job). That seems to spot the problems quickly and lets our ops
restart the bad instance.
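
A minimal sketch of a check along these lines, assuming Python and the beeline client. The JDBC URL, the SHOW DATABASES probe, and the 30-second timeout are placeholders for illustration, not details from this thread. The idea is only that a metadata-style statement with a hard timeout should catch both a crashed and a hung server:

```python
import subprocess
import sys

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3

# Assumed probe: a metadata-only statement that should not start a MapReduce job.
# The JDBC URL is a placeholder; point it at the instance being monitored.
DEFAULT_COMMAND = [
    "beeline", "-u", "jdbc:hive2://localhost:10000", "-e", "SHOW DATABASES;",
]


def run_check(command, timeout_seconds=30):
    """Run the probe command and map the outcome to a Nagios (code, message) pair."""
    try:
        result = subprocess.run(
            command,
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            timeout=timeout_seconds,
        )
    except subprocess.TimeoutExpired:
        # A hung server never answers, so a timeout is treated as critical.
        return CRITICAL, "CRITICAL - query did not finish in %ss" % timeout_seconds
    except OSError as exc:
        # The client binary is missing or not executable on the monitoring host.
        return UNKNOWN, "UNKNOWN - could not run probe: %s" % exc
    if result.returncode != 0:
        return CRITICAL, "CRITICAL - probe exited with status %d" % result.returncode
    return OK, "OK - query returned"


def main():
    """Plugin entry point: print one status line and return the exit code."""
    code, message = run_check(DEFAULT_COMMAND)
    print(message)
    return code

# To use as a Nagios plugin, finish the script with: sys.exit(main())
```

Pointing one such check at each instance behind the proxy, rather than at the proxy itself, would tell ops which instance to restart.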


On Mon, Nov 18, 2013 at 5:11 PM, Roberto Congiu <roberto.con...@openx.com> wrote:

> We've also had issues with both hiveserver1 and 2 crashing because of heap
> exhaustion, but instead of restarting it periodically we took a different
> approach: we abstracted the part of the interface we needed and
> implemented an adapter that exposes the same methods as the Thrift
> interface but works by forking a shell, sending commands to it, and
> parsing the results. It is slow, but it's fast enough for our hourly
> process that loads data into Hive EXTERNAL tables, for which we need
> extra reliability.
>
> R.
>
>
> On Mon, Nov 18, 2013 at 1:24 PM, Edward Capriolo <edlinuxg...@gmail.com> wrote:
>
>> Thanks for pointing out the issue. HiveServer1 is significantly less
>> robust. We have run HS1 behind a load balancer/proxy and rotated/restarted
>> "angry" instances.
>>
>>
>> On Mon, Nov 18, 2013 at 3:59 PM, Stephen Sprague <sprag...@gmail.com> wrote:
>>
>>> A word of warning for users of HiveServer2 - version 0.11 at least. This
>>> puppy has the ability to crash and/or hang your server with a memory leak.
>>>
>>> Apparently it's not new, since googling shows this discussed before, and I
>>> see a reference to a workaround here:
>>>
>>> https://cwiki.apache.org/confluence/display/Hive/Setting+up+HiveServer2
>>>
>>> Anyhoo. Consider this a Public Service Announcement. Take heed.
>>>
>>> Regards,
>>> Stephen.
>>>
>>>
>>>
>>>
>>
>
>
> --
> ----------------------------------------------------------
> Good judgement comes with experience.
> Experience comes with bad judgement.
> ----------------------------------------------------------
> Roberto Congiu - Data Engineer - OpenX
> tel: +1 626 466 1141
>
