Hello 

My issue is that whenever my NFS server becomes slow to respond, ACS just bloody
reboots ALL host servers, not just the ones running VMs with volumes attached
to the slow NFS server. Recently, I decided to remove some old snapshots to
free up disk space. I deleted about a dozen snapshots and monitored the NFS
server for progress. At no point did the NFS server lose connectivity; it just
became a bit slow under load. By slow I mean I was still able to list files on
the NFS mount point, and the SSH session was still working okay. NFS file
listings, creation, deletion, etc. were just taking a few more seconds to
respond. However, the ACS agent rebooted every single host server, killing all
running guests and system VMs. In my case, only two guests have volumes on the
NFS server. The rest of the VMs run off RBD storage. Yet all host servers were
rebooted, even those that were not running guests with NFS volumes.
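
To illustrate what I mean by "slow but still alive", here is a minimal sketch
of a timed listing check. The function name, the default 10-second deadline,
and the example path are assumptions made up for illustration; this is not
what ACS or kvmheartbeat.sh actually does.

```shell
#!/bin/sh
# Hypothetical helper (not part of ACS): succeed if a directory listing of
# the given mount point completes within a deadline, fail otherwise.
nfs_responds() {
    mount_point="$1"        # directory to probe, e.g. an NFS mount point
    deadline="${2:-10}"     # seconds to wait before giving up (assumed default)
    # SIGKILL is used because a process blocked on NFS I/O may ignore SIGTERM.
    timeout -s KILL "$deadline" ls "$mount_point" >/dev/null 2>&1
}

# Example: probe a directory passed as the first argument (/tmp by default).
MOUNT="${1:-/tmp}"
if nfs_responds "$MOUNT" 10; then
    echo "$MOUNT answered within the deadline"
else
    echo "$MOUNT is slow or unavailable"
fi
```

A check like this only shows whether a listing completes in time. It cannot
tell a server under load from one that is really gone, which is exactly the
distinction I would like ACS to make.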

Ever since I started using ACS, it has always been pretty dumb at determining
whether the NFS storage is still alive. I would say it has done the maniac
reboot-everything routine at least 5 times in the past 3 years. So, in
previous versions of ACS, I just modified kvmheartbeat.sh and commented out the
line with "reboot", as these reboots were just pissing everyone off.

After upgrading to ACS 4.5.x, that script no longer has a reboot command, and I
was wondering whether it is still possible to instruct the kvmheartbeat script
not to reboot the host servers.

Thanks for your advice. 

Andrei 
