Hi all,

I have a large Pacemaker (1.1.9-1512) cluster with 9 nodes and almost 200 virtual machines, all on the same underlying storage. Everything is based on KVM and libvirt. Each VM has a location constraint based on a cloned ping resource that runs on every node and pings three hosts on the network.

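To make the setup concrete, this is a minimal sketch of what a configuration of this kind typically looks like in crm shell syntax. It assumes the ocf:pacemaker:ping agent with its default attribute name (pingd). Only the resource name res_ping_connections and the 5-second monitor interval come from the log line quoted in this message; the clone name, the VM name, the host addresses, and the multiplier, dampen, and timeout values are hypothetical placeholders, not the real configuration.

```
# Ping resource, cloned so that one instance runs on every node.
# host_list uses documentation addresses as placeholders.
primitive res_ping_connections ocf:pacemaker:ping \
    params host_list="192.0.2.1 192.0.2.2 192.0.2.3" \
           multiplier="100" dampen="5s" \
    op monitor interval="5s" timeout="60s"
clone cl_ping_connections res_ping_connections

# One location constraint per VM (vm01 is a hypothetical resource name):
# keep the VM off any node where the pingd attribute is missing or zero.
location loc_vm01_ping vm01 \
    rule -inf: not_defined pingd or pingd lte 0
```

With almost 200 VMs, a layout like this means almost 200 location constraints that all evaluate the same node attribute.
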
The problem is that when I clone a VM (using virt-clone), everything works fine until I try to add a new ping check. At that point, for some reason, the node's master ping resource fails with errors like this:

Jul 30 15:34:58 kvm09 lrmd[23467]: warning: child_timeout_callback: res_ping_connections_monitor_5000 process (PID 26406) timed out

We are investigating possible network problems (naturally, the network team says those are impossible, but when the problem happens there are sometimes high ping latencies on the node). What I find very strange is that things break ONLY when I add a location based on ping, and not, for example, when I add the storage order and colocation constraints for the VM.

So my two questions:

1) Is there a limit on how many ping locations can be declared?
2) Is this approach (one VM = one ping location) the best practice for monitoring the connectivity of the nodes?

Thanks for your help,

-- 
RaSca
Mia Mamma Usa Linux: Nothing is impossible to understand, if you explain it well!
[email protected]
http://www.miamammausalinux.org
