Sorry, my fault. Forgot to include /usr/lib/lcrso/pacemaker.lcrso in my deb package.
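A missing plugin file like this can be caught before installing the package. The following is only a minimal sketch, assuming `dpkg-deb` is available on the build host and that its `--contents` listing follows the usual `tar -tv` layout (mode, owner, size, date, time, name). The function names and the `.deb` file name in the usage note are hypothetical; only the plugin path comes from the message above.

```python
import subprocess

# Path taken from the message above; everything else here is illustrative.
PLUGIN_PATH = "/usr/lib/lcrso/pacemaker.lcrso"


def listing_contains(listing: str, path: str) -> bool:
    """Return True if a `dpkg-deb --contents` style listing includes `path`."""
    wanted = path.lstrip("/")
    for line in listing.splitlines():
        # Assumed columns: mode, owner/group, size, date, time, member name.
        fields = line.split(None, 5)
        if len(fields) < 6:
            continue
        # Symlink entries look like "name -> target"; keep only the name.
        name = fields[5].split(" -> ")[0]
        # Member names are usually written relative to the root as "./usr/...".
        if name.startswith("./"):
            name = name[2:]
        else:
            name = name.lstrip("/")
        if name == wanted:
            return True
    return False


def deb_contains(deb_file: str, path: str = PLUGIN_PATH) -> bool:
    """Run `dpkg-deb --contents` on a package and look for `path` in it."""
    result = subprocess.run(
        ["dpkg-deb", "--contents", deb_file],
        check=True,
        capture_output=True,
        text=True,
    )
    return listing_contains(result.stdout, path)
```

Calling `deb_contains("pacemaker_1.1.12_amd64.deb")` (a made-up file name) in the package build script and failing the build when it returns `False` would flag the omission before the package reaches a node.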
--
Best regards,
Sergey Arlashin

On Jan 7, 2015, at 2:18 PM, Sergey Arlashin <sergeyarl.maill...@gmail.com> wrote:

> After installing 1.1.12 on one of my nodes in staging environment I see the
> following error in corosync.log
>
> Jan 7 10:05:30 lb-node1 corosync[17022]: [SERV ] Service failed to load
> 'pacemaker'.
>
> and also cannot get crm_mon to show any info.
>
> # crm_mon -1
> Connection to cluster failed: Transport endpoint is not connected
>
> # crm status
> ERROR: status: crm_mon exited with code 107. Output: 'Connection to cluster
> failed: Transport endpoint is not connected'
>
> The same thing happened with 1.1.11 (I rebuilt 1.1.11 package from Ubuntu
> 14.04 for 12.04 that we're using).
>
> --
> Best regards,
> Sergey Arlashin
>
> On Jan 7, 2015, at 5:22 AM, Andrew Beekhof <and...@beekhof.net> wrote:
>
>>> On 7 Jan 2015, at 7:58 am, Sergey Arlashin <sergeyarl.maill...@gmail.com>
>>> wrote:
>>>
>>> And one more question - can pacemaker 1.1.12 be used together with corosync
>>> 1.4.7 ?
>>
>> It can be, depends entirely on which version of corosync it was built
>> against.
>>
>>> Or do I need to install corosync 2.x ?
>>
>> Wouldn't be a bad idea while you're at it
>>
>>> --
>>> Best regards,
>>> Sergey Arlashin
>>>
>>> On Jan 6, 2015, at 11:04 AM, Sergey Arlashin <sergeyarl.maill...@gmail.com>
>>> wrote:
>>>
>>>> Thank you!
>>>> I'll try 1.1.12.
>>>>
>>>> --
>>>> Best regards,
>>>> Sergey Arlashin
>>>>
>>>> On Jan 6, 2015, at 3:23 AM, Andrew Beekhof <and...@beekhof.net> wrote:
>>>>
>>>>> Yeah, I can imagine 1.1.6 behaving like this.
>>>>> I'd highly recommend 1.1.12
>>>>>
>>>>>> On 5 Jan 2015, at 5:14 pm, Sergey Arlashin
>>>>>> <sergeyarl.maill...@gmail.com> wrote:
>>>>>>
>>>>>> Pacemaker 1.1.6
>>>>>>
>>>>>> It runs on Ubuntu 12.04 LTS 64bit.
>>>>>>
>>>>>> Linux lb-node1 3.11.0-23-generic #40~precise1-Ubuntu SMP Wed Jun 4
>>>>>> 22:06:36 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
>>>>>>
>>>>>> --
>>>>>> Best regards,
>>>>>> Sergey Arlashin
>>>>>>
>>>>>> On Jan 5, 2015, at 7:59 AM, Andrew Beekhof <and...@beekhof.net> wrote:
>>>>>>
>>>>>>> pacemaker version? it looks familiar but it depends on the version
>>>>>>> number.
>>>>>>>
>>>>>>>> On 29 Dec 2014, at 10:24 pm, Sergey Arlashin
>>>>>>>> <sergeyarl.maill...@gmail.com> wrote:
>>>>>>>>
>>>>>>>> Hi!
>>>>>>>> Recently I've noticed that one of my nodes had OFFLINE status in 'crm
>>>>>>>> status' output. But it actually was not. I could ssh on this node. I
>>>>>>>> could get 'crm status' from that node's console. After some time it
>>>>>>>> became online. It happened several times without any obvious reason
>>>>>>>> with other nodes.
>>>>>>>>
>>>>>>>> Still no error or fatal messages in logs. The only warning messages I
>>>>>>>> could get from corosync.log were the following:
>>>>>>>>
>>>>>>>> Dec 29 10:56:34 lb-node2 cib: [2238]: WARN: cib_process_diff: Diff
>>>>>>>> 0.233.1346 -> 0.233.1347 not applied to 0.233.1354: current
>>>>>>>> "num_updates" is greater than required
>>>>>>>> Dec 29 10:56:34 lb-node2 cib: [2238]: WARN: cib_process_diff: Diff
>>>>>>>> 0.233.1347 -> 0.233.1348 not applied to 0.233.1354: current
>>>>>>>> "num_updates" is greater than required
>>>>>>>> Dec 29 10:56:34 lb-node2 cib: [2238]: WARN: cib_process_diff: Diff
>>>>>>>> 0.233.1348 -> 0.233.1349 not applied to 0.233.1354: current
>>>>>>>> "num_updates" is greater than required
>>>>>>>> Dec 29 10:56:34 lb-node2 cib: [2238]: WARN: cib_process_diff: Diff
>>>>>>>> 0.233.1349 -> 0.233.1350 not applied to 0.233.1354: current
>>>>>>>> "num_updates" is greater than required
>>>>>>>> Dec 29 10:56:34 lb-node2 cib: [2238]: WARN: cib_process_diff: Diff
>>>>>>>> 0.233.1350 -> 0.233.1351 not applied to 0.233.1354: current
>>>>>>>> "num_updates" is greater than required
>>>>>>>> Dec 29 10:56:34 lb-node2 cib: [2238]: WARN: cib_process_diff: Diff
>>>>>>>> 0.233.1351 -> 0.233.1352 not applied to 0.233.1354: current
>>>>>>>> "num_updates" is greater than required
>>>>>>>> Dec 29 10:56:34 lb-node2 cib: [2238]: WARN: cib_process_diff: Diff
>>>>>>>> 0.233.1352 -> 0.233.1353 not applied to 0.233.1354: current
>>>>>>>> "num_updates" is greater than required
>>>>>>>> Dec 29 10:56:34 lb-node2 cib: [2238]: WARN: cib_process_diff: Diff
>>>>>>>> 0.233.1353 -> 0.233.1354 not applied to 0.233.1354: current
>>>>>>>> "num_updates" is greater than required
>>>>>>>> Dec 29 10:56:34 lb-node2 attrd: [2240]: WARN: attrd_cib_callback:
>>>>>>>> Update 491 for last-failure-Cachier=1419729443 failed: Application of
>>>>>>>> an update diff failed
>>>>>>>> Dec 29 10:56:34 lb-node2 attrd: [2240]: WARN: attrd_cib_callback:
>>>>>>>> Update 494 for fail-count-Cachier=1 failed: Application of an update
>>>>>>>> diff failed
>>>>>>>> Dec 29 10:56:34 lb-node2 attrd: [2240]: WARN: attrd_cib_callback:
>>>>>>>> Update 497 for probe_complete=true failed: Application of an update
>>>>>>>> diff failed
>>>>>>>> Dec 29 10:56:34 lb-node2 attrd: [2240]: WARN: attrd_cib_callback:
>>>>>>>> Update 500 for last-failure-Cachier=1419729443 failed: Application of
>>>>>>>> an update diff failed
>>>>>>>> Dec 29 10:56:34 lb-node2 attrd: [2240]: WARN: attrd_cib_callback:
>>>>>>>> Update 503 for fail-count-Cachier=1 failed: Application of an update
>>>>>>>> diff failed
>>>>>>>> Dec 29 10:56:37 lb-node2 cib: [2238]: WARN: cib_process_diff: Diff
>>>>>>>> 0.233.1338 -> 0.233.1339 not applied to 0.233.1382: current
>>>>>>>> "num_updates" is greater than required
>>>>>>>> Dec 29 10:56:37 lb-node2 cib: [2238]: WARN: cib_process_diff: Diff
>>>>>>>> 0.233.1339 -> 0.233.1340 not applied to 0.233.1382: current
>>>>>>>> "num_updates" is greater than required
>>>>>>>> Dec 29 10:56:37 lb-node2 cib: [2238]: WARN: cib_process_diff: Diff
>>>>>>>> 0.233.1340 -> 0.233.1341 not applied to 0.233.1382: current
>>>>>>>> "num_updates" is greater than required
>>>>>>>> Dec 29 10:56:37 lb-node2 cib: [2238]: WARN: cib_process_diff: Diff
>>>>>>>> 0.233.1341 -> 0.233.1342 not applied to 0.233.1382: current
>>>>>>>> "num_updates" is greater than required
>>>>>>>> Dec 29 10:56:37 lb-node2 cib: [2238]: WARN: cib_process_diff: Diff
>>>>>>>> 0.233.1342 -> 0.233.1343 not applied to 0.233.1382: current
>>>>>>>> "num_updates" is greater than required
>>>>>>>>
>>>>>>>> After exploring corosync processes with ps I found out that on all my
>>>>>>>> nodes there are zombie corosync procs like:
>>>>>>>>
>>>>>>>> root 13892 0.0 0.0 0 0 ? Z Dec26 0:04
>>>>>>>> [corosync] <defunct>
>>>>>>>> root 21793 0.0 0.0 0 0 ? Z Dec26 0:00
>>>>>>>> [corosync] <defunct>
>>>>>>>> root 27009 1.3 1.0 714292 10784 ? Ssl Dec18 223:38
>>>>>>>> /usr/sbin/corosync
>>>>>>>>
>>>>>>>> Is it ok to have zombie corosync procs on nodes? Or does it suggest
>>>>>>>> that something wrong is going on ?
>>>>>>>>
>>>>>>>> Thanks in advance
>>>>>>>>
>>>>>>>> --
>>>>>>>> Best regards,
>>>>>>>> Sergey Arlashin
>>>>>>>>
>>>>>>>> _______________________________________________
>>>>>>>> Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
>>>>>>>> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>>>>>>>>
>>>>>>>> Project Home: http://www.clusterlabs.org
>>>>>>>> Getting started:
>>>>>>>> http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>>>>>>>> Bugs: http://bugs.clusterlabs.org