Patrick, Gentoo devs suggested trying >=dev-libs/glib-2.34. Have you tried this already?
Regards,
Vlad.

On Sun, 2012-11-04 at 08:45 -0800, Patrick Irvine wrote:
> Hey Guys,
>
> Just for the record, I just noticed this thread and it sounded familiar.
> I checked my gentoo systems and I had to mask glib-2.32.4-r1 and use
> glib-2.30.3 in order to get corosync/pacemaker to work. I had the same
> problem. Nodes couldn't talk to each other. Sorry I didn't notice this
> thread earlier, as I might have been able to help.
>
> Pat.
>
>
> On 04/11/2012 3:15 AM, Vladimir Elisseev wrote:
> > Thanks for the explanation. I saw coredumps in the directories you
> > mentioned already. The "suspicious -r1" includes two patches over
> > the vanilla version of glib:
> > https://bugzilla.gnome.org/show_bug.cgi?id=679306
> > http://sources.gentoo.org/cgi-bin/viewvc.cgi/gentoo-x86/dev-libs/glib/files/glib-2.32.4-CVE-2012-3524.patch?view=markup
> > For the moment I simply masked this particular glib version. Hopefully
> > I'll be able to find time to do a complete debug as you described.
> >
> > Regards,
> > Vlad.
> >
> > On Sun, 2012-11-04 at 13:54 +0300, Vladislav Bogdanov wrote:
> >> 03.11.2012 18:22, Vladimir Elisseev wrote:
> >>> Vladislav,
> >>>
> >>> Thanks for the hint! Upgrading glib from 2.30.3 to 2.32.4 triggers this
> >>> behavior of corosync. Do you know where I can find more info regarding
> >>> this problem?
> >> That is not corosync but pacemaker, which heavily uses glib internally.
> >> And glib is the only package in your list which may affect pacemaker.
> >> I would say that is a regression in that specific glib version or build.
> >> Library behavior changed without bumping the major so-number.
> >> You'd better talk to your distribution maintainers. And -r1 looks
> >> suspicious in the glib version you installed. Do you know what it means?
> >> One more note, cib exits with signal 6 (SIGABRT), which usually means
> >> you hit some assert in code. That usually results in a memory dump. Look
> >> at /var/lib/heartbeat/cores or /var/lib/pacemaker/cores if you have
> >> relevant core files for that. If not, then you need to enable coredumps.
> >> Then install debuginfo packages for pacemaker and glib (that is very
> >> distribution specific, so I cannot help with that). After that you can
> >> analyze relevant core files with 'gdb <full_path_to_cib_binary>
> >> <core_dump_file>'
> >> Just run 'bt full' and that should be enough to find exactly which code
> >> path caused SIGABRT.
> >>
> >> Vladislav
> >>
> >>> Vlad.
> >>>
> >>> On Sat, 2012-11-03 at 16:22 +0300, Vladislav Bogdanov wrote:
> >>>> 03.11.2012 15:26, Vladimir Elisseev wrote:
> >>>>> I've been able to reproduce the problem. Herewith I've attached
> >>>>> crm_report tarballs from both nodes. Although I don't know which
> >>>>> particular package triggers this problem, below is the list of what
> >>>>> has been updated. Hopefully this helps.
> >>>> I bet that is glib.
> >>>>
> >>>> Vladislav
> >>>>
> >>>>> Regards,
> >>>>> Vlad.
> >>>>>
> >>>>> Sat Nov 3 12:15:40 2012 <<< sys-apps/busybox-1.20.2
> >>>>> Sat Nov 3 12:15:42 2012 >>> sys-apps/busybox-1.20.2
> >>>>> Sat Nov 3 12:15:50 2012 <<< sys-fs/dosfstools-3.0.9
> >>>>> Sat Nov 3 12:15:52 2012 >>> sys-fs/dosfstools-3.0.12
> >>>>> Sat Nov 3 12:16:00 2012 <<< dev-lang/nasm-2.10.01
> >>>>> Sat Nov 3 12:16:02 2012 >>> dev-lang/nasm-2.10.05
> >>>>> Sat Nov 3 12:16:11 2012 <<< dev-libs/libgamin-0.1.10-r2
> >>>>> Sat Nov 3 12:16:13 2012 >>> dev-libs/libgamin-0.1.10-r3
> >>>>> Sat Nov 3 12:16:40 2012 <<< media-fonts/droid-113-r1
> >>>>> Sat Nov 3 12:16:46 2012 >>> media-fonts/droid-113-r2
> >>>>> Sat Nov 3 12:16:54 2012 <<< media-libs/libpng-1.5.10
> >>>>> Sat Nov 3 12:16:56 2012 >>> media-libs/libpng-1.5.13-r1
> >>>>> Sat Nov 3 12:17:04 2012 <<< app-arch/unzip-6.0-r1
> >>>>> Sat Nov 3 12:17:05 2012 >>> app-arch/unzip-6.0-r3
> >>>>> Sat Nov 3 12:17:12 2012 <<< app-arch/rpm2targz-9.0.0.4g
> >>>>> Sat Nov 3 12:17:14 2012 >>> app-arch/rpm2targz-9.0.0.5g
> >>>>> Sat Nov 3 12:17:22 2012 <<< app-arch/pbzip2-1.1.5
> >>>>> Sat Nov 3 12:17:24 2012 >>> app-arch/pbzip2-1.1.8
> >>>>> Sat Nov 3 12:17:34 2012 <<< app-arch/zip-3.0
> >>>>> Sat Nov 3 12:17:35 2012 >>> app-arch/zip-3.0-r1
> >>>>> Sat Nov 3 12:17:43 2012 <<< sys-process/htop-1.0.1
> >>>>> Sat Nov 3 12:17:45 2012 >>> sys-process/htop-1.0.1-r1
> >>>>> Sat Nov 3 12:17:55 2012 <<< media-libs/tiff-4.0.2
> >>>>> Sat Nov 3 12:17:57 2012 >>> media-libs/tiff-4.0.2-r1
> >>>>> Sat Nov 3 12:18:04 2012 <<< net-ftp/tftp-hpa-5.1
> >>>>> Sat Nov 3 12:18:06 2012 >>> net-ftp/tftp-hpa-5.2
> >>>>> Sat Nov 3 12:18:18 2012 <<< media-video/ffmpeg-0.10.3
> >>>>> Sat Nov 3 12:18:20 2012 >>> media-video/ffmpeg-0.10.3
> >>>>> Sat Nov 3 12:18:35 2012 <<< sys-devel/gettext-0.18.1.1-r1
> >>>>> Sat Nov 3 12:18:37 2012 >>> sys-devel/gettext-0.18.1.1-r3
> >>>>> Sat Nov 3 12:18:44 2012 <<< app-admin/logrotate-3.8.1
> >>>>> Sat Nov 3 12:18:46 2012 >>> app-admin/logrotate-3.8.2
> >>>>> Sat Nov 3 12:18:54 2012 <<< media-libs/libwebp-0.1.3
> >>>>> Sat Nov 3 12:18:55 2012 >>> media-libs/libwebp-0.2.0
> >>>>> Sat Nov 3 12:19:03 2012 <<< dev-perl/Convert-ASN1-0.220.0
> >>>>> Sat Nov 3 12:19:05 2012 >>> dev-perl/Convert-ASN1-0.260.0
> >>>>> Sat Nov 3 12:19:13 2012 <<< dev-perl/net-server-0.97
> >>>>> Sat Nov 3 12:19:15 2012 >>> dev-perl/net-server-2.6.0
> >>>>> Sat Nov 3 12:19:24 2012 <<< dev-perl/Config-IniFiles-2.710.0
> >>>>> Sat Nov 3 12:19:26 2012 >>> dev-perl/Config-IniFiles-2.760.0
> >>>>> Sat Nov 3 12:19:33 2012 <<< dev-perl/HTTP-Date-6.0.0
> >>>>> Sat Nov 3 12:19:35 2012 >>> dev-perl/HTTP-Date-6.20.0
> >>>>> Sat Nov 3 12:19:44 2012 <<< sys-boot/syslinux-4.06_pre11
> >>>>> Sat Nov 3 12:19:46 2012 >>> sys-boot/syslinux-4.06
> >>>>> Sat Nov 3 12:20:05 2012 <<< dev-libs/glib-2.30.3
> >>>>> Sat Nov 3 12:20:08 2012 >>> dev-libs/glib-2.32.4-r1
> >>>>> Sat Nov 3 12:20:16 2012 <<< dev-util/pkgconfig-0.27
> >>>>> Sat Nov 3 12:20:18 2012 >>> dev-util/pkgconfig-0.27.1
> >>>>> Sat Nov 3 12:20:28 2012 <<< net-analyzer/jnettop-0.13.0-r1
> >>>>> Sat Nov 3 12:20:29 2012 >>> net-analyzer/jnettop-0.13.0-r1
> >>>>> Sat Nov 3 12:20:41 2012 <<< x11-libs/pango-1.29.4
> >>>>> Sat Nov 3 12:20:43 2012 >>> x11-libs/pango-1.30.1
> >>>>> Sat Nov 3 12:20:53 2012 <<< net-analyzer/rrdtool-1.4.5-r1
> >>>>> Sat Nov 3 12:20:56 2012 >>> net-analyzer/rrdtool-1.4.7-r1
> >>>>> Sat Nov 3 12:21:03 2012 <<< app-shells/gentoo-bashcomp-20101217
> >>>>> Sat Nov 3 12:21:05 2012 >>> app-shells/gentoo-bashcomp-20101217-r1
> >>>>> Sat Nov 3 12:21:12 2012 <<< dev-perl/MIME-tools-5.502.0
> >>>>> Sat Nov 3 12:21:14 2012 >>> dev-perl/MIME-tools-5.503.0
> >>>>> Sat Nov 3 12:21:24 2012 <<< dev-perl/Convert-TNEF-0.170.0
> >>>>> Sat Nov 3 12:21:26 2012 >>> dev-perl/Convert-TNEF-0.180.0
> >>>>> Sat Nov 3 12:21:35 2012 <<< net-misc/curl-7.25.0-r1
> >>>>> Sat Nov 3 12:21:36 2012 >>> net-misc/curl-7.26.0
> >>>>> Sat Nov 3 12:21:51 2012 <<< mail-mta/postfix-2.9.3
> >>>>> Sat Nov 3 12:21:53 2012 >>> mail-mta/postfix-2.9.4
> >>>>> Sat Nov 3 12:22:01 2012 <<< dev-perl/Net-SSLeay-1.360.0
> >>>>> Sat Nov 3 12:22:03 2012 >>> dev-perl/Net-SSLeay-1.480.0-r1
> >>>>> Sat Nov 3 12:22:12 2012 <<< sys-auth/nss_ldap-264-r1
> >>>>> Sat Nov 3 12:22:14 2012 >>> sys-auth/nss_ldap-265-r1
> >>>>> Sat Nov 3 12:22:25 2012 <<< net-mail/fetchmail-6.3.21
> >>>>> Sat Nov 3 12:22:27 2012 >>> net-mail/fetchmail-6.3.22
> >>>>> Sat Nov 3 12:22:37 2012 <<< net-misc/dhcp-4.2.4_p1
> >>>>> Sat Nov 3 12:22:39 2012 >>> net-misc/dhcp-4.2.4_p2
> >>>>> Sat Nov 3 12:22:48 2012 <<< net-analyzer/tcpdump-3.9.8-r1
> >>>>> Sat Nov 3 12:22:50 2012 >>> net-analyzer/tcpdump-4.3.0
> >>>>> Sat Nov 3 12:23:07 2012 <<< dev-util/cmake-2.8.8-r3
> >>>>> Sat Nov 3 12:23:09 2012 >>> dev-util/cmake-2.8.9
> >>>>> Sat Nov 3 12:23:21 2012 <<< dev-vcs/subversion-1.6.17-r7
> >>>>> Sat Nov 3 12:23:24 2012 >>> dev-vcs/subversion-1.6.17-r7
> >>>>> Sat Nov 3 12:27:56 2012 <<< media-gfx/imagemagick-6.7.8.7
> >>>>> Sat Nov 3 12:27:58 2012 >>> media-gfx/imagemagick-6.7.8.7
> >>>>>
> >>>>>
> >>>>>
> >>>>> On Thu, 2012-11-01 at 07:08 +0100, Vladimir Elisseev wrote:
> >>>>>> Yes, hb_report is there, thanks!
> >>>>>>
> >>>>>> On Thu, 2012-11-01 at 11:40 +1100, Andrew Beekhof wrote:
> >>>>>>> On Tue, Oct 30, 2012 at 4:35 PM, Vladimir Elisseev <vo...@vovan.nl>
> >>>>>>> wrote:
> >>>>>>>> Thanks for trying to help! Currently I can't provide crm_report from
> >>>>>>>> the failed node, as I've decided to restore the complete node from
> >>>>>>>> backup. The versions I use are corosync-1.3.0 and pacemaker-1.0.10.
> >>>>>>>> Actually the problem occurred after updating quite a few system
> >>>>>>>> packages, but all the cluster related software was untouched. I've
> >>>>>>>> found exactly the same issue described in the mailing list earlier:
> >>>>>>>> http://www.gossamer-threads.com/lists/linuxha/pacemaker/77881?do=post_view_threaded#77881
> >>>>>>>> At least the symptoms are exactly the same, as well as the pasted log
> >>>>>>>> files. I've tried enabling debug logging as well and saw that crm
> >>>>>>>> tries to connect to cib sockets (/var/run/crm_*) too early (IMO) and
> >>>>>>>> fails because cib wasn't started yet.
> >>>>>>>> I'm planning to repeat the update of this system, but I'll do this
> >>>>>>>> more carefully in order to understand which particular package leads
> >>>>>>>> to this behavior. BTW, how can I create crm_report? I can't find this
> >>>>>>>> binary anywhere on the system.
> >>>>>>> It's included in subsequent 1.0.x releases.
> >>>>>>> You should have hb_report available though.
> >>>>>>>
> >>>>>>>> Let me know what kind of input you'll
> >>>>>>>> need if I'm able to reproduce this problem.
> >>>>>>>>
> >>>>>>>> Regards,
> >>>>>>>> Vlad.
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On Tue, 2012-10-30 at 16:00 +1100, Andrew Beekhof wrote:
> >>>>>>>>> On Sun, Oct 28, 2012 at 9:05 PM, Vladimir Elisseev <vo...@vovan.nl>
> >>>>>>>>> wrote:
> >>>>>>>>>> Hello,
> >>>>>>>>>>
> >>>>>>>>>> I'm having a problem: after a reboot, one cluster node can't join
> >>>>>>>>>> the cluster anymore. From the log file I can't understand what is
> >>>>>>>>>> actually going on. I can only see that cib and crm are both
> >>>>>>>>>> respawned frequently. I'd appreciate any help. Below is the
> >>>>>>>>>> relevant part of the log file:
> >>>>>>>>> I appreciate that you're trying to keep it brief, but problems often
> >>>>>>>>> originate much earlier than people suspect.
> >>>>>>>>> Can you instead attach a crm_report tarball, that will have
> >>>>>>>>> everything (from both nodes) that we need to be able to help.
> >>>>>>>>>
> >>>>>>>>> What version is this btw?
> >>>>>>>>>
> >>>>>>>>>> Oct 28 10:52:22 srv2 cib: [10646]: info: cib_server_process_diff:
> >>>>>>>>>> Requesting re-sync from peer
> >>>>>>>>>> Oct 28 10:52:22 srv2 cib: [10646]: WARN: cib_diff_notify:
> >>>>>>>>>> Local-only Change (client:crmd, call: 4770): -1.-1.-1 (Application
> >>>>>>>>>> of an update diff failed, requesting a full refresh)
> >>>>>>>>>> Oct 28 10:52:22 srv2 cib: [10653]: info: retrieveCib: Reading
> >>>>>>>>>> cluster configuration from: /var/lib/heartbeat/crm/cib.qJTUAV
> >>>>>>>>>> (digest: /var/lib/heartbeat/crm/cib.XwOKXQ)
> >>>>>>>>>> Oct 28 10:52:22 srv2 cib: [10646]: WARN: cib_server_process_diff:
> >>>>>>>>>> Not applying diff 0.1298.5 -> 0.1299.1 (sync in progress)
> >>>>>>>>>> Oct 28 10:52:22 srv2 cib: [10646]: info: cib_replace_notify:
> >>>>>>>>>> Local-only Replace: -1.-1.-1 from srv1
> >>>>>>>>>> Oct 28 10:52:22 corosync [pcmk]: ] info: pcmk_ipc_exit: Client
> >>>>>>>>>> cib (conn=0x1837340, async-conn=0x1837340) left
> >>>>>>>>>> Oct 28 10:52:22 corosync [pcmk]: ] ERROR: pcmk_wait_dispatch:
> >>>>>>>>>> Child process cib terminated with signal 6 (pid=10646, core=true)
> >>>>>>>>>> Oct 28 10:52:22 corosync [pcmk]: ] notice: pcmk_wait_dispatch:
> >>>>>>>>>> Respawning failed child process: cib
> >>>>>>>>>> Oct 28 10:52:22 corosync [pcmk]: ] info: spawn_child: Forked
> >>>>>>>>>> child 10656 for process cib
> >>>>>>>>>> Oct 28 10:52:22 srv2 cib: [10656]: info: Invoked:
> >>>>>>>>>> /usr/lib64/heartbeat/cib
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> Regards,
> >>>>>>>>>> Vlad.

_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org
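
When narrowing down an update list like the one quoted in this thread, it can help to drop the plain rebuilds and look only at packages whose version actually changed. The following is a minimal Python sketch, not something posted in the thread. It assumes log lines in the same form as the quoted excerpt, with `<<<` marking the removed version and `>>>` the installed one, in that order; the function name `changed_packages` is hypothetical.

```python
import re

# Matches lines in the style of the update list quoted in this thread, e.g.
# "Sat Nov 3 12:20:05 2012 <<< dev-libs/glib-2.30.3".
# Assumption: "<<<" (removed) is followed by the matching ">>>" (installed).
LINE_RE = re.compile(r"(<<<|>>>)\s+(\S+)\s*$")


def changed_packages(lines):
    """Return (old_atom, new_atom) pairs where the installed version changed."""
    changes = []
    pending_old = None
    for line in lines:
        match = LINE_RE.search(line)
        if not match:
            continue  # skip quoting residue, blank lines, and other noise
        marker, atom = match.groups()
        if marker == "<<<":
            pending_old = atom
        elif pending_old is not None:
            # Same atom on both sides means a rebuild, not a version change.
            if atom != pending_old:
                changes.append((pending_old, atom))
            pending_old = None
    return changes


sample = [
    "Sat Nov 3 12:18:18 2012 <<< media-video/ffmpeg-0.10.3",
    "Sat Nov 3 12:18:20 2012 >>> media-video/ffmpeg-0.10.3",
    "Sat Nov 3 12:20:05 2012 <<< dev-libs/glib-2.30.3",
    "Sat Nov 3 12:20:08 2012 >>> dev-libs/glib-2.32.4-r1",
]

for old, new in changed_packages(sample):
    print(f"{old} -> {new}")
```

On the four sample lines this prints only the glib change, since ffmpeg was rebuilt at the same version. The comparison is on the full package atom, so a revision bump such as -r1 counts as a change, which matters here because the -r1 revision is the one that carried the extra patches.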