I've rebuilt libqb with a separate SOCKETDIR (/var/run/qb) and set the ownership of this directory to hacluster:haclient.
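As an aside, a quick way to sanity-check an ownership change like this is to ask whether a given uid/gid would be allowed to create files in the directory at all. The sketch below is only an approximation under stated assumptions: it looks at the classic owner/group/other mode bits and ignores ACLs, Solaris privileges and anything else the kernel may consult. The function names `dir_ownership` and `can_create_in` are made up for this illustration and are not part of libqb or pacemaker.

```python
import os
import stat
import sys


def dir_ownership(path):
    """Return (uid, gid, permission_bits) of a directory."""
    st = os.stat(path)
    if not stat.S_ISDIR(st.st_mode):
        raise NotADirectoryError(path)
    return st.st_uid, st.st_gid, stat.S_IMODE(st.st_mode)


def can_create_in(path, uid, gids):
    """Approximate check: may this uid (with these gids) create files in path?

    Assumption: only the traditional mode bits are considered. Creating a
    file needs both write and search (execute) permission on the directory.
    """
    owner, group, mode = dir_ownership(path)
    if uid == 0:
        # root bypasses the mode bits in the classic model
        return True
    if uid == owner:
        need = stat.S_IWUSR | stat.S_IXUSR
    elif group in gids:
        need = stat.S_IWGRP | stat.S_IXGRP
    else:
        need = stat.S_IWOTH | stat.S_IXOTH
    return (mode & need) == need


if __name__ == "__main__":
    # Check every directory given on the command line for the current user.
    uid = os.geteuid()
    gids = set(os.getgroups()) | {os.getegid()}
    for directory in sys.argv[1:]:
        verdict = "can" if can_create_in(directory, uid, gids) else "cannot"
        print(f"uid {uid} {verdict} create files in {directory}")
```

Run as the hacluster user with /var/run/qb and /var/run as arguments, this would show whether both directories look writable for that user according to the mode bits alone.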
After that, pacemakerd started successfully with all its children:

[root@ha1 /var/run/qb]# pacemakerd -fV
Could not establish pacemakerd connection: Connection refused (146)
info: crm_ipc_connect: Could not establish pacemakerd connection: Connection refused (146)
info: get_cluster_type: Detected an active 'corosync' cluster
info: read_config: Reading configure for stack: corosync
notice: crm_add_logfile: Additional logging available in /var/log/cluster/corosync.log
notice: main: Starting Pacemaker 1.1.8 (Build: 1f8858c): ncurses libqb-logging libqb-ipc upstart systemd corosync-native
info: main: Maximum core file size is: 18446744073709551613
info: qb_ipcs_us_publish: server name: pacemakerd
notice: update_node_processes: 48de70 Node 182452614 now known as ha1, was:
info: start_child: Forked child 60719 for process cib
info: start_child: Forked child 60720 for process stonith-ng
info: start_child: Forked child 60721 for process lrmd
info: start_child: Forked child 60722 for process attrd
info: start_child: Forked child 60723 for process pengine
info: start_child: Forked child 60724 for process crmd
info: main: Starting mainloop

[root@ha1 /var/run/qb]# ls -l
total 0
srwxrwxrwx 1 hacluster root 0 Mar 25 11:50 attrd
srwxrwxrwx 1 root root 0 Mar 25 11:43 cfg
srwxrwxrwx 1 hacluster root 0 Mar 25 11:50 cib_ro
srwxrwxrwx 1 hacluster root 0 Mar 25 11:50 cib_rw
srwxrwxrwx 1 hacluster root 0 Mar 25 11:50 cib_shm
srwxrwxrwx 1 root root 0 Mar 25 11:43 cmap
srwxrwxrwx 1 root root 0 Mar 25 11:43 cpg
srwxrwxrwx 1 root root 0 Mar 25 11:50 lrmd
srwxrwxrwx 1 root root 0 Mar 25 11:50 pacemakerd
srwxrwxrwx 1 hacluster root 0 Mar 25 11:50 pengine
srwxrwxrwx 1 root root 0 Mar 25 11:43 quorum
srwxrwxrwx 1 root root 0 Mar 25 11:50 stonith-ng

However, libqb still cannot create some files in /var/run due to insufficient permissions:

Mar 25 11:50:45 [60719] cib: info: init_cs_connection_once: Connection to 'corosync': established
Mar 25 11:50:45 [60719] cib: info: crm_get_peer: Node 182452614 is now known as ha1
Mar 25 11:50:45 [60719] cib: info: crm_get_peer: Node 182452614 has uuid 182452614
Mar 25 11:50:45 [60719] cib: info: qb_ipcs_us_publish: server name: cib_ro
Mar 25 11:50:45 [60719] cib: info: qb_ipcs_us_publish: server name: cib_rw
Mar 25 11:50:45 [60719] cib: info: qb_ipcs_us_publish: server name: cib_shm
Mar 25 11:50:45 [60719] cib: info: cib_init: Starting cib mainloop
Mar 25 11:50:45 [60719] cib: info: pcmk_cpg_membership: Joined[0.0] cib.182452614
Mar 25 11:50:45 [60719] cib: info: pcmk_cpg_membership: Member[0.0] cib.182452614
Mar 25 11:50:45 [60719] cib: info: pcmk_cpg_membership: Member[0.1] cib.182452614
Mar 25 11:50:46 [60719] cib: error: qb_sys_mmap_file_open: couldn't open file /var/run/qb-cib_rw-control-60719-60720-15: Permission denied (13)
Mar 25 11:50:46 [60719] cib: error: qb_ipcs_us_connect: couldn't create file for mmap (60719-60720-15): Permission denied (13)
Mar 25 11:50:46 [60719] cib: error: handle_new_connection: Invalid IPC credentials (60719-60720-15).
Mar 25 11:50:46 [60720] stonith-ng: info: crm_ipc_connect: Could not establish cib_rw connection: Permission denied (13)
Mar 25 11:50:46 [60719] cib: error: qb_sys_mmap_file_open: couldn't open file /var/run/qb-cib_shm-control-60719-60724-16: Permission denied (13)
Mar 25 11:50:46 [60719] cib: error: qb_ipcs_us_connect: couldn't create file for mmap (60719-60724-16): Permission denied (13)
Mar 25 11:50:46 [60719] cib: error: handle_new_connection: Invalid IPC credentials (60719-60724-16).
Mar 25 11:50:46 [60724] crmd: info: crm_ipc_connect: Could not establish cib_shm connection: Permission denied (13)
Mar 25 11:50:46 [60724] crmd: info: do_cib_control: Could not connect to the CIB service: Transport endpoint is not connected
Mar 25 11:50:46 [60724] crmd: warning: do_cib_control: Couldn't complete CIB registration 1 times... pause and retry

If someone has a working setup on Linux with corosync 2.x, libqb and pacemaker 1.1.x, I'd greatly appreciate any information about the places libqb uses for its special socket files.

Thanks in advance!

(Can we now say that this problem is related to libqb rather than pacemaker?)

On Mar 25, 2013, at 15:30, Andrei Belov <defana...@gmail.com> wrote:

> Andreas,
>
> I just tried "PCMK_ipc_type=socket pacemaker -fV" - a bunch of additional
> "event_send" errors appeared:
>
> Mar 25 11:15:55 [33641] ha1 corosync error [MAIN ] event_send retuned -32, expected 256!
> Mar 25 11:15:55 [33641] ha1 corosync error [SERV ] event_send retuned -32, expected 217!
> Mar 25 11:15:55 [33641] ha1 corosync error [SERV ] event_send retuned -32, expected 219!
> Mar 25 11:15:55 [33641] ha1 corosync error [SERV ] event_send retuned -32, expected 223!
> Mar 25 11:15:55 [33641] ha1 corosync error [SERV ] event_send retuned -32, expected 223!
> Mar 25 11:15:55 [33641] ha1 corosync error [SERV ] event_send retuned -32, expected 223!
> Mar 25 11:15:55 [33641] ha1 corosync error [SERV ] event_send retuned -32, expected 223!
> Mar 25 11:15:55 [33641] ha1 corosync error [SERV ] event_send retuned -32, expected 223!
> Mar 25 11:15:55 [33641] ha1 corosync error [SERV ] event_send retuned -32, expected 223!
> Mar 25 11:15:55 [33641] ha1 corosync error [SERV ] event_send retuned -32, expected 256!
> Mar 25 11:15:55 [53980] pengine: error: qb_ipcs_us_publish: Could not bind AF_UNIX (/var/run/pengine): Permission denied (13)
> Mar 25 11:15:55 [53980] pengine: error: mainloop_add_ipc_server: Could not start pengine IPC server: Unknown error (-13)
> Mar 25 11:15:55 [53980] pengine: error: main: Couldn't start IPC server
> Mar 25 11:15:55 [53975] pacemakerd: error: pcmk_child_exit: Child process pengine exited (pid=53980, rc=1)
> Mar 25 11:15:55 [33641] ha1 corosync error [SERV ] event_send retuned -32, expected 256!
> Mar 25 11:15:55 [33641] ha1 corosync error [SERV ] event_send retuned -32, expected 223!
> Mar 25 11:15:55 [53979] attrd: error: qb_ipcs_us_publish: Could not bind AF_UNIX (/var/run/attrd): Permission denied (13)
> Mar 25 11:15:55 [53979] attrd: error: mainloop_add_ipc_server: Could not start attrd IPC server: Unknown error (-13)
> Mar 25 11:15:55 [53979] attrd: error: main: Could not start IPC server
> Mar 25 11:15:55 [53979] attrd: error: main: Aborting startup
> Mar 25 11:15:55 [33641] ha1 corosync error [SERV ] event_send retuned -32, expected 223!
> Mar 25 11:15:55 [53975] pacemakerd: error: pcmk_child_exit: Child process attrd exited (pid=53979, rc=100)
> Mar 25 11:15:55 [33641] ha1 corosync error [SERV ] event_send retuned -32, expected 223!
> Mar 25 11:15:55 [33641] ha1 corosync error [SERV ] event_send retuned -32, expected 223!
> Mar 25 11:15:55 [33641] ha1 corosync error [SERV ] event_send retuned -32, expected 223!
> Mar 25 11:15:55 [33641] ha1 corosync error [SERV ] event_send retuned -32, expected 223!
> Mar 25 11:15:55 [33641] ha1 corosync error [SERV ] event_send retuned -32, expected 256!
> Mar 25 11:15:55 [53976] cib: error: qb_ipcs_us_publish: Could not bind AF_UNIX (/var/run/cib_ro): Permission denied (13)
> Mar 25 11:15:55 [53976] cib: error: mainloop_add_ipc_server: Could not start cib_ro IPC server: Unknown error (-13)
> Mar 25 11:15:55 [53976] cib: error: qb_ipcs_us_publish: Could not bind AF_UNIX (/var/run/cib_rw): Permission denied (13)
> Mar 25 11:15:55 [53976] cib: error: mainloop_add_ipc_server: Could not start cib_rw IPC server: Unknown error (-13)
> Mar 25 11:15:55 [53976] cib: error: qb_ipcs_us_publish: Could not bind AF_UNIX (/var/run/cib_shm): Permission denied (13)
> Mar 25 11:15:55 [53976] cib: error: mainloop_add_ipc_server: Could not start cib_shm IPC server: Unknown error (-13)
> Mar 25 11:15:55 [53976] cib: error: cib_init: Couldnt start all IPC channels, exiting.
> Mar 25 11:15:55 [53975] pacemakerd: error: pcmk_child_exit: Child process cib exited (pid=53976, rc=255)
> Mar 25 11:15:55 [33641] ha1 corosync error [SERV ] event_send retuned -32, expected 223!
> Mar 25 11:16:04 [53977] stonith-ng: error: setup_cib: Could not connect to the CIB service: -134 fffffd7fc421a0b0
> Mar 25 11:16:04 [33641] ha1 corosync error [SERV ] event_send retuned -32, expected 217!
> Mar 25 11:16:04 [53975] pacemakerd: notice: pcmk_shutdown_worker: Attempting to inhibit respawning after fatal error
>
>
> # fgrep 32 /usr/include/sys/errno.h
> #define EPIPE 32 /* Broken pipe */
>
>
>
> On Mar 25, 2013, at 13:55, "Grüninger, Andreas (LGL Extern)" <andreas.gruenin...@lgl.bwl.de> wrote:
>
>> With solaris/openindiana you should use this setting
>> export PCMK_ipc_type=socket
>>
>> Andreas
>>
>> -----Original Message-----
>> From: Andrei Belov [mailto:defana...@gmail.com]
>> Sent: Monday, March 25, 2013 10:43
>> To: pacemaker@oss.clusterlabs.org
>> Subject: [Pacemaker] solaris problem
>>
>> Hi folks,
>>
>> I'm trying to build a test HA cluster on Solaris 5.11 using libqb 0.14.4,
>> corosync 2.3.0 and pacemaker 1.1.8, and I'm facing a strange problem while
>> starting pacemaker.
>>
>> The log shows the following errors:
>>
>> Mar 25 09:21:26 [33720] lrmd: error: mainloop_add_ipc_server: Could not start lrmd IPC server: Unknown error (-48)
>> Mar 25 09:21:26 [33720] lrmd: error: try_server_create: New IPC server could not be created because another lrmd process exists, sending shutdown command to old lrmd process.
>> Mar 25 09:21:26 [33720] lrmd: error: mainloop_add_ipc_server: Could not start lrmd IPC server: Unknown error (-48)
>> Mar 25 09:21:26 [33720] lrmd: error: try_server_create: New IPC server could not be created because another lrmd process exists, sending shutdown command to old lrmd process.
>> Mar 25 09:21:26 [33720] lrmd: error: mainloop_add_ipc_server: Could not start lrmd IPC server: Unknown error (-48)
>> Mar 25 09:21:26 [33720] lrmd: error: try_server_create: New IPC server could not be created because another lrmd process exists, sending shutdown command to old lrmd process.
>> Mar 25 09:21:26 [33720] lrmd: error: mainloop_add_ipc_server: Could not start lrmd IPC server: Unknown error (-48)
>> Mar 25 09:21:26 [33720] lrmd: error: try_server_create: New IPC server could not be created because another lrmd process exists, sending shutdown command to old lrmd process.
>> Mar 25 09:21:26 [33720] lrmd: error: mainloop_add_ipc_server: Could not start lrmd IPC server: Unknown error (-48)
>> Mar 25 09:21:26 [33720] lrmd: error: try_server_create: New IPC server could not be created because another lrmd process exists, sending shutdown command to old lrmd process.
>> Mar 25 09:21:26 [33720] lrmd: error: mainloop_add_ipc_server: Could not start lrmd IPC server: Unknown error (-48)
>> Mar 25 09:21:26 [33720] lrmd: error: try_server_create: New IPC server could not be created because another lrmd process exists, sending shutdown command to old lrmd process.
>> Mar 25 09:21:26 [33720] lrmd: error: mainloop_add_ipc_server: Could not start lrmd IPC server: Unknown error (-48)
>> Mar 25 09:21:26 [33720] lrmd: error: try_server_create: New IPC server could not be created because another lrmd process exists, sending shutdown command to old lrmd process.
>> Mar 25 09:21:26 [33720] lrmd: error: mainloop_add_ipc_server: Could not start lrmd IPC server: Unknown error (-48)
>> Mar 25 09:21:26 [33720] lrmd: error: try_server_create: New IPC server could not be created because another lrmd process exists, sending shutdown command to old lrmd process.
>> Mar 25 09:21:26 [33720] lrmd: error: mainloop_add_ipc_server: Could not start lrmd IPC server: Unknown error (-48)
>> Mar 25 09:21:26 [33720] lrmd: error: try_server_create: New IPC server could not be created because another lrmd process exists, sending shutdown command to old lrmd process.
>> Mar 25 09:21:26 [33720] lrmd: error: mainloop_add_ipc_server: Could not start lrmd IPC server: Unknown error (-48)
>> Mar 25 09:21:26 [33720] lrmd: error: try_server_create: New IPC server could not be created because another lrmd process exists, sending shutdown command to old lrmd process.
>> Mar 25 09:21:26 [33720] lrmd: error: main: Failed to allocate lrmd server. shutting down
>> Mar 25 09:21:26 [33722] pengine: error: mainloop_add_ipc_server: Could not start pengine IPC server: Unknown error (-48)
>> Mar 25 09:21:26 [33722] pengine: error: main: Couldn't start IPC server
>> Mar 25 09:21:26 [33717] pacemakerd: error: pcmk_child_exit: Child process lrmd exited (pid=33720, rc=255)
>> Mar 25 09:21:26 [33721] attrd: error: qb_ipcs_us_publish: Could not bind AF_UNIX (/var/run/attrd): Permission denied (13)
>> Mar 25 09:21:26 [33721] attrd: error: mainloop_add_ipc_server: Could not start attrd IPC server: Unknown error (-13)
>> Mar 25 09:21:26 [33721] attrd: error: main: Could not start IPC server
>> Mar 25 09:21:26 [33721] attrd: error: main: Aborting startup
>> Mar 25 09:21:26 [33717] pacemakerd: error: pcmk_child_exit: Child process pengine exited (pid=33722, rc=1)
>> Mar 25 09:21:26 [33717] pacemakerd: error: pcmk_child_exit: Child process attrd exited (pid=33721, rc=100)
>> Mar 25 09:21:26 [33718] cib: error: qb_ipcs_us_publish: Could not bind AF_UNIX (/var/run/cib_ro): Permission denied (13)
>> Mar 25 09:21:26 [33718] cib: error: mainloop_add_ipc_server: Could not start cib_ro IPC server: Unknown error (-13)
>> Mar 25 09:21:26 [33718] cib: error: qb_ipcs_us_publish: Could not bind AF_UNIX (/var/run/cib_rw): Permission denied (13)
>> Mar 25 09:21:26 [33718] cib: error: mainloop_add_ipc_server: Could not start cib_rw IPC server: Unknown error (-13)
>> Mar 25 09:21:26 [33718] cib: error: mainloop_add_ipc_server: Could not start cib_shm IPC server: Unknown error (-48)
>> Mar 25 09:21:26 [33718] cib: error: cib_init: Couldnt start all IPC channels, exiting.
>> Mar 25 09:21:26 [33717] pacemakerd: error: pcmk_child_exit: Child process cib exited (pid=33718, rc=255)
>> Mar 25 09:21:35 [33719] stonith-ng: error: setup_cib: Could not connect to the CIB service: -134 fffffd7fc421a0b0
>> Mar 25 09:21:35 [33717] pacemakerd: notice: pcmk_shutdown_worker: Attempting to inhibit respawning after fatal error
>>
>> The full log (in case I've missed anything) is attached.
>>
>> I'd like to know the reason for "unknown error (-48)" - on this system 48 in
>> errno.h is "ENOTSUP", but I haven't found the exact place in the code where
>> this may happen (so I'm not sure about that).
>>
>> Just for the record - I'm able to run corosync on two nodes and see them
>> connected without any visible problems - thus, I suppose there may be
>> something wrong with either pacemaker or libqb.
>>
>> Any help will be greatly appreciated!
>>
>> Thanks,
>> Andrei.
>>
>>
>>
>> _______________________________________________
>> Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
>> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>>
>> Project Home: http://www.clusterlabs.org
>> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>> Bugs: http://bugs.clusterlabs.org
>