If I remember correctly, this is an old bug that has since been fixed.

2012/12/7 Piotr Jewiec <pi...@jewiec.net>
> Hi,
>
> I have a corosync/pacemaker cluster running on Ubuntu 10.04.2. The
> following error is getting appended to the syslog:
>
> Dec 6 20:44:46 filer-1 crmd: [2970]: ERROR: socket_client_channel_new: socket: Too many open files
> Dec 6 20:44:46 filer-1 crmd: [2970]: ERROR: init_client_ipc_comms_nodispatch: Could not access channel on: /var/run/crm/pengine
> Dec 6 20:44:46 filer-1 crmd: [2970]: WARN: do_pe_control: Setup of client connection failed, not adding channel to mainloop
> Dec 6 20:44:46 filer-1 crmd: [2970]: WARN: do_log: FSA: Input I_FAIL from do_pe_control() received in state S_INTEGRATION
> Dec 6 20:44:46 filer-1 crmd: [2970]: info: do_dc_join_offer_all: join-24: Waiting on 2 outstanding join acks
> Dec 6 20:44:46 filer-1 crmd: [2970]: info: do_dc_takeover: Taking over DC status for this partition
>
> root@filer-1:~# lsof -p `pidof crmd` | grep socket | wc -l
> 1019
>
> root@filer-1:~# cat /proc/2970/limits | grep 'open files'
> Max open files 1024 1024 files
>
> I almost fainted when I saw this one :)
>
> crm(live)# status
> ============
> Last updated: Fri Dec 7 06:38:48 2012
> Stack: openais
> Current DC: filer-1 - partition with quorum
> Version: 1.0.8-042548a451fce8400660f6031f4da6f0223dd5dd
> 2 Nodes configured, 2 expected votes
> 11 Resources configured.
> ============
>
> OFFLINE: [ filer-2 filer-1 ]
>
> As far as I know, killall -9 crmd will release the used FDs. Does anyone
> have any idea how this will work? I tested killing crmd on another cluster
> (without this problem) and all resources were migrated to the second node.
> What can possibly happen in this case, where cluster communication is
> busted? Has anyone ever dealt with a similar problem? Resources are
> currently running on filer-1, the node that had been MASTER before this
> problem occurred.
>
> Packages:
>
> pacemaker - Version: 1.0.8+hg15494-2ubuntu2
> corosync - Version: 1.2.0-0ubuntu1
> cluster-glue - Version: 1.0.5-1
> libcorosync4 - Version: 1.2.0-0ubuntu1
> libheartbeat2 - Version: 1:3.0.3-1ubuntu1
>
> Any help/advice would be really appreciated :)
> --
> Piotr Jewiec
>
> _______________________________________________
> Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org

--
This is my life, and I will live it for as long as God wills.
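
P.S. For anyone who finds this thread with the same symptom: the check in the quoted message (descriptors held by crmd compared against its "Max open files" limit) can be scripted so the leak is noticed before the limit is reached. What follows is only a rough Python sketch, assuming a Linux /proc filesystem and permission to read the target process's entries. The function names and the 90% warning threshold are hypothetical placeholders, not part of Pacemaker or cluster-glue. Note that it counts all open descriptors, not only sockets as the lsof pipeline above does.

```python
import os
import re


def parse_soft_nofile(limits_text):
    """Return the soft 'Max open files' limit found in the text of
    /proc/<pid>/limits, or None if it is 'unlimited' or missing."""
    for line in limits_text.splitlines():
        match = re.match(r"Max open files\s+(\S+)\s+(\S+)", line)
        if match:
            soft = match.group(1)
            return None if soft == "unlimited" else int(soft)
    return None


def fd_usage(pid):
    """Return (open_fds, soft_limit) for a process.
    Assumes Linux; reading another user's process usually needs root."""
    open_fds = len(os.listdir(f"/proc/{pid}/fd"))
    with open(f"/proc/{pid}/limits") as handle:
        soft_limit = parse_soft_nofile(handle.read())
    return open_fds, soft_limit


def near_limit(open_fds, soft_limit, threshold=0.9):
    """True when the descriptor count has reached the given fraction
    of the soft limit. The 0.9 default is an arbitrary assumption."""
    if soft_limit is None:
        return False
    return open_fds >= threshold * soft_limit
```

With the numbers from the report, near_limit(1019, 1024) is true, so a periodic check like near_limit(*fd_usage(pid)) against the crmd PID would have flagged the process well before the "Too many open files" errors started.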