Hi,

I've recently started having a problem where the puppetd processes on some of my clients lock up (the puppetdlock file is several hours old). My server is running Puppet 2.7.12 on CentOS 6.2 and my clients are running Puppet 2.7.12 on Scientific Linux 6.2.

If I check the puppetdlock file, it contains the PID of the currently "running" puppetd. If I restart puppetd, it's fine for a while, but sooner or later I end up in the same state.

If I run strace against the puppetd process, I get:

# strace -p 10726
Process 10726 attached - interrupt to quit
select(8, [7], NULL, NULL, {1, 560249}) = 0 (Timeout)
rt_sigprocmask(SIG_BLOCK, NULL, [], 8) = 0
rt_sigprocmask(SIG_BLOCK, NULL, [], 8) = 0
select(8, [7], NULL, NULL, {2, 0}) = 0 (Timeout)
rt_sigprocmask(SIG_BLOCK, NULL, [], 8) = 0
rt_sigprocmask(SIG_BLOCK, NULL, [], 8) = 0
select(8, [7], NULL, NULL, {2, 0}) = 0 (Timeout)
rt_sigprocmask(SIG_BLOCK, NULL, [], 8) = 0
rt_sigprocmask(SIG_BLOCK, NULL, [], 8) = 0
select(8, [7], NULL, NULL, {2, 0}) = 0 (Timeout)
rt_sigprocmask(SIG_BLOCK, NULL, [], 8) = 0
rt_sigprocmask(SIG_BLOCK, NULL, [], 8) = 0
^C <unfinished ...>
Process 10726 detached

If I run lsof, I get:

# lsof -p 10726
COMMAND   PID USER   FD   TYPE             DEVICE SIZE/OFF     NODE NAME
puppetd 10726 root  cwd    DIR                8,1     4096        2 /
puppetd 10726 root  rtd    DIR                8,1     4096        2 /
puppetd 10726 root  txt    REG                8,1    10576  8151417 /usr/bin/ruby
[...]
puppetd 10726 root  mem    REG                8,1    26050  8153796 /usr/lib64/gconv/gconv-modules.cache
puppetd 10726 root    0r   CHR                1,3      0t0     3820 /dev/null
puppetd 10726 root    1w   CHR                1,3      0t0     3820 /dev/null
puppetd 10726 root    2w   CHR                1,3      0t0     3820 /dev/null
puppetd 10726 root    3r  FIFO                0,8      0t0 17283753 pipe
puppetd 10726 root    4w  FIFO                0,8      0t0 17283753 pipe
puppetd 10726 root    5u  unix 0xffff88013680b0c0      0t0 17283804 socket
puppetd 10726 root    6u   REG                8,1     6045  3145906 /var/log/puppet/http.log
puppetd 10726 root    7u  IPv4           17283830      0t0      TCP *:8139 (LISTEN)

If I look at what puppet is running:

# ps -elfw | grep 10726
5 S root 10726     1 0 81 1 - 61549 poll_s 15:15 ? 00:00:17 /usr/bin/ruby /usr/sbin/puppetd --debug --verbose
0 Z root 11429 10726 0 81 1 -     0 exit   15:39 ? 00:00:00 [sh] <defunct>

Help?

...dave
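
The lock file check described above (an old puppetdlock whose PID still belongs to a live process) could be automated roughly as follows. This is only a sketch in Python: the function name `lock_status`, the one-hour threshold, and the lock file path in the usage note are all assumptions for illustration, not anything taken from Puppet itself.

```python
import os
import time


def lock_status(lock_path, max_age_seconds=3600, now=None):
    """Report whether a lock file looks stale (hypothetical helper).

    Returns None if the lock file does not exist, otherwise a dict with
    the lock's age in seconds, the PID it names (or None if the content
    is not a number), whether that PID still exists, and whether the
    lock is older than max_age_seconds.
    """
    try:
        mtime = os.path.getmtime(lock_path)
        with open(lock_path) as handle:
            content = handle.read().strip()
    except FileNotFoundError:
        return None

    current = time.time() if now is None else now
    age = current - mtime
    pid = int(content) if content.isdigit() and int(content) > 0 else None

    alive = False
    if pid is not None:
        try:
            os.kill(pid, 0)  # signal 0 sends nothing; it only checks the PID
            alive = True
        except ProcessLookupError:
            alive = False
        except PermissionError:
            alive = True  # the process exists but belongs to another user

    return {"age": age, "pid": pid, "alive": alive,
            "stale": age > max_age_seconds}
```

Calling `lock_status("/var/lib/puppet/state/puppetdlock")` (the path is an assumption; check `statedir` on your own clients) and getting back both `stale` and `alive` as true would correspond to the stuck state shown above: an hours-old lock held by a puppetd that is still running.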