Package: ruby1.8
Version: 1.8.7.72-3
Severity: important
X-Debbugs-cc: [email protected]
Hi there,
I've just managed to capture a reasonable strace of a problem that we've
been seeing with puppet on several platforms. This is on alpha, but I'm
fairly sure we've also seen it on armel and possibly hppa as well.
Basically, sleep doesn't sleep. An eventloop with a wakeup timer just
ends up in a tight loop, which means our slower machines can't do
anything except run a single ruby thread that is supposed to be
sleeping.
I am assuming this is a bug in the thread/select implementation in ruby,
but feel free to reassign if it turns out to be libc/kernel/whatever
related.
Any bugs in the test code are presumably mine - my ruby is fairly mediocre
at best, and my goal was not to write a perfect event loop but to
reproduce as faithfully as possible the event loop that puppet uses
without actually doing anything useful besides the event loop.
Cheers,
Test code (I have puppet installed, but a smaller set of libraries is
probably sufficient):
class SleepTest
require 'sync'
require "puppet/external/event-loop"
include SignalObserver
@@syncresources = {}
def timer
unless defined? @timer
@timer = EventLoop::Timer.new(
:interval => 3600,
:tolerance => 1,
:start? => true
)
EventLoop.current.monitor_timer @timer
end
@timer
end
# Return the sync object associated with a given resource.
def sync(resource)
@@syncresources[resource] ||= Sync.new
return @@syncresources[resource]
end
def threadlock(resource, type = Sync::EX)
self.sync(resource).synchronize(type) do
yield
end
end
def newtimer(hash, &block)
self.timer()
timer = nil
threadlock(:timers) do
@timers ||= []
timer = EventLoop::Timer.new(hash)
@timers << timer
if block_given?
observe_signal(timer, :alarm, &block)
end
end
# In case they need it for something else.
timer
end
def newthread(&block)
@threads ||= []
@threads << Thread.new do
yield
end
end
end
sleepy = SleepTest.new
sleepy.newthread do
begin
timer = sleepy.timer()
EventLoop.monitor_timer timer
FileTest.exist?('/bin/sh')
end
end
EventLoop.run
Strace (begins with Thread.new call):
clone(Process 31149 attached
child_stack=0x20000d70b90,
flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID,
parent_tidptr=0x20000d712f0, tls=0x20000d71910, child_tidptr=0x20000d712f0) =
31149
[pid 31148] osf_sigprocmask(0x1, 0, 0) = 0
[pid 31148] osf_sigprocmask(0x1, 0, 0) = 0
[pid 31148] osf_sigprocmask(0x1, 0, 0) = 0
[pid 31148] osf_sigprocmask(0x1, 0, 0x2000054a2c0) = 0
[pid 31148] osf_sigprocmask(0x1, 0, 0) = 0
[pid 31148] osf_sigprocmask(0x1, 0, 0x2ac1) = 0
[pid 31148] osf_sigprocmask(0x1, 0, 0xb91 <unfinished ...>
[pid 31149] rt_sigprocmask(SIG_BLOCK, ~[RT_0 RT_1], NULL, 8) = 0
[pid 31149] nanosleep({0, 10000000}, <unfinished ...>
[pid 31148] <... osf_sigprocmask resumed> ) = 0
[pid 31148] osf_sigprocmask(0x1, 0, 0x14e9) = 0
[pid 31148] osf_sigprocmask(0x1, 0, 0x1c11) = 0
[pid 31148] osf_sigprocmask(0x1, 0, 0xb91) = 0
[pid 31148] osf_sigprocmask(0x1, 0, 0) = 0
[pid 31148] osf_sigprocmask(0x1, 0, 0) = 0
[pid 31148] osf_sigprocmask(0x1, 0, 0) = 0
[pid 31148] osf_sigprocmask(0x1, 0, 0) = 0
[pid 31148] osf_sigprocmask(0x1, 0, 0) = 0
[pid 31148] osf_sigprocmask(0x1, 0, 0x20000549ed8) = 0
[pid 31148] osf_sigprocmask(0x1, 0, 0) = 0
[pid 31148] osf_sigprocmask(0x1, 0, 0) = 0
[pid 31148] osf_sigprocmask(0x1, 0, 0x20000549e88) = 0
[pid 31148] osf_sigprocmask(0x1, 0, 0) = 0
[pid 31149] <... nanosleep resumed> {4656511788077862093, 0}) = 0
[pid 31149] nanosleep({0, 10000000}, <unfinished ...>
[pid 31148] osf_sigprocmask(0x1, 0, 0) = 0
[pid 31148] osf_sigprocmask(0x1, 0, 0x20000549e38) = 0
[pid 31148] brk(0x1200e2000) = 0x1200e2000
[pid 31148] brk(0x1200de000) = 0x1200de000
[pid 31148] osf_sigprocmask(0x1, 0, 0x12007aaf0) = 0
[pid 31148] sigreturn() = ? (mask now [])
[pid 31148] osf_syscall(0x1, 0, 0xfd9, 0, 0x1, 0x11fa0a720) = -1 EADDRINUSE
(Address already in use)
[pid 31148] osf_syscall(0x1, 0, 0x14e9, 0, 0, 0) = -1 EADDRINUSE (Address
already in use)
[pid 31148] osf_syscall(0x1, 0, 0x1c11, 0, 0, 0) = -1 EADDRINUSE (Address
already in use)
[pid 31148] osf_syscall(0x1, 0, 0xb91, 0, 0, 0) = 48
[pid 31148] osf_syscall(0x1, 0, 0, 0, 0, 0) = -1 EADDRINUSE (Address already in
use)
[pid 31148] osf_syscall(0x1, 0, 0, 0, 0, 0) = 48
[pid 31149] <... nanosleep resumed> {4656511788077862093, 0}) = 0
[pid 31149] nanosleep({0, 10000000}, <unfinished ...>
[pid 31148] osf_syscall(0x1, 0, 0, 0, 0, 0) = 48
[pid 31148] osf_syscall(0x1, 0, 0, 0, 0, 0) = 48
[pid 31148] osf_syscall(0x1, 0, 0, 0, 0x2, 0) = 48
[pid 31148] osf_syscall(0x1, 0, 0x20000549b18, 0, 0x2, 0) = 48
[pid 31148] osf_syscall(0x1, 0, 0x12009e270, 0, 0x2, 0) = 103
[pid 31148] pipe([512, 530563912]) = 3
[pid 31148] osf_sigprocmask(0x1, 0, 0) = 0
[pid 31148] fcntl(3, F_GETFL) = 0 (flags O_RDONLY)
[pid 31148] fstat64(3, {st_mode=S_IFIFO|0600, st_size=0, ...}) = 0
[pid 31148] mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS,
-1, 0) = 0x20000026000
[pid 31148] lseek(3, 0, SEEK_CUR) = -1 ESPIPE (Illegal seek)
[pid 31148] osf_sigprocmask(0x1, 0, 0) = 0
[pid 31148] fcntl(4, F_GETFL) = 0x1 (flags O_WRONLY)
[pid 31148] fstat64(4, {st_mode=S_IFIFO|0600, st_size=0, ...}) = 0
[pid 31148] mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS,
-1, 0) = 0x20000028000
[pid 31148] lseek(4, 0, SEEK_CUR) = -1 ESPIPE (Illegal seek)
[pid 31148] fcntl(3, F_SETFD, FD_CLOEXEC) = 0
[pid 31148] fcntl(4, F_SETFD, FD_CLOEXEC) = 0
[pid 31148] osf_sigprocmask(0x1, 0, 0x2dc4) = 0
[pid 31148] osf_sigprocmask(0x1, 0, 0x1) = 0
[pid 31148] stat64("/usr/local/lib/site_ruby/1.8/fcntl.rb", <unfinished ...>
[pid 31149] <... nanosleep resumed> {4656511788077862093, 0}) = 0
[pid 31149] nanosleep({0, 10000000}, <unfinished ...>
[pid 31148] <... stat64 resumed> 0x11f9fbfb0) = -1 ENOENT (No such file or
directory)
[pid 31148] stat64("/usr/local/lib/site_ruby/1.8/fcntl.so", 0x11f9fbfb0) = -1
ENOENT (No such file or directory)
[pid 31148] stat64("/usr/local/lib/ = -1 ERRNO_425 (Unknown error
425)
[pid 31148] fork() = -1 ERRNO_425 (Unknown error 425)
[pid 31148] fork() = -1 ERRNO_425 (Unknown error 425)
[pid 31148] fork() = -1 ERRNO_425 (Unknown error 425)
[pid 31148] fork() = -1 ERRNO_425 (Unknown error 425)
[pid 31148] fork() = -1 ERRNO_425 (Unknown error 425)
[pid 31148] fork() = -1 ERRNO_425 (Unknown error 425)
[pid 31148] fork() = -1 ERRNO_425 (Unknown error 425)
[pid 31148] fork() = -1 ERRNO_425 (Unknown error 425)
[pid 31148] fork() = -1 ERRNO_425 (Unknown error 425)
[pid 31148] fork() = -1 ERRNO_425 (Unknown error 425)
[pid 31148] fork() = -1 ERRNO_425 (Unknown error 425)
[pid 31148] fork() = -1 ERRNO_425 (Unknown error 425)
[pid 31148] fork() = -1 ERRNO_425 (Unknown error 425)
[pid 31148] osf_syscall(0x2000014a280, 0x11f9fbf90, 0x11f9fbf90, 0,
0x11f9fca50, 0x11f9fca50) = -1 EOPNOTSUPP (Operation not supported)
[pid 31149] <... nanosleep resumed> {4656511788077862093, 0}) = 0
[pid 31148] osf_wait4(2199024607872, <unfinished ...>
[pid 31149] nanosleep({0, 10000000}, <unfinished ...>
[pid 31148] <... osf_wait4 resumed> NULL, WEXITED|WSTOPPED|0x1b0, NULL) = 6
[pid 31148] osf_syscall(0x7, 0, 0, 0, 0x8, 0x1) = 48
[pid 31148] osf_syscall(0x1, 0, 0, 0, 0x8, 0x1) = 103
[pid 31148] fcntl(6, F_GETFL) = 0x1 (flags O_WRONLY)
[pid 31148] fcntl(6, F_SETFL, O_WRONLY|O_NONBLOCK) = 0
[pid 31148] osf_sigprocmask(0x1, 0, 0x2dc9) = 0
[pid 31148] osf_sigprocmask(0x1, 0, 0) = 0
[pid 31148] osf_sigprocmask(0x1, 0, 0) = 0
[pid 31148] osf_sigprocmask(0x1, 0, 0) = 0
[pid 31148] osf_sigprocmask(0x1, 0, 0x2e97) = 0
[pid 31148] osf_sigprocmask(0x1, 0, 0x2e89) = 0
[pid 31148] osf_sigprocmask(0x1, 0, 0x2df7) = 0
[pid 31148] osf_sigprocmask(0x1, 0, 0x2dff) = 0
[pid 31148] select(8, [], [6], [], {0, 0}) = 1 (out [6], left {0, 0})
[pid 31148] osf_sigprocmask(0x1, 0, 0x120072b20) = 0
[pid 31148] sigreturn() = ? (mask now [FPE BUS SYS ALRM URG
TSTP USR1])
[pid 31148] exit(4) = ?
[pid 31149] <... nanosleep resumed> {4656511788077862093, 0}) = 0
[pid 31149] tgkill(31148, 31148, SIGVTALRM <unfinished ...>
[pid 31148] --- SIGVTALRM (Virtual timer expired) @ 0 (0) ---
[pid 31149] <... tgkill resumed> ) = 0
[pid 31149] nanosleep({0, 10000000}, _exit returned!
) = ? (mask now [FPE BUS SYS ALRM URG TSTP USR1])
{4656511788077862093, 0}) = 0
[pid 31149] tgkill(31148, 31148, SIGVTALRM <unfinished ...>
[pid 31148] --- SIGVTALRM (Virtual timer expired) @ 0 (0) ---
[pid 31149] <... tgkill resumed> ) = 0
[pid 31148] sigreturn( <unfinished ...>
[pid 31149] nanosleep({0, 10000000}, {4656511788077862093, 0}) = 0
[pid 31149] tgkill(31148, 31148, SIGVTALRM <unfinished ...>
[pid 31148] --- SIGVTALRM (Virtual timer expired) @ 0 (0) ---
[pid 31149] <... tgkill resumed> ) = 0
[pid 31148] <... sigreturn resumed> ) = ? (mask now [])
[pid 31149] nanosleep({0, 10000000}, {4656511788077862093, 0}) = 0
[pid 31149] tgkill(31148, 31148, SIGVTALRM <unfinished ...>
[pid 31148] --- SIGVTALRM (Virtual timer expired) @ 0 (0) ---
[pid 31149] <... tgkill resumed> ) = 0
[pid 31148] sigreturn( <unfinished ...>
[pid 31149] nanosleep({0, 10000000}, {4656511788077862093, 0}) = 0
[pid 31149] tgkill(31148, 31148, SIGVTALRM <unfinished ...>
[pid 31148] --- SIGVTALRM (Virtual timer expired) @ 0 (0) ---
[pid 31149] <... tgkill resumed> ) = 0
[pid 31148] <... sigreturn resumed> ) = ? (mask now [])
[pid 31149] nanosleep({0, 10000000}, {4656511788077862093, 0}) = 0
[pid 31149] tgkill(31148, 31148, SIGVTALRM <unfinished ...>
[pid 31148] --- SIGVTALRM (Virtual timer expired) @ 0 (0) ---
[pid 31149] <... tgkill resumed> ) = 0
[pid 31148] sigreturn( <unfinished ...>
(loops until timer expires from here on)
-- System Information:
Debian Release: 5.0.2
APT prefers stable
APT policy: (500, 'stable')
Architecture: alpha
Kernel: Linux 2.6.26-2-alpha-generic
Locale: LANG=en_US.utf8, LC_CTYPE=en_US.utf8 (charmap=UTF-8) (ignored: LC_ALL
set to en_US.utf8)
Shell: /bin/sh linked to /bin/bash
Versions of packages ruby1.8 depends on:
ii libc6.1 2.7-18 GNU C Library: Shared libraries
ii libruby1.8 1.8.7.72-3 Libraries necessary to run Ruby 1.
ruby1.8 recommends no packages.
Versions of packages ruby1.8 suggests:
pn rdoc1.8 <none> (no description available)
pn ri1.8 <none> (no description available)
pn ruby1.8-examples <none> (no description available)
-- no debconf information
--
To UNSUBSCRIBE, email to [email protected]
with a subject of "unsubscribe". Trouble? Contact [email protected]