2015-06-02 12:14 GMT+02:00 Jose San Leandro <jose.sanlean...@osoco.es>:

> Hi Thierry,
>
> ConfigurationOfOSProcess-ThierryGoubier.38.mcz, which corresponds to
> version 4.6.2.
>

Ok, then this is the latest.

>
> Another workaround that would work for me is to be able to "resume" a
> previous load attempt of a Metacello project. Or a custom "hook" in
> Metacello to save the image after every dependency is successfully loaded.
>

Yes, this would work.

I'll ask Dave again if he has any idea; the bug is hard to reproduce. Would
you mind telling me the Linux kernel / libc version of your Gentoo box?

Thierry

>
>
> 2015-06-02 11:25 GMT+02:00 Thierry Goubier <thierry.goub...@gmail.com>:
>
>> Hi Jose,
>>
>> yes, I've noticed that as well. It was, at one point, drastic (i.e. almost
>> always a lock-up) on my work development laptop; it now happens far less
>> often (but it does happen to me from time to time).
>>
>> Dave Lewis, the author of OSProcess, fixed one issue which solved most of
>> the lockups I had, but not all of them. The lockup is in the interaction
>> between OSProcess inside Pharo and the external shell command (i.e. it
>> concerns anything which uses OSProcess), and looks like a missed signal.
>> It is also machine and Linux version dependent (Ubuntu 14.10 was horrible,
>> 14.04 and 15.04 on the same hardware are far less sensitive), and seems to
>> also depend on the load of the machine itself.
>>
>> By the way, which version of OSProcess are you using?
>>
>> Thierry
>>
>>
>> 2015-06-02 11:10 GMT+02:00 Jose San Leandro <jose.sanlean...@osoco.es>:
>>
>>> Hi,
>>>
>>> In one of our projects we are using Pharo4. The image gets built by
>>> gradle, which loads the Metacello project. Sometimes we see the build
>>> process hang. It just doesn't progress.
>>>
>>> When adding local gitfiletree:// dependencies manually through
>>> Monticello, Pharo freezes after a while. It's not always the same
>>> repository, and it's not always the same number of repositories before
>>> it hangs.
>>>
>>> I launched the image with strace, and attached gdb to the frozen process.
>>> It turns out it's waiting for a lock that never gets released.
>>>
>>> The environment is a 64-bit Gentoo Linux with enough of everything
>>> (multiple monitors, multiple cores, enough RAM).
>>>
>>> I hope somebody can point me to how to dig deeper into this.
>>>
>>> # gdb
>>> (gdb) attach [pid]
>>> [..]
>>> Reading symbols from /usr/lib32/libbz2.so.1...(no debugging symbols
>>> found)...done.
>>> Loaded symbols for /usr/lib32/libbz2.so.1
>>> 0x0809d8bb in signalSemaphoreWithIndex ()
>>> (gdb) backtrace
>>> #0 0x0809d8bb in signalSemaphoreWithIndex ()
>>> #1 0x0810868c in handleSignal ()
>>> #2 <signal handler called>
>>> #3 0x0809d8c8 in signalSemaphoreWithIndex ()
>>> #4 0x0809f0af in aioPoll ()
>>> #5 0xf76f9671 in display_ioRelinquishProcessorForMicroseconds () from
>>> /home/chous/realhome/toolbox/pharo-5.0/pharo-vm/vm-display-X11
>>> #6 0x080a1887 in ioRelinquishProcessorForMicroseconds ()
>>> #7 0x080767fa in primitiveRelinquishProcessor ()
>>> #8 0xb6fc838c in ?? ()
>>> #9 0xb6fc3700 in ?? ()
>>> #10 0xb7952882 in ?? ()
>>> #11 0xb6fc3648 in ?? ()
>>> (gdb) disassemble
>>> Dump of assembler code for function handleSignal:
>>> 0x081085e0 <+0>: sub $0x9c,%esp
>>> 0x081085e6 <+6>: mov %ebx,0x90(%esp)
>>> 0x081085ed <+13>: mov 0xa0(%esp),%ebx
>>> 0x081085f4 <+20>: mov %esi,0x94(%esp)
>>> 0x081085fb <+27>: mov %edi,0x98(%esp)
>>> 0x08108602 <+34>: movzbl 0x8168420(%ebx),%esi
>>> 0x08108609 <+41>: mov %ebx,%eax
>>> 0x0810860b <+43>: mov %esi,%edx
>>> 0x0810860d <+45>: call 0x81070d0 <forwardSignaltoSemaphoreAt>
>>> 0x08108612 <+50>: call 0x805aae0 <pthread_self@plt>
>>> 0x08108617 <+55>: mov 0x8168598,%edi
>>> 0x0810861d <+61>: cmp %edi,%eax
>>> 0x0810861f <+63>: je 0x8108680 <handleSignal+160>
>>> 0x08108621 <+65>: lea 0x10(%esp),%esi
>>> 0x08108625 <+69>: mov %esi,(%esp)
>>> 0x08108628 <+72>: call 0x805b330 <sigemptyset@plt>
>>> 0x0810862d <+77>: mov %ebx,0x4(%esp)
>>> 0x08108631 <+81>: mov %esi,(%esp)
>>> 0x08108634 <+84>: call 0x805b0c0 <sigaddset@plt>
>>> 0x08108639 <+89>: movl $0x0,0x8(%esp)
>>> 0x08108641 <+97>: mov %esi,0x4(%esp)
>>> 0x08108645 <+101>: movl $0x0,(%esp)
>>> 0x0810864c <+108>: call 0x805ada0 <pthread_sigmask@plt>
>>> 0x08108651 <+113>: mov %ebx,0x4(%esp)
>>> 0x08108655 <+117>: mov %edi,(%esp)
>>> 0x08108658 <+120>: call 0x805b240 <pthread_kill@plt>
>>> 0x0810865d <+125>: mov 0x90(%esp),%ebx
>>> 0x08108664 <+132>: mov 0x94(%esp),%esi
>>> 0x0810866b <+139>: mov 0x98(%esp),%edi
>>> 0x08108672 <+146>: add $0x9c,%esp
>>> 0x08108678 <+152>: ret
>>> 0x08108679 <+153>: lea 0x0(%esi,%eiz,1),%esi
>>> 0x08108680 <+160>: test %esi,%esi
>>> 0x08108682 <+162>: je 0x810865d <handleSignal+125>
>>> 0x08108684 <+164>: mov %esi,(%esp)
>>> 0x08108687 <+167>: call 0x809d8a0 <signalSemaphoreWithIndex>
>>> => 0x0810868c <+172>: jmp 0x810865d <handleSignal+125>
>>> End of assembler dump.
>>> (gdb) up 3
>>> (gdb) disassemble
>>> Dump of assembler code for function signalSemaphoreWithIndex:
>>> 0x0809d8a0 <+0>: push %esi
>>> 0x0809d8a1 <+1>: xor %eax,%eax
>>> 0x0809d8a3 <+3>: push %ebx
>>> 0x0809d8a4 <+4>: sub $0x24,%esp
>>> 0x0809d8a7 <+7>: mov 0x30(%esp),%esi
>>> 0x0809d8ab <+11>: test %esi,%esi
>>> 0x0809d8ad <+13>: jle 0x809d918 <signalSemaphoreWithIndex+120>
>>> 0x0809d8af <+15>: mov $0x1,%edx
>>> 0x0809d8b4 <+20>: lea 0x0(%esi,%eiz,1),%esi
>>> 0x0809d8b8 <+24>: mfence
>>> 0x0809d8bb <+27>: mov $0x0,%eax
>>> 0x0809d8c0 <+32>: lock cmpxchg %edx,0x8152d80
>>> => 0x0809d8c8 <+40>: mov %eax,0x1c(%esp)
>>> 0x0809d8cc <+44>: mov 0x1c(%esp),%eax
>>> 0x0809d8d0 <+48>: test %eax,%eax
>>> 0x0809d8d2 <+50>: jne 0x809d8b8 <signalSemaphoreWithIndex+24>
>>> 0x0809d8d4 <+52>: mov 0x8152d84,%edx
>>> 0x0809d8da <+58>: cmp $0x1ff,%edx
>>> 0x0809d8e0 <+64>: lea 0x1(%edx),%ebx
>>> 0x0809d8e3 <+67>: cmove %eax,%ebx
>>> 0x0809d8e6 <+70>: mov 0x8152d88,%eax
>>> 0x0809d8eb <+75>: cmp %ebx,%eax
>>> 0x0809d8ed <+77>: je 0x809d920 <signalSemaphoreWithIndex+128>
>>> 0x0809d8ef <+79>: mov 0x8152d84,%eax
>>> 0x0809d8f4 <+84>: mov %esi,0x8152da0(,%eax,4)
>>> 0x0809d8fb <+91>: mfence
>>> 0x0809d8fe <+94>: mov %ebx,0x8152d84
>>> 0x0809d904 <+100>: movl $0x0,0x8152d80
>>> 0x0809d90e <+110>: call 0x807c2c0 <forceInterruptCheck>
>>> 0x0809d913 <+115>: mov $0x1,%eax
>>> 0x0809d918 <+120>: add $0x24,%esp
>>> 0x0809d91b <+123>: pop %ebx
>>> 0x0809d91c <+124>: pop %esi
>>> 0x0809d91d <+125>: ret
>>> 0x0809d91e <+126>: xchg %ax,%ax
>>> 0x0809d920 <+128>: movl $0x810c888,(%esp)
>>> 0x0809d927 <+135>: movl $0x0,0x8152d80
>>> 0x0809d931 <+145>: call 0x80a3720 <error>
>>> 0x0809d936 <+150>: jmp 0x809d8ef <signalSemaphoreWithIndex+79>
>>> End of assembler dump.
>>>
>>> Meanwhile, strace gets frozen showing this:
>>> [..]
>>> clone(child_stack=0,
>>> flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD,
>>> child_tidptr=0x7f63665cd9d0) = 3736
>>> rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
>>> rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0
>>> rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
>>> rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0
>>> rt_sigaction(SIGINT, {0x42a8a0, [], SA_RESTORER, 0x7f6365ba3ad0},
>>> {SIG_DFL, [], SA_RESTORER, 0x7f6365ba3ad0}, 8) = 0
>>> wait4(-1, 0x7ffc4ef7f7e8, 0, NULL) = ? ERESTARTSYS (To be restarted
>>> if SA_RESTART is set)
>>> --- SIGWINCH {si_signo=SIGWINCH, si_code=SI_KERNEL} ---
>>> wait4(-1,
>>>
>>
>>
>
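One possible reading of the gdb backtrace quoted in this thread: signalSemaphoreWithIndex shows up both inside the signal handler (frame #0) and in the interrupted code (frame #3). That would be consistent with the handler spinning on a lock held by the very frame it interrupted, but it is only an interpretation of the dump, not a confirmed diagnosis. The sketch below checks a pasted backtrace for that pattern. The function name `reentered_functions` and the parsing regex are hypothetical helpers written for this illustration; they are not part of gdb, Pharo, or OSProcess.

```python
import re

# Matches gdb frame lines such as "#3 0x0809d8c8 in signalSemaphoreWithIndex ()".
# The address part is optional because gdb omits it for some frames.
FRAME_RE = re.compile(r"#(\d+)\s+(?:0x[0-9a-fA-F]+\s+in\s+)?(\S+)\s*\(")
SIGNAL_MARKER = "<signal handler called>"


def reentered_functions(backtrace: str) -> list[str]:
    """Return functions that appear both inside the signal handler
    (frames before the marker) and in the interrupted code (frames after)."""
    inner, outer = [], []
    seen_marker = False
    for raw in backtrace.splitlines():
        line = raw.strip().lstrip("> ")  # tolerate mail quoting like ">>> "
        if SIGNAL_MARKER in line:
            seen_marker = True
            continue
        match = FRAME_RE.match(line)
        if not match or match.group(2) == "??":
            continue  # skip continuation lines and unresolved frames
        (outer if seen_marker else inner).append(match.group(2))
    result = []
    for name in inner:
        if name in outer and name not in result:
            result.append(name)
    return result


# Frames copied from the backtrace in this thread.
BACKTRACE = """\
#0 0x0809d8bb in signalSemaphoreWithIndex ()
#1 0x0810868c in handleSignal ()
#2 <signal handler called>
#3 0x0809d8c8 in signalSemaphoreWithIndex ()
#4 0x0809f0af in aioPoll ()
#6 0x080a1887 in ioRelinquishProcessorForMicroseconds ()
#7 0x080767fa in primitiveRelinquishProcessor ()
#8 0xb6fc838c in ?? ()
"""

if __name__ == "__main__":
    print(reentered_functions(BACKTRACE))
```

In the disassembly of signalSemaphoreWithIndex, the loop from +24 to +50 retries the `lock cmpxchg` on 0x8152d80 until it reads zero, and frame #3 was stopped at +40, directly after that instruction. If the interrupted frame had just taken the lock, the handler's own call into the same function could never get it, which would match the "lock that never gets released" observation. A function flagged by the check above is a candidate to inspect, nothing more.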