Unfortunately it doesn't fix it, or at least I get the same symptoms.

Sending SIGUSR1 prints this:

SIGUSR1 Wed Jun  3 16:53:50 2015


/home/chous/toolbox/pharo-4.0/pharo-vm/pharo
pharo VM version: 3.9-7 #1 Thu Apr  2 00:51:45 CEST 2015 gcc 4.6.3
[Production ITHB VM]
Built from: NBCoInterpreter NativeBoost-CogPlugin-EstebanLorenzano.21 uuid:
4d9b9bdf-2dfa-4c0b-99eb-5b110dadc697 Apr  2 2015
With: NBCogit NativeBoost-CogPlugin-EstebanLorenzano.21 uuid:
4d9b9bdf-2dfa-4c0b-99eb-5b110dadc697 Apr  2 2015
Revision: https://github.com/pharo-project/pharo-vm.git Commit:
32d18ba0f2db9bee7f3bdbf16bdb24fe4801cfc5 Date: 2015-03-24 11:08:14 +0100
By: Esteban Lorenzano <esteba...@gmail.com> Jenkins build #14904
Build host: Linux pharo-linux 3.2.0-31-generic-pae #50-Ubuntu SMP Fri Sep 7
16:39:45 UTC 2012 i686 i686 i386 GNU/Linux
plugin path: /home/chous/toolbox/pharo-4.0/pharo-vm/ [default:
/home/chous/toolbox/pharo-4.0/pharo-vm/]


C stack backtrace & registers:
        eax 0xff981e94 ebx 0xff981db0 ecx 0xff981e48 edx 0xff981dfc
        edi 0xff981c80 esi 0xff981c80 ebp 0xff981d18 esp 0xff981d64
        eip 0xff981f78
*[0xff981f78]
/home/chous/toolbox/pharo/pharo-vm/pharo[0x80a33a2]
/home/chous/toolbox/pharo/pharo-vm/pharo[0x80a3649]
linux-gate.so.1(__kernel_rt_sigreturn+0x0)[0xf773acc0]
/home/chous/toolbox/pharo/pharo-vm/pharo(signalSemaphoreWithIndex+0x28)[0x809d8c8]
/home/chous/toolbox/pharo/pharo-vm/pharo[0x810868c]
linux-gate.so.1(__kernel_sigreturn+0x0)[0xf773acb0]
/home/chous/toolbox/pharo/pharo-vm/pharo(signalSemaphoreWithIndex+0x5e)[0x809d8fe]
/home/chous/toolbox/pharo/pharo-vm/pharo(aioPoll+0x22f)[0x809f0af]
/home/chous/toolbox/pharo-4.0/pharo-vm/vm-display-X11(+0xe671)[0xf772a671]
/home/chous/toolbox/pharo/pharo-vm/pharo(ioRelinquishProcessorForMicroseconds+0x17)[0x80a1887]
/home/chous/toolbox/pharo/pharo-vm/pharo[0x80767fa]
[0xb4a2fe0c]
[0xb4a2d700]
[0xb53b9382]
[0xb4a2d648]
[0x5b]


All Smalltalk process stacks (active first):
Process 0xb6d930c4 priority 10
0xff9ad450 M ProcessorScheduler class>idleProcess 0xb4d935c0: a(n)
ProcessorScheduler class
0xff9ad470 I [] in ProcessorScheduler class>startUp 0xb4d935c0: a(n)
ProcessorScheduler class
0xff9ad490 I [] in BlockClosure>newProcess 0xb6d92fe8: a(n) BlockClosure

suspended processes
Process 0xb68e1984 priority 50
0xff9a6490 M WeakArray class>finalizationProcess 0xb4d93790: a(n) WeakArray
class
0xb69beb68 s [] in WeakArray class>restartFinalizationProcess
0xb68e1924 s [] in BlockClosure>newProcess

Process 0xb5ced038 priority 80
0xff9af490 M DelayMicrosecondScheduler>runTimerEventLoop 0xb5bb6f9c: a(n)
DelayMicrosecondScheduler
0xb6098314 s [] in DelayMicrosecondScheduler>startTimerEventLoop
0xb5cecfd8 s [] in BlockClosure>newProcess

Process 0xb68ec880 priority 40
0xff9b2478 M [] in UnixOSProcessAccessor>(nil) 0xb60dc6d0: a(n)
UnixOSProcessAccessor
0xff9b2490 M BlockClosure>repeat 0xb68ef2d4: a(n) BlockClosure
0xb68ef278 s [] in UnixOSProcessAccessor>(nil)
0xb68ec820 s [] in BlockClosure>newProcess

Process 0xb6d92d78 priority 60
0xff98742c M InputEventFetcher>waitForInput 0xb5a09718: a(n)
InputEventFetcher
0xff987450 M InputEventFetcher>eventLoop 0xb5a09718: a(n) InputEventFetcher
0xff987470 I [] in InputEventFetcher>installEventLoop 0xb5a09718: a(n)
InputEventFetcher
0xff987490 I [] in BlockClosure>newProcess 0xb6d92c9c: a(n) BlockClosure

Process 0xb6f25f94 priority 60
0xb6f25fcc s SmalltalkImage>lowSpaceWatcher
0xb71523e4 s [] in SmalltalkImage>installLowSpaceWatcher
0xb6f25f34 s [] in BlockClosure>newProcess

Process 0xb73a4e7c priority 30
0xff99b470 M [] in AioEventHandler>handleExceptions:readEvents:writeEvents:
0xb73a49e4: a(n) AioEventHandler
0xff99b490 I [] in BlockClosure>newProcess 0xb73a4d90: a(n) BlockClosure
Process 0xb6686c88 priority 40
0xffa073d0 M [] in Delay>wait 0xb73a63fc: a(n) Delay
0xffa073f0 M BlockClosure>ifCurtailed: 0xb73a6614: a(n) BlockClosure
0xffa0740c M Delay>wait 0xb73a63fc: a(n) Delay
0xffa07428 M PipeableOSProcess(PipeJunction)>outputOn: 0xb73a0d34: a(n)
PipeableOSProcess
0xffa07444 M PipeableOSProcess(PipeJunction)>output 0xb73a0d34: a(n)
PipeableOSProcess
0xffa0746c M [] in MCFileTreeGitRepository class>runOSProcessGitCommand:in:
0xb611fa88: a(n) MCFileTreeGitRepository class
0xffa0748c M BlockClosure>ensure: 0xb739d9dc: a(n) BlockClosure
0xff9e538c M MCFileTreeGitRepository class>runOSProcessGitCommand:in:
0xb611fa88: a(n) MCFileTreeGitRepository class
0xff9e53ac M MCFileTreeGitRepository class>runGitCommand:in: 0xb611fa88:
a(n) MCFileTreeGitRepository class
0xff9e53cc M MCFileTreeGitRepository>gitCommand:in: 0xb612926c: a(n)
MCFileTreeGitRepository
0xff9e53f4 M MCFileTreeGitRepository>gitVersionsForPackage: 0xb612926c:
a(n) MCFileTreeGitRepository
0xff9e543c M [] in MCFileTreeGitRepository>loadAllFileNames 0xb612926c:
a(n) MCFileTreeGitRepository
0xff9e5458 M FileSystemDirectoryEntry(Object)>in: 0xb71b3fe8: a(n)
FileSystemDirectoryEntry
0xff9e548c M [] in MCFileTreeGitRepository>loadAllFileNames 0xb612926c:
a(n) MCFileTreeGitRepository
0xffa04310 M BlockClosure>cull: 0xb71b4894: a(n) BlockClosure
0xffa04338 I [] in Job>run 0xb71b48b4: a(n) Job
0xffa04350 M BlockClosure>on:do: 0xb71b56b8: a(n) BlockClosure
0xffa0437c I [] in Job>run 0xb71b48b4: a(n) Job
0xffa0439c M BlockClosure>ensure: 0xb71b4980: a(n) BlockClosure
0xffa043c4 I Job>run 0xb71b48b4: a(n) Job
0xffa043e4 I MorphicUIManager(UIManager)>displayProgress:from:to:during:
0xb50a8790: a(n) MorphicUIManager
0xffa04414 I ByteString(String)>displayProgressFrom:to:during: 0xb61238d8:
a(n) ByteString
0xffa04444 M MCFileTreeGitRepository>loadAllFileNames 0xb612926c: a(n)
MCFileTreeGitRepository
0xffa04464 I MCFileTreeGitRepository>allFileNames 0xb612926c: a(n)
MCFileTreeGitRepository
0xffa0448c M MCFileTreeGitRepository>goferVersionFrom: 0xb612926c: a(n)
MCFileTreeGitRepository
0xff9e238c I
MetacelloCachingGoferResolvedReference(GoferResolvedReference)>version
0xb71b3134: a(n) MetacelloCachingGoferResolvedReference
0xff9e23a4 M MetacelloCachingGoferResolvedReference>version 0xb71b3134:
a(n) MetacelloCachingGoferResolvedReference
0xff9e23bc M [] in
MetacelloFetchingMCSpecLoader>resolveDependencies:nearest:into: 0xb706d83c:
a(n) MetacelloFetchingMCSpecLoader
0xff9e23e0 M OrderedCollection>do: 0xb71b3234: a(n) OrderedCollection
0xff9e240c M [] in
MetacelloFetchingMCSpecLoader>resolveDependencies:nearest:into: 0xb706d83c:
a(n) MetacelloFetchingMCSpecLoader
0xff9e2424 M BlockClosure>on:do: 0xb71b3334: a(n) BlockClosure
0xff9e244c M
MetacelloFetchingMCSpecLoader>resolveDependencies:nearest:into: 0xb706d83c:
a(n) MetacelloFetchingMCSpecLoader
0xff9e2490 M [] in
MetacelloFetchingMCSpecLoader>linearLoadPackageSpec:gofer: 0xb706d83c: a(n)
MetacelloFetchingMCSpecLoader
0xff9ae318 M MetacelloPharo30Platform(MetacelloPlatform)>do:displaying:
0xb50e8b94: a(n) MetacelloPharo30Platform
0xff9ae338 M MetacelloFetchingMCSpecLoader>linearLoadPackageSpec:gofer:
0xb706d83c: a(n) MetacelloFetchingMCSpecLoader
0xff9ae358 M MetacelloPackageSpec>loadUsing:gofer: 0xb706be54: a(n)
MetacelloPackageSpec
0xff9ae37c M [] in
MetacelloFetchingMCSpecLoader(MetacelloCommonMCSpecLoader)>linearLoadPackageSpecs:repositories:
0xb706d83c: a(n) MetacelloFetchingMCSpecLoader
0xff9ae3a0 M OrderedCollection>do: 0xb70c807c: a(n) OrderedCollection
0xff9ae3c0 M
MetacelloFetchingMCSpecLoader(MetacelloCommonMCSpecLoader)>linearLoadPackageSpecs:repositories:
0xb706d83c: a(n) MetacelloFetchingMCSpecLoader
0xff9ae3f0 I [] in
MetacelloFetchingMCSpecLoader>linearLoadPackageSpecs:repositories:
0xb706d83c: a(n) MetacelloFetchingMCSpecLoader
0xff9ae410 M BlockClosure>ensure: 0xb70c813c: a(n) BlockClosure
0xff9ae438 I MetacelloLoaderPolicy>pushLoadDirective:during: 0xb706cb7c:
a(n) MetacelloLoaderPolicy
0xff9ae460 I MetacelloLoaderPolicy>pushLinearLoadDirectivesDuring:for:
0xb706cb7c: a(n) MetacelloLoaderPolicy
0xff9ae488 I
MetacelloFetchingMCSpecLoader>linearLoadPackageSpecs:repositories:
0xb706d83c: a(n) MetacelloFetchingMCSpecLoader
0xb70c33c0 s MetacelloFetchingMCSpecLoader(MetacelloCommonMCSpecLoader)>load
0xb706d898 s MetacelloMCVersionSpecLoader>load
0xb71948d0 s MetacelloMCVersion>executeLoadFromArray:
0xb719492c s [] in MetacelloMCVersion>fetchRequiredFromArray:
0xb7194988 s [] in
MetacelloPharo30Platform(MetacelloPlatform)>useStackCacheDuring:defaultDictionary:
0xb706d96c s BlockClosure>on:do:
0xb706d2d8 s
MetacelloPharo30Platform(MetacelloPlatform)>useStackCacheDuring:defaultDictionary:
0xb706d258 s [] in MetacelloMCVersion>fetchRequiredFromArray:
0xb71949e4 s BlockClosure>ensure:
0xb706d15c s [] in MetacelloMCVersion>fetchRequiredFromArray:
0xb706d1e4 s MetacelloPharo30Platform(MetacelloPlatform)>do:displaying:
0xb706d0e4 s MetacelloMCVersion>fetchRequiredFromArray:
0xb706ccc0 s [] in MetacelloMCVersion>doLoadRequiredFromArray:
0xb715327c s BlockClosure>ensure:
0xb706cc34 s MetacelloMCVersion>doLoadRequiredFromArray:
0xb71532d8 s MetacelloMCVersion>load
0xb7153334 s UndefinedObject>(nil)
0xb7153390 s OpalCompiler>evaluate
0xb706ab30 s RubSmalltalkEditor>evaluate:andDo:
0xb706a7f4 s RubSmalltalkEditor>highlightEvaluateAndDo:
0xb7152edc s [] in
GLMMorphicPharoPlaygroundRenderer(GLMMorphicPharoCodeRenderer)>actOnHighlightAndEvaluate:
0xb7152f38 s RubEditingArea(RubAbstractTextArea)>handleEdit:
0xb706a784 s [] in
GLMMorphicPharoPlaygroundRenderer(GLMMorphicPharoCodeRenderer)>actOnHighlightAndEvaluate:
0xb7152f94 s WorldState>runStepMethodsIn:
0xb7152ff0 s WorldMorph>runStepMethods
0xb706a1cc s WorldState>doOneCycleNowFor:
0xb715304c s WorldState>doOneCycleFor:
0xb71530a8 s WorldMorph>doOneCycle
0xb6686f8c s [] in MorphicUIManager>spawnNewProcess
0xb6686c28 s [] in BlockClosure>newProcess

Most recent primitives
primCreatePipe
new:
at:put:
at:put:
basicNew
basicNew:
basicNew
basicNew:
primSQFileSetBlocking:
basicNew:
basicAt:put:
basicNew:
basicAt:put:
at:put:
basicNew
primSigPipeNumber
basicNew
wait
at:put:
signal
primForwardSignal:toSemaphore:
wait
at:put:
signal
primCreatePipe
new:
at:put:
at:put:
basicNew
basicNew:
basicNew
basicNew:
primSQFileSetNonBlocking:
basicNew:
basicAt:put:
basicNew:
basicAt:put:
at:put:
basicNew
signal
basicNew:
basicAt:put:
basicNew:
basicAt:put:
at:put:
new:
basicNew
new:
replaceFrom:to:with:startingAt:
basicNew
basicNew:
primSQFileSetNonBlocking:
basicNew
stringHash:initialHash:
primOSFileHandle:
basicNew
wait
at:put:
signal
primAioEnable:forSemaphore:externalObject:
basicNew
objectAt:
basicNew:
stackp:
basicNew
primitiveResume
wait
wait
signal
wait
signal
primAioHandle:exceptionEvents:readEvents:writeEvents:
signal
basicNew:
basicAt:put:
primSQFileSetNonBlocking:
basicNew:
basicAt:put:
basicNew:
basicAt:put:
at:put:
basicNew
basicNew
wait
signal
primUTCMicrosecondsClock
+
>=
+
<
primSignal:atUTCMicroseconds:
wait
signal
wait
wait
relinquishProcessorForMicroseconds:
relinquishProcessorForMicroseconds:
primUTCMicrosecondsClock
>=
signal
+
primSignal:atUTCMicroseconds:
wait
basicNew
basicNew
basicNew
basicNew
signal
basicNew
signal
basicNew
new:
wait
new:
at:put:
at:put:
at:put:
basicNew:
at:put:
basicNew:
replaceFrom:to:with:startingAt:
replaceFrom:to:with:startingAt:
basicNew
new:
at:put:
new:
basicNew:
replaceFrom:to:with:startingAt:
replaceFrom:to:with:startingAt:
at:put:
basicNew:
replaceFrom:to:with:startingAt:
replaceFrom:to:with:startingAt:
at:put:
at:put:
at:put:
new:
replaceFrom:to:with:startingAt:
primSizeOfPointer
new:
at:put:
at:put:
at:put:
primSizeOfPointer
basicNew:
basicNew
at:put:
at:put:
at:put:
at:put:
at:put:
at:put:
at:put:
at:put:
at:put:
at:put:
at:put:
at:put:
at:put:
at:put:
at:put:
at:put:
replaceFrom:to:with:startingAt:
replaceFrom:to:with:startingAt:
replaceFrom:to:with:startingAt:
new:
basicNew
new:
at:put:
at:put:
at:put:
at:put:
at:put:
at:put:
new:
replaceFrom:to:with:startingAt:
new:
at:put:
at:put:
primGetCurrentWorkingDirectory
basicNew:
replaceFrom:to:with:startingAt:
replaceFrom:to:with:startingAt:
primForkExec:stdIn:stdOut:stdErr:argBuf:argOffsets:envBuf:envOffsets:workingDir:
primGetPid
primGetPid
primGetPid
basicNew
basicNew
wait
at:put:
signal
wait
shallowCopy
new:
replaceFrom:to:with:startingAt:
signal
wait
replaceFrom:to:with:startingAt:
at:put:
signal
primCloseNoError:
primCloseNoError:
primCloseNoError:
signal
basicNew:
basicNew
basicNew
basicNew
wait
signal
primUTCMicrosecondsClock
+
>=
+
<
primSignal:atUTCMicroseconds:
wait
signal
wait
relinquishProcessorForMicroseconds:
relinquishProcessorForMicroseconds:
relinquishProcessorForMicroseconds:
relinquishProcessorForMicroseconds:
relinquishProcessorForMicroseconds:
relinquishProcessorForMicroseconds:
relinquishProcessorForMicroseconds:
basicNew:
primRead:into:startingAt:count:
basicNew
signal
wait
basicNew:
basicNew
basicNew:
replaceFrom:to:with:startingAt:
replaceFrom:to:with:startingAt:
signal
basicNew
signal
basicNew
new:
wait
signal
wait
signal
primAioHandle:exceptionEvents:readEvents:writeEvents:
signal
wait
relinquishProcessorForMicroseconds:
relinquishProcessorForMicroseconds:
relinquishProcessorForMicroseconds:
relinquishProcessorForMicroseconds:
relinquishProcessorForMicroseconds:
relinquishProcessorForMicroseconds:
relinquishProcessorForMicroseconds:

stack page bytes 4096 available headroom 3300 minimum unused headroom 2152

        (SIGUSR1)


2015-06-03 14:15 GMT+02:00 David T. Lewis <le...@mail.msen.com>:

> On Wed, Jun 03, 2015 at 07:05:15AM +0200, Thierry Goubier wrote:
> > Hi Dave,
> >
> > On 03/06/2015 03:15, David T. Lewis wrote:
> > >Hi Thierry and Jose,
> > >
> > >I am reading this thread with interest and will help if I can.
> > >
> > >I do have one idea that we have not tried before. I have a theory
> > >that this may be an intermittent problem caused by SIGCHLD signals
> > >(from the external OS process when it exits) being missed by the
> > >UnixOSProcessAccessor>>grimReaperProcess that handles them.
> > >
> > >If this is happening, then I may be able to change grimReaperProcess to
> > >work around the problem.
> > >
> > >When you see the OS deadlock condition, are you able to tell if your
> > >Pharo VM process has subprocesses in the zombie state (indicating that
> > >grimReaperProcess did not clean them up)? The unix command
> > >"ps -axf | less" will let you look at the process tree and that may
> > >give us a clue if this is happening.
> >
> > I found it very easy to reproduce, and I do have a zombie child
> > process of the pharo process.
>
> Jose confirms this also (thanks).
>
> Can you try filing in the attached UnixOSProcessAccessor>>grimReaperProcess
> and see if it helps? I do not know if it will make a difference, but the
> idea is to put a timeout on the semaphore that is waiting for signals from
> SIGCHLD. I am hoping that if these signals are sometimes being missed, then
> the timeout will allow the process to recover from the problem.
>
>
> >
> > Interestingly enough, the lock-up happens in a very specific place, a call
> > to git branch, which is a very short command returning just a few
> > characters (where all other commands have longer outputs). Reducing the
> > frequency of the calls to git branch by a bit of caching reduces the
> > chances of a lock-up.
> >
>
> This is a good clue, and it may indicate a different kind of problem (so
> maybe I am looking in the wrong place). Ben's suggestion of adding a delay
> to the external process sounds like a good idea to help troubleshoot it.
>
> Dave
>
>
>
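
For anyone following the thread, here is a rough sketch of the timeout idea
Dave describes above: wait for the SIGCHLD notification, but only for a
bounded time, and then poll for exited children whether or not the
notification arrived. It is written in Python rather than Smalltalk and is
not the attached grimReaperProcess; the names reap_children and reaper_loop,
the 0.2 second timeout, and the on_exit callback are all made up for
illustration.

```python
import os
import threading


def reap_children(pids):
    """Poll the given child pids without blocking.

    Returns {pid: exit_code} for children that have exited and removes
    them from `pids`. Children killed by a signal get a negative code.
    """
    reaped = {}
    for pid in list(pids):
        try:
            done, status = os.waitpid(pid, os.WNOHANG)
        except ChildProcessError:
            # Not our child any more (already reaped elsewhere).
            pids.discard(pid)
            continue
        if done == pid:
            if os.WIFEXITED(status):
                reaped[pid] = os.WEXITSTATUS(status)
            else:
                reaped[pid] = -os.WTERMSIG(status)
            pids.discard(pid)
    return reaped


def reaper_loop(sigchld_event, pids, timeout=0.2, on_exit=print):
    """Hypothetical reaper: wake on the event OR after `timeout` seconds.

    If a SIGCHLD notification is lost, the event is never set, but the
    timeout still lets the loop notice the exited child on the next pass.
    """
    while pids:
        sigchld_event.wait(timeout)  # returns False on timeout; poll anyway
        sigchld_event.clear()
        for pid, code in reap_children(pids).items():
            on_exit(pid, code)
```

The point of the sketch is only that a missed notification costs at most one
timeout interval instead of leaving a zombie and a waiter blocked forever.
Whether lost SIGCHLD signals are really the cause here is still an open
question.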
