On Wed, Jun 03, 2015 at 05:03:18PM +0200, Jose San Leandro wrote:
> Unfortunately it doesn't fix it, or at least I get the same symptoms.

Thanks for trying it. Sorry it did not help :-/

Dave



> 
> Sending SIGUSR1 prints this:
> 
> SIGUSR1 Wed Jun  3 16:53:50 2015
> 
> 
> /home/chous/toolbox/pharo-4.0/pharo-vm/pharo
> pharo VM version: 3.9-7 #1 Thu Apr  2 00:51:45 CEST 2015 gcc 4.6.3
> [Production ITHB VM]
> Built from: NBCoInterpreter NativeBoost-CogPlugin-EstebanLorenzano.21 uuid:
> 4d9b9bdf-2dfa-4c0b-99eb-5b110dadc697 Apr  2 2015
> With: NBCogit NativeBoost-CogPlugin-EstebanLorenzano.21 uuid:
> 4d9b9bdf-2dfa-4c0b-99eb-5b110dadc697 Apr  2 2015
> Revision: https://github.com/pharo-project/pharo-vm.git Commit:
> 32d18ba0f2db9bee7f3bdbf16bdb24fe4801cfc5 Date: 2015-03-24 11:08:14 +0100
> By: Esteban Lorenzano <esteba...@gmail.com> Jenkins build #14904
> Build host: Linux pharo-linux 3.2.0-31-generic-pae #50-Ubuntu SMP Fri Sep 7
> 16:39:45 UTC 2012 i686 i686 i386 GNU/Linux
> plugin path: /home/chous/toolbox/pharo-4.0/pharo-vm/ [default:
> /home/chous/toolbox/pharo-4.0/pharo-vm/]
> 
> 
> C stack backtrace & registers:
>         eax 0xff981e94 ebx 0xff981db0 ecx 0xff981e48 edx 0xff981dfc
>         edi 0xff981c80 esi 0xff981c80 ebp 0xff981d18 esp 0xff981d64
>         eip 0xff981f78
> *[0xff981f78]
> /home/chous/toolbox/pharo/pharo-vm/pharo[0x80a33a2]
> /home/chous/toolbox/pharo/pharo-vm/pharo[0x80a3649]
> linux-gate.so.1(__kernel_rt_sigreturn+0x0)[0xf773acc0]
> /home/chous/toolbox/pharo/pharo-vm/pharo(signalSemaphoreWithIndex+0x28)[0x809d8c8]
> /home/chous/toolbox/pharo/pharo-vm/pharo[0x810868c]
> linux-gate.so.1(__kernel_sigreturn+0x0)[0xf773acb0]
> /home/chous/toolbox/pharo/pharo-vm/pharo(signalSemaphoreWithIndex+0x5e)[0x809d8fe]
> /home/chous/toolbox/pharo/pharo-vm/pharo(aioPoll+0x22f)[0x809f0af]
> /home/chous/toolbox/pharo-4.0/pharo-vm/vm-display-X11(+0xe671)[0xf772a671]
> /home/chous/toolbox/pharo/pharo-vm/pharo(ioRelinquishProcessorForMicroseconds+0x17)[0x80a1887]
> /home/chous/toolbox/pharo/pharo-vm/pharo[0x80767fa]
> [0xb4a2fe0c]
> [0xb4a2d700]
> [0xb53b9382]
> [0xb4a2d648]
> [0x5b]
> 
> 
> All Smalltalk process stacks (active first):
> Process 0xb6d930c4 priority 10
> 0xff9ad450 M ProcessorScheduler class>idleProcess 0xb4d935c0: a(n)
> ProcessorScheduler class
> 0xff9ad470 I [] in ProcessorScheduler class>startUp 0xb4d935c0: a(n)
> ProcessorScheduler class
> 0xff9ad490 I [] in BlockClosure>newProcess 0xb6d92fe8: a(n) BlockClosure
> 
> suspended processes
> Process 0xb68e1984 priority 50
> 0xff9a6490 M WeakArray class>finalizationProcess 0xb4d93790: a(n) WeakArray
> class
> 0xb69beb68 s [] in WeakArray class>restartFinalizationProcess
> 0xb68e1924 s [] in BlockClosure>newProcess
> 
> Process 0xb5ced038 priority 80
> 0xff9af490 M DelayMicrosecondScheduler>runTimerEventLoop 0xb5bb6f9c: a(n)
> DelayMicrosecondScheduler
> 0xb6098314 s [] in DelayMicrosecondScheduler>startTimerEventLoop
> 0xb5cecfd8 s [] in BlockClosure>newProcess
> 
> Process 0xb68ec880 priority 40
> 0xff9b2478 M [] in UnixOSProcessAccessor>(nil) 0xb60dc6d0: a(n)
> UnixOSProcessAccessor
> 0xff9b2490 M BlockClosure>repeat 0xb68ef2d4: a(n) BlockClosure
> 0xb68ef278 s [] in UnixOSProcessAccessor>(nil)
> 0xb68ec820 s [] in BlockClosure>newProcess
> 
> Process 0xb6d92d78 priority 60
> 0xff98742c M InputEventFetcher>waitForInput 0xb5a09718: a(n)
> InputEventFetcher
> 0xff987450 M InputEventFetcher>eventLoop 0xb5a09718: a(n) InputEventFetcher
> 0xff987470 I [] in InputEventFetcher>installEventLoop 0xb5a09718: a(n)
> InputEventFetcher
> 0xff987490 I [] in BlockClosure>newProcess 0xb6d92c9c: a(n) BlockClosure
> 
> Process 0xb6f25f94 priority 60
> 0xb6f25fcc s SmalltalkImage>lowSpaceWatcher
> 0xb71523e4 s [] in SmalltalkImage>installLowSpaceWatcher
> 0xb6f25f34 s [] in BlockClosure>newProcess
> 
> Process 0xb73a4e7c priority 30
> 0xff99b470 M [] in AioEventHandler>handleExceptions:readEvents:writeEvents:
> 0xb73a49e4: a(n) AioEventHandler
> 0xff99b490 I [] in BlockClosure>newProcess 0xb73a4d90: a(n) BlockClosure
> 
> Process 0xb6686c88 priority 40
> 0xffa073d0 M [] in Delay>wait 0xb73a63fc: a(n) Delay
> 0xffa073f0 M BlockClosure>ifCurtailed: 0xb73a6614: a(n) BlockClosure
> 0xffa0740c M Delay>wait 0xb73a63fc: a(n) Delay
> 0xffa07428 M PipeableOSProcess(PipeJunction)>outputOn: 0xb73a0d34: a(n)
> PipeableOSProcess
> 0xffa07444 M PipeableOSProcess(PipeJunction)>output 0xb73a0d34: a(n)
> PipeableOSProcess
> 0xffa0746c M [] in MCFileTreeGitRepository class>runOSProcessGitCommand:in:
> 0xb611fa88: a(n) MCFileTreeGitRepository class
> 0xffa0748c M BlockClosure>ensure: 0xb739d9dc: a(n) BlockClosure
> 0xff9e538c M MCFileTreeGitRepository class>runOSProcessGitCommand:in:
> 0xb611fa88: a(n) MCFileTreeGitRepository class
> 0xff9e53ac M MCFileTreeGitRepository class>runGitCommand:in: 0xb611fa88:
> a(n) MCFileTreeGitRepository class
> 0xff9e53cc M MCFileTreeGitRepository>gitCommand:in: 0xb612926c: a(n)
> MCFileTreeGitRepository
> 0xff9e53f4 M MCFileTreeGitRepository>gitVersionsForPackage: 0xb612926c:
> a(n) MCFileTreeGitRepository
> 0xff9e543c M [] in MCFileTreeGitRepository>loadAllFileNames 0xb612926c:
> a(n) MCFileTreeGitRepository
> 0xff9e5458 M FileSystemDirectoryEntry(Object)>in: 0xb71b3fe8: a(n)
> FileSystemDirectoryEntry
> 0xff9e548c M [] in MCFileTreeGitRepository>loadAllFileNames 0xb612926c:
> a(n) MCFileTreeGitRepository
> 0xffa04310 M BlockClosure>cull: 0xb71b4894: a(n) BlockClosure
> 0xffa04338 I [] in Job>run 0xb71b48b4: a(n) Job
> 0xffa04350 M BlockClosure>on:do: 0xb71b56b8: a(n) BlockClosure
> 0xffa0437c I [] in Job>run 0xb71b48b4: a(n) Job
> 0xffa0439c M BlockClosure>ensure: 0xb71b4980: a(n) BlockClosure
> 0xffa043c4 I Job>run 0xb71b48b4: a(n) Job
> 0xffa043e4 I MorphicUIManager(UIManager)>displayProgress:from:to:during:
> 0xb50a8790: a(n) MorphicUIManager
> 0xffa04414 I ByteString(String)>displayProgressFrom:to:during: 0xb61238d8:
> a(n) ByteString
> 0xffa04444 M MCFileTreeGitRepository>loadAllFileNames 0xb612926c: a(n)
> MCFileTreeGitRepository
> 0xffa04464 I MCFileTreeGitRepository>allFileNames 0xb612926c: a(n)
> MCFileTreeGitRepository
> 0xffa0448c M MCFileTreeGitRepository>goferVersionFrom: 0xb612926c: a(n)
> MCFileTreeGitRepository
> 0xff9e238c I
> MetacelloCachingGoferResolvedReference(GoferResolvedReference)>version
> 0xb71b3134: a(n) MetacelloCachingGoferResolvedReference
> 0xff9e23a4 M MetacelloCachingGoferResolvedReference>version 0xb71b3134:
> a(n) MetacelloCachingGoferResolvedReference
> 0xff9e23bc M [] in
> MetacelloFetchingMCSpecLoader>resolveDependencies:nearest:into: 0xb706d83c:
> a(n) MetacelloFetchingMCSpecLoader
> 0xff9e23e0 M OrderedCollection>do: 0xb71b3234: a(n) OrderedCollection
> 0xff9e240c M [] in
> MetacelloFetchingMCSpecLoader>resolveDependencies:nearest:into: 0xb706d83c:
> a(n) MetacelloFetchingMCSpecLoader
> 0xff9e2424 M BlockClosure>on:do: 0xb71b3334: a(n) BlockClosure
> 0xff9e244c M
> MetacelloFetchingMCSpecLoader>resolveDependencies:nearest:into: 0xb706d83c:
> a(n) MetacelloFetchingMCSpecLoader
> 0xff9e2490 M [] in
> MetacelloFetchingMCSpecLoader>linearLoadPackageSpec:gofer: 0xb706d83c: a(n)
> MetacelloFetchingMCSpecLoader
> 0xff9ae318 M MetacelloPharo30Platform(MetacelloPlatform)>do:displaying:
> 0xb50e8b94: a(n) MetacelloPharo30Platform
> 0xff9ae338 M MetacelloFetchingMCSpecLoader>linearLoadPackageSpec:gofer:
> 0xb706d83c: a(n) MetacelloFetchingMCSpecLoader
> 0xff9ae358 M MetacelloPackageSpec>loadUsing:gofer: 0xb706be54: a(n)
> MetacelloPackageSpec
> 0xff9ae37c M [] in
> MetacelloFetchingMCSpecLoader(MetacelloCommonMCSpecLoader)>linearLoadPackageSpecs:repositories:
> 0xb706d83c: a(n) MetacelloFetchingMCSpecLoader
> 0xff9ae3a0 M OrderedCollection>do: 0xb70c807c: a(n) OrderedCollection
> 0xff9ae3c0 M
> MetacelloFetchingMCSpecLoader(MetacelloCommonMCSpecLoader)>linearLoadPackageSpecs:repositories:
> 0xb706d83c: a(n) MetacelloFetchingMCSpecLoader
> 0xff9ae3f0 I [] in
> MetacelloFetchingMCSpecLoader>linearLoadPackageSpecs:repositories:
> 0xb706d83c: a(n) MetacelloFetchingMCSpecLoader
> 0xff9ae410 M BlockClosure>ensure: 0xb70c813c: a(n) BlockClosure
> 0xff9ae438 I MetacelloLoaderPolicy>pushLoadDirective:during: 0xb706cb7c:
> a(n) MetacelloLoaderPolicy
> 0xff9ae460 I MetacelloLoaderPolicy>pushLinearLoadDirectivesDuring:for:
> 0xb706cb7c: a(n) MetacelloLoaderPolicy
> 0xff9ae488 I
> MetacelloFetchingMCSpecLoader>linearLoadPackageSpecs:repositories:
> 0xb706d83c: a(n) MetacelloFetchingMCSpecLoader
> 0xb70c33c0 s MetacelloFetchingMCSpecLoader(MetacelloCommonMCSpecLoader)>load
> 0xb706d898 s MetacelloMCVersionSpecLoader>load
> 0xb71948d0 s MetacelloMCVersion>executeLoadFromArray:
> 0xb719492c s [] in MetacelloMCVersion>fetchRequiredFromArray:
> 0xb7194988 s [] in
> MetacelloPharo30Platform(MetacelloPlatform)>useStackCacheDuring:defaultDictionary:
> 0xb706d96c s BlockClosure>on:do:
> 0xb706d2d8 s
> MetacelloPharo30Platform(MetacelloPlatform)>useStackCacheDuring:defaultDictionary:
> 0xb706d258 s [] in MetacelloMCVersion>fetchRequiredFromArray:
> 0xb71949e4 s BlockClosure>ensure:
> 0xb706d15c s [] in MetacelloMCVersion>fetchRequiredFromArray:
> 0xb706d1e4 s MetacelloPharo30Platform(MetacelloPlatform)>do:displaying:
> 0xb706d0e4 s MetacelloMCVersion>fetchRequiredFromArray:
> 0xb706ccc0 s [] in MetacelloMCVersion>doLoadRequiredFromArray:
> 0xb715327c s BlockClosure>ensure:
> 0xb706cc34 s MetacelloMCVersion>doLoadRequiredFromArray:
> 0xb71532d8 s MetacelloMCVersion>load
> 0xb7153334 s UndefinedObject>(nil)
> 0xb7153390 s OpalCompiler>evaluate
> 0xb706ab30 s RubSmalltalkEditor>evaluate:andDo:
> 0xb706a7f4 s RubSmalltalkEditor>highlightEvaluateAndDo:
> 0xb7152edc s [] in
> GLMMorphicPharoPlaygroundRenderer(GLMMorphicPharoCodeRenderer)>actOnHighlightAndEvaluate:
> 0xb7152f38 s RubEditingArea(RubAbstractTextArea)>handleEdit:
> 0xb706a784 s [] in
> GLMMorphicPharoPlaygroundRenderer(GLMMorphicPharoCodeRenderer)>actOnHighlightAndEvaluate:
> 0xb7152f94 s WorldState>runStepMethodsIn:
> 0xb7152ff0 s WorldMorph>runStepMethods
> 0xb706a1cc s WorldState>doOneCycleNowFor:
> 0xb715304c s WorldState>doOneCycleFor:
> 0xb71530a8 s WorldMorph>doOneCycle
> 0xb6686f8c s [] in MorphicUIManager>spawnNewProcess
> 0xb6686c28 s [] in BlockClosure>newProcess
> 
> Most recent primitives
> primCreatePipe
> new:
> at:put:
> at:put:
> basicNew
> basicNew:
> basicNew
> basicNew:
> primSQFileSetBlocking:
> basicNew:
> basicAt:put:
> basicNew:
> basicAt:put:
> at:put:
> basicNew
> primSigPipeNumber
> basicNew
> wait
> at:put:
> signal
> primForwardSignal:toSemaphore:
> wait
> at:put:
> signal
> primCreatePipe
> new:
> at:put:
> at:put:
> basicNew
> basicNew:
> basicNew
> basicNew:
> primSQFileSetNonBlocking:
> basicNew:
> basicAt:put:
> basicNew:
> basicAt:put:
> at:put:
> basicNew
> signal
> basicNew:
> basicAt:put:
> basicNew:
> basicAt:put:
> at:put:
> new:
> basicNew
> new:
> replaceFrom:to:with:startingAt:
> basicNew
> basicNew:
> primSQFileSetNonBlocking:
> basicNew
> stringHash:initialHash:
> primOSFileHandle:
> basicNew
> wait
> at:put:
> signal
> primAioEnable:forSemaphore:externalObject:
> basicNew
> objectAt:
> basicNew:
> stackp:
> basicNew
> primitiveResume
> wait
> wait
> signal
> wait
> signal
> primAioHandle:exceptionEvents:readEvents:writeEvents:
> signal
> basicNew:
> basicAt:put:
> primSQFileSetNonBlocking:
> basicNew:
> basicAt:put:
> basicNew:
> basicAt:put:
> at:put:
> basicNew
> basicNew
> wait
> signal
> primUTCMicrosecondsClock
> +
> >=
> +
> <
> primSignal:atUTCMicroseconds:
> wait
> signal
> wait
> wait
> relinquishProcessorForMicroseconds:
> relinquishProcessorForMicroseconds:
> primUTCMicrosecondsClock
> >=
> signal
> +
> primSignal:atUTCMicroseconds:
> wait
> basicNew
> basicNew
> basicNew
> basicNew
> signal
> basicNew
> signal
> basicNew
> new:
> wait
> new:
> at:put:
> at:put:
> at:put:
> basicNew:
> at:put:
> basicNew:
> replaceFrom:to:with:startingAt:
> replaceFrom:to:with:startingAt:
> basicNew
> new:
> at:put:
> new:
> basicNew:
> replaceFrom:to:with:startingAt:
> replaceFrom:to:with:startingAt:
> at:put:
> basicNew:
> replaceFrom:to:with:startingAt:
> replaceFrom:to:with:startingAt:
> at:put:
> at:put:
> at:put:
> new:
> replaceFrom:to:with:startingAt:
> primSizeOfPointer
> new:
> at:put:
> at:put:
> at:put:
> primSizeOfPointer
> basicNew:
> basicNew
> at:put:
> at:put:
> at:put:
> at:put:
> at:put:
> at:put:
> at:put:
> at:put:
> at:put:
> at:put:
> at:put:
> at:put:
> at:put:
> at:put:
> at:put:
> at:put:
> replaceFrom:to:with:startingAt:
> replaceFrom:to:with:startingAt:
> replaceFrom:to:with:startingAt:
> new:
> basicNew
> new:
> at:put:
> at:put:
> at:put:
> at:put:
> at:put:
> at:put:
> new:
> replaceFrom:to:with:startingAt:
> new:
> at:put:
> at:put:
> primGetCurrentWorkingDirectory
> basicNew:
> replaceFrom:to:with:startingAt:
> replaceFrom:to:with:startingAt:
> primForkExec:stdIn:stdOut:stdErr:argBuf:argOffsets:envBuf:envOffsets:workingDir:
> primGetPid
> primGetPid
> primGetPid
> basicNew
> basicNew
> wait
> at:put:
> signal
> wait
> shallowCopy
> new:
> replaceFrom:to:with:startingAt:
> signal
> wait
> replaceFrom:to:with:startingAt:
> at:put:
> signal
> primCloseNoError:
> primCloseNoError:
> primCloseNoError:
> signal
> basicNew:
> basicNew
> basicNew
> basicNew
> wait
> signal
> primUTCMicrosecondsClock
> +
> >=
> +
> <
> primSignal:atUTCMicroseconds:
> wait
> signal
> wait
> relinquishProcessorForMicroseconds:
> relinquishProcessorForMicroseconds:
> relinquishProcessorForMicroseconds:
> relinquishProcessorForMicroseconds:
> relinquishProcessorForMicroseconds:
> relinquishProcessorForMicroseconds:
> relinquishProcessorForMicroseconds:
> basicNew:
> primRead:into:startingAt:count:
> basicNew
> signal
> wait
> basicNew:
> basicNew
> basicNew:
> replaceFrom:to:with:startingAt:
> replaceFrom:to:with:startingAt:
> signal
> basicNew
> signal
> basicNew
> new:
> wait
> signal
> wait
> signal
> primAioHandle:exceptionEvents:readEvents:writeEvents:
> signal
> wait
> relinquishProcessorForMicroseconds:
> relinquishProcessorForMicroseconds:
> relinquishProcessorForMicroseconds:
> relinquishProcessorForMicroseconds:
> relinquishProcessorForMicroseconds:
> relinquishProcessorForMicroseconds:
> relinquishProcessorForMicroseconds:
> 
> stack page bytes 4096 available headroom 3300 minimum unused headroom 2152
> 
>         (SIGUSR1)
> 
> 
> 2015-06-03 14:15 GMT+02:00 David T. Lewis <le...@mail.msen.com>:
> 
> > On Wed, Jun 03, 2015 at 07:05:15AM +0200, Thierry Goubier wrote:
> > > Hi Dave,
> > >
> > > On 03/06/2015 03:15, David T. Lewis wrote:
> > > >Hi Thierry and Jose,
> > > >
> > > >I am reading this thread with interest and will help if I can.
> > > >
> > > >I do have one idea that we have not tried before. I have a theory that
> > > >this may be an intermittent problem caused by SIGCHLD signals (from the
> > > >external OS process when it exits) being missed by the
> > > >UnixOSProcessAccessor>>grimReaperProcess that handles them.
> > > >
> > > >If this is happening, then I may be able to change grimReaperProcess to
> > > >work around the problem.
> > > >
> > > >When you see the OS deadlock condition, are you able to tell if your
> > > >Pharo VM process has subprocesses in the zombie state (indicating that
> > > >grimReaperProcess did not clean them up)? The unix command
> > > >"ps -axf | less" will let you look at the process tree and that may
> > > >give us a clue if this is happening.
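
For anyone who would rather script the zombie check than read the process
tree by eye, here is a rough sketch in Python. It is only an illustration:
the function names are hypothetical, and it assumes a Linux procps-style
"ps" that accepts "-eo pid,ppid,stat" and reports zombies with a STAT value
starting with "Z".

```python
import subprocess


def zombie_children(ps_output, parent_pid):
    """Return PIDs of zombie processes whose parent is parent_pid.

    ps_output is expected to be the text printed by `ps -eo pid,ppid,stat`
    (one header row, then one row per process).
    """
    zombies = []
    for line in ps_output.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 3:
            continue  # ignore blank or malformed rows
        pid, ppid, stat = fields[0], fields[1], fields[2]
        # A STAT value beginning with "Z" marks a defunct (zombie) process.
        if ppid == str(parent_pid) and stat.startswith("Z"):
            zombies.append(int(pid))
    return zombies


def list_zombie_children(parent_pid):
    """Run ps and report the zombie children of parent_pid (hypothetical helper)."""
    result = subprocess.run(
        ["ps", "-eo", "pid,ppid,stat"],
        capture_output=True,
        text=True,
        check=True,
    )
    return zombie_children(result.stdout, parent_pid)
```

Calling list_zombie_children with the PID of the Pharo VM process while the
image is locked up should return a non-empty list if the child was never
reaped.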
> > >
> > > I found it very easy to reproduce and I do have a zombie child
> > > process under the pharo process.
> >
> > Jose confirms this also (thanks).
> >
> > Can you try filing in the attached UnixOSProcessAccessor>>grimReaperProcess
> > and see if it helps? I do not know if it will make a difference, but the
> > idea is to put a timeout on the semaphore that is waiting for signals from
> > SIGCHLD. I am hoping that if these signals are sometimes being missed, then
> > the timeout will allow the process to recover from the problem.
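
To make the timeout idea concrete, here is a minimal sketch in Python. It is
not the attached Smalltalk method, and the names (reap_children,
grim_reaper_step) are hypothetical. It assumes that some signal handler
releases the semaphore on SIGCHLD, and that it is acceptable to poll for
exited children with waitpid and WNOHANG whenever the wait times out, so a
missed signal only delays the cleanup instead of blocking it forever.

```python
import os
import threading


def reap_children():
    """Collect every child that has already exited, without blocking."""
    reaped = []
    while True:
        try:
            pid, _status = os.waitpid(-1, os.WNOHANG)
        except ChildProcessError:
            break  # this process has no children at all
        if pid == 0:
            break  # children exist, but none has exited yet
        reaped.append(pid)
    return reaped


def grim_reaper_step(sigchld_semaphore, timeout_seconds=1.0):
    """Wait for a SIGCHLD notification, giving up after timeout_seconds.

    Returns (signalled, reaped_pids). The children are polled whether or not
    the semaphore was signalled, which is the whole point of the timeout.
    """
    signalled = sigchld_semaphore.acquire(timeout=timeout_seconds)
    return signalled, reap_children()
```

A reaper process would simply call grim_reaper_step in a loop. The timeout
value is a trade-off: a shorter one recovers faster from a missed signal but
wakes the process more often.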
> >
> >
> > >
> > > Interestingly enough, the lock-up happens in a very specific place, a
> > > call to git branch, which is a very short command returning just a few
> > > characters (where all other commands have longer outputs). Reducing the
> > > frequency of the calls to git branch by a bit of caching reduces the
> > > chances of a lock-up.
> > >
> >
> > This is a good clue, and it may indicate a different kind of problem (so
> > maybe I am looking in the wrong place). Ben's suggestion of adding a delay
> > to the external process sounds like a good idea to help troubleshoot it.
> >
> > Dave
> >
> >
> >
