Re: [OMPI users] program stalls in __write_nocancel()

2008-11-06 Thread Ralph Castain
Hi Peter, Given how long it takes to hit the problem, have you checked your file and disk quotas? It could be that the file is simply getting too big. I'm also a tad curious how you got valgrind to work on OSX; I was unaware it supported that environment. If all that looks okay, then the nex

[OMPI users] program stalls in __write_nocancel()

2008-11-05 Thread Peter Beerli
On some of my larger problems (50 or more nodes, 'long' runs >5 hours), my program stalls and does not continue. My program is set up as a master-worker, and it seems that the master gets stuck in a write to stdout; see the gdb backtrace below (it took all day to get there on 50 nodes). the functi
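
For readers unfamiliar with how a plain write to stdout can stall at all: when stdout is a pipe (an assumption here; the thread does not establish what the master's stdout is connected to, nor the actual cause of the stall), the kernel buffer behind it is finite, and a blocking write() waits once that buffer is full and nothing drains it. The following is a minimal Python sketch of that general behaviour, not a diagnosis of the reported problem. The function name bytes_until_pipe_full is hypothetical, and the descriptor is switched to non-blocking mode only so that the demonstration returns instead of hanging.

```python
import os


def bytes_until_pipe_full(chunk_size=4096):
    """Write into a pipe that nobody reads; return how many bytes fit.

    Hypothetical helper for illustration only. Unix-only
    (os.set_blocking requires Python 3.5+).
    """
    read_fd, write_fd = os.pipe()
    # Non-blocking mode: a full pipe raises BlockingIOError instead of hanging.
    os.set_blocking(write_fd, False)
    chunk = b"x" * chunk_size
    total = 0
    try:
        while True:
            try:
                total += os.write(write_fd, chunk)
            except BlockingIOError:
                # With a blocking descriptor, the process would wait at this
                # point, inside write(), until a reader drained the pipe.
                return total
    finally:
        os.close(read_fd)
        os.close(write_fd)


if __name__ == "__main__":
    print("pipe filled after", bytes_until_pipe_full(), "bytes")
```

The exact capacity depends on the operating system, so the number printed will differ between machines.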