[O-MPI users] Trouble combining OpenMPI and OpenMP

2006-01-13 Thread Glenn Morris
I'm having trouble with an application (CosmoMC) that can use both OpenMPI and OpenMP. I have several Opteron boxes, each with 2 dual-core CPUs. I want to run the application with 4 MPI processes (one per box), each of which in turn splits into 4 OpenMP threads.
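A launch along these lines might look roughly as follows. This is my own sketch, not a command from the original post: the binary name, parameter file, hostfile, and the use of mpirun's -x option to export OMP_NUM_THREADS to the remote nodes are all assumptions.

    # hypothetical sketch: 4 MPI processes (one per box), 4 OpenMP threads each
    # ./cosmomc, params.ini and ./hostfile are placeholders
    mpirun -np 4 --hostfile ./hostfile --mca pls_rsh_agent ssh \
        -x OMP_NUM_THREADS=4 ./cosmomc params.ini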

Re: [O-MPI users] Trouble combining OpenMPI and OpenMP

2006-01-16 Thread Glenn Morris
Brian Barrett wrote: [debugging advice] Thanks, I will look into this some more and try to provide a proper report (if it is not a program bug), as I should have done in the first place. I think we may have TotalView around somewhere...

Re: [O-MPI users] Trouble combining OpenMPI and OpenMP

2006-01-18 Thread Glenn Morris
Don't know if this will be of help, but on further investigation the problem seems to be some code that essentially does the following:

    !$OMP PARALLEL DO
    do i=1,n
       do j=1,m
          call sub(arg1,...)
       end do
    end do
    !$OMP END PARALLEL DO

where subroutine sub allocates a temporary array: subrout

[O-MPI users] mpirun tcsh LD_LIBRARY_PATH problem

2006-01-19 Thread Glenn Morris
Using openmpi-1.0.1. Attempting to launch programs via 'mpirun --mca pls_rsh_agent ssh' fails if the user's login shell is tcsh and LD_LIBRARY_PATH is unset at startup. The line

    if ($?FOO) setenv BAR $FOO

is an error in tcsh if $FOO is unset, because tcsh expands the whole line at once. Instead one has to
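The preview cuts off here; judging from the later discussion of embedding newlines in the ssh command, the workaround is the multi-line if/endif form, which tcsh expands line by line. A sketch with the same generic variable names (not the actual patch):

    # the single-line form fails when FOO is unset, because tcsh
    # substitutes the whole line before evaluating the condition:
    #   if ($?FOO) setenv BAR $FOO
    # the block form only substitutes lines it actually executes
    if ($?FOO) then
        setenv BAR $FOO
    endif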

Re: [O-MPI users] Trouble combining OpenMPI and OpenMP

2006-01-25 Thread Glenn Morris
I tried nightly snapshot 1.1a1r8803 and it said the following. I'm willing to try to debug this further, but would need some guidance. I have access to TotalView.

    Signal:11 info.si_errno:0(Success) si_code:2(SEGV_ACCERR)
    Failing at addr:0x97421004
    [0] func:/afs/slac.stanford.edu/g/ki/users/gmo

Re: [O-MPI users] Trouble combining OpenMPI and OpenMP

2006-01-26 Thread Glenn Morris
Thanks for your suggestions. Jeff Squyres wrote: > From the stack trace, it looks like you're in the middle of a > complex deallocation of some C++ objects, so I really can't tell > (i.e., not in an MPI function at all). Well, not intentionally! I'm just calling "deallocate" in a purely Fortra

Re: [O-MPI users] mpirun tcsh LD_LIBRARY_PATH problem

2006-01-30 Thread Glenn Morris
Jeff Squyres wrote: > I'll commit this to the trunk and v1.0 branch shortly; it'll be > included in v1.0.2. Thanks.

Re: [O-MPI users] Trouble combining OpenMPI and OpenMP

2006-01-30 Thread Glenn Morris
Thanks for persevering with this. I'm far from sure that the information I am providing is of much use, largely because I'm pretty confused about what's going on. Anyway... Brian Barrett wrote: > Can you rebuild Open MPI with debugging symbols (just setting CFLAGS > to -g during configure shoul
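For anyone reproducing this, the rebuild Brian suggests would look roughly like the following (the install prefix is a placeholder, and any other configure options are omitted here):

    # reconfigure and rebuild Open MPI with debugging symbols
    ./configure CFLAGS=-g --prefix=/path/to/openmpi-debug
    make all install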

Re: [O-MPI users] mpirun tcsh LD_LIBRARY_PATH problem

2006-01-30 Thread Glenn Morris
Jeff Squyres wrote: > After sending this reply, I thought about this issue a bit more -- > do you have any idea how portable the embedding of \n's in an ssh > command is? I.e., will this work everywhere? :) I almost commented the last time "I don't know how portable this is". I would imagine it's

Re: [O-MPI users] mpirun tcsh LD_LIBRARY_PATH problem

2006-01-31 Thread Glenn Morris
Jeff Squyres wrote: > After sending this reply, I thought about this issue a bit more -- > do you have any idea how portable the embedding of \n's in an ssh > command is? I.e., will this work everywhere? On further reflection, if worried about portability, you could just reverse the order of the

[O-MPI users] tcsh 'Unmatched ".' error on localhost

2006-02-01 Thread Glenn Morris
Using v1.0.1, with tcsh as the user's login shell, trying to mpirun a job on the localhost that involves tcsh produces an error from tcsh. E.g., with a hostfile containing "localhost",

    mpirun -np 1 --hostfile ./hostfile \
        --mca pls_rsh_agent ssh ... /bin/tcsh -c hostname

results in the error `Unmatched ".' from tcsh

Re: [O-MPI users] mpirun tcsh LD_LIBRARY_PATH problem

2006-02-02 Thread Glenn Morris
Jeff Squyres wrote: > Excellent point. Hardly elegant, but definitely no portability > issues there -- so I like it better. Last word on this trivial issue, I promise: if you don't want two copies added to LD_LIBRARY_PATH, you could use a temporary variable, e.g.: tcsh -c 'if ( "$?LD_LIBRARY_PATH" == 1
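The snippet is cut off in the archive. A sketch of what a temporary-variable approach could look like, so that the Open MPI libdir ends up in LD_LIBRARY_PATH exactly once (the path is a placeholder, and this is my reconstruction rather than the code from the message):

    # stash the old value (if any) in a temporary variable, then set
    # LD_LIBRARY_PATH once, so nothing is added twice
    set old_llp = ""
    if ($?LD_LIBRARY_PATH) then
        set old_llp = ":$LD_LIBRARY_PATH"
    endif
    setenv LD_LIBRARY_PATH "/path/to/openmpi/lib$old_llp"
    unset old_llp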

[O-MPI users] mpirun sets umask to 0

2006-02-06 Thread Glenn Morris
mpirun (v1.0.1) sets the umask to 0, and hence creates world-writable output files. Interestingly, adding the -d option to mpirun makes this problem go away. To reproduce:

    mpirun -np 1 --hostfile ./hostfile --mca pls_rsh_agent ssh ./a.out

where a.out is compiled from: #include #include #inclu
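A quick way to see the same effect without compiling a test program is to have mpirun launch a plain shell command (as in the earlier /bin/tcsh -c hostname example) and print its umask. This is my own check, not the reproduction from the original report:

    # with the bug present, the launched shell reports a umask of 0000
    mpirun -np 1 --hostfile ./hostfile --mca pls_rsh_agent ssh /bin/sh -c umask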