Interesting. So to recap, just so that I understand: was the problem that Torque was restricting the stack size of your application processes (which apparently was only a problem for 64-bit builds)?

On Aug 31, 2009, at 9:42 PM, Sims, James S. Dr. wrote:

After much more work on this problem, and after isolating it better, I finally found a Torque user who recognized the problem and supplied the solution. Thanks to everyone on this list who responded to my request for help. Here is my revised statement
of the problem and the solution:

On Fri, Aug 28, 2009 at 12:37 PM, Sims, James S. Dr. <james.s...@nist.gov> wrote:
> I have a working 32-bit MPI code which works with either LAM or Open MPI. However,
> I have not been able to run this code in 64-bit mode. In attempting to isolate the
> problem, I have replaced the MPI code with stubs so I can run it using
> mpirun -np 1 program on a single processor. The code works fine as long as I
> don't run it within torque, and dies with a segv early in the code if I run it
> within torque.

jim,

the pbs_mom daemon inherits the resource limits of the superuser at bootup.

we had similar problems and just put the following into /etc/rc.d/init.d/pbs_mom:

# max locked memory, soft and hard limits for all PBS children
ulimit -H -l unlimited
ulimit -S -l 4096000
# stack size, soft and hard limits for all PBS children
ulimit -H -s unlimited
ulimit -S -s 1024000

and had no more problems.

cheers,
    axel.
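
A quick way to check whether those limits actually reach job processes is to submit a trivial job that prints them (a sketch; the script name and the limit values in the init script above are site-specific examples):

```shell
#!/bin/sh
# checklimits.sh -- print the resource limits this process inherited.
# Submit with `qsub checklimits.sh` and compare the output against the
# values set in the pbs_mom init script; run it interactively too, to
# see the limits of a normal login shell for contrast.
echo "stack soft (kB):  $(ulimit -S -s)"
echo "stack hard:       $(ulimit -H -s)"
echo "locked-mem soft:  $(ulimit -S -l)"
echo "locked-mem hard:  $(ulimit -H -l)"
```

If the job output still shows a small soft stack limit, the ulimit lines are not being executed before pbs_mom starts.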


________________________________________
From: users-boun...@open-mpi.org [users-boun...@open-mpi.org] On Behalf Of Ralph Castain [r...@open-mpi.org]
Sent: Friday, July 24, 2009 7:27 AM
To: Open MPI Users
Subject: Re: [OMPI users] Open MPI: Problem with 64-bit openMPI and intel compiler

Good point.

Other thing you might consider (though it is unlikely to be part of
this problem) is upgrading to 1.3.3. It probably isn't a good idea to
be using a release candidate for anything serious.


On Jul 24, 2009, at 5:21 AM, Jeff Squyres wrote:

> On Jul 23, 2009, at 11:14 PM, Ralph Castain wrote:
>
>> 3. get a multi-node allocation and run "pbsdsh echo $LD_LIBRARY_PATH"
>> and see what libs you are defaulting to on the other nodes.
>>
>
>
> Be careful with this one; you want to ensure that your local shell
> doesn't expand $LD_LIBRARY_PATH and simply display the same value on
> all nodes. It might be easiest to write a 2 line script and run that:
>
> $ cat myscript
> #!/bin/sh
> echo LD_LIB_PATH on `hostname` is: $LD_LIBRARY_PATH
> $ chmod +x myscript
> $ pdsh myscript
>
> --
> Jeff Squyres
> jsquy...@cisco.com
>
> _______________________________________________
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
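
The expansion pitfall Jeff warns about comes down to quoting: with double quotes the local shell substitutes $LD_LIBRARY_PATH before pbsdsh ever runs the command, so every node appears to report the local value. A minimal sketch of the difference (plain sh, no Torque needed; the path is a made-up example):

```shell
#!/bin/sh
# With double quotes the variable is expanded here, in the local shell;
# with single quotes the literal string is passed through untouched, so
# a remote shell would be the one to expand it.
LD_LIBRARY_PATH=/opt/local/lib
export LD_LIBRARY_PATH
echo "double quotes: $LD_LIBRARY_PATH"   # expanded locally
echo 'single quotes: $LD_LIBRARY_PATH'   # passed through literally
```

This is why wrapping the echo in a script, as above, is the reliable approach: the expansion then happens on each remote node.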
