Re: [OMPI users] error with Vprotocol pessimist

2008-01-29 Thread Jeff Squyres
We are testing for a specific line when looking for this patch: test ! -z "`grep 'filename, LT_LAZY_OR_NOW' opal/libltdl/loaders/dlopen.c`"; then If this line is different in your dlopen.c, then it doesn't find it and therefore autogen.sh doesn't patch it. Did you already patc

Re: [OMPI users] process placement with toruqe and OpenMP

2008-01-29 Thread Terry Frankcombe
> Ok so I ask the mpirun masters how would you do the following: > > I submit a job with torque (we use --with-tm) like the following: > > nodes=4:ppn=2 > > My desired outcome is to have 1 mpi process per 2 cpus and use > threaded blas (or my own OpenMP take your pick) For the above, -np 4 --bynod

[OMPI users] process placement with toruqe and OpenMP

2008-01-29 Thread Brock Palen
Ok so I ask the mpirun masters how would you do the following: I submit a job with torque (we use --with-tm) like the following: nodes=4:ppn=2 My desired outcome is to have 1 mpi process per 2 cpus and use threaded blas (or my own OpenMP take your pick) Our cluster has some 4 core machines

Re: [OMPI users] OpenMP + OMPI

2008-01-29 Thread George Bosilca
The rank 0 received SIGTERM ... I wonder how this happens. Usually, we don't just send SIGTERM around without a good reason. One of these reasons might be that we detected a segfault, but then there is some output. Is that the complete output you get from your run ? Thanks, george.

Re: [OMPI users] Question about fault tolerance checkpointing

2008-01-29 Thread Leonardo Fialho
Josh, At this moment I'm working in the uncoordinated checkpoint, and probably I'll have some tools to collect data from the process and environment and probably from the application. About the application I'm considering the possibility to do something like this (MPI_Checkpoint??). Leonardo Fia

Re: [OMPI users] Question about fault tolerance checkpointing

2008-01-29 Thread Josh Hursey
Not at the moment. This would be a neat addition to Open MPI if application developers see a need for it. There are many issues surrounding this type of a feature (like any feature). Most of them surround what an application expects and requires from such an API. One such question is whethe

[OMPI users] Question about fault tolerance checkpointing

2008-01-29 Thread Wong, Wayne
Are there plans to provide an API that would allow a fault tolerant enabled program to invoke checkpointing directly? -Wayne

Re: [OMPI users] Trouble with fault tolerance checkpointing

2008-01-29 Thread Wong, Wayne
We have the checkpoint/restart working now. Turns out that the BLCR kernel mods were installed incorrectly. Thanks for the help. -Wayne -Original Message- From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On Behalf Of Josh Hursey Sent: Monday, January 28, 2008 6:57 PM

Re: [OMPI users] OpenMP + OMPI

2008-01-29 Thread Stephen Wornom
George Bosilca wrote: Both cases should work just fine. In fact as long as there is only one execution flow using MPI functions, the user will not face any problems. I compiled my mpi fortran code using the -mp option to verify that the mpi code would still run. I get this message when I run

Re: [OMPI users] error with Vprotocol pessimist

2008-01-29 Thread Thomas Ropars
I've solved the problem by adding the flag RTLD_GLOBAL in the call to dlopen() in function "sys_dl_open (loader_data, filename)" (opal/libltdl/ltdl.c) It seems that I need this flag. However when I run autogen.sh, I get the following: ** Adjusting libltdl for OMPI :-( ++ patching for argz bu

[OMPI users] mpirun, paths and xterm again

2008-01-29 Thread jody
Hi Sorry to bring this subject up again - but I have a problem getting xterms running for all of my processes (for debugging purposes). There are actually two problems involved: display, and paths. My ssh is set up so that X forwarding is allowed, and, indeed, ssh nano_00 xterm opens an xterm fr