So what you are looking for is checkpoint/restart support, which you
can find some details about at the link below:
http://osl.iu.edu/research/ft/ompi-cr/
Additionally, we relatively recently added the ability to checkpoint
and 'stop' the application. This generates a usable checkpoint of t
This would be a very welcome new feature for me as well. Two
thumbs up from me when it happens.
Best regards
Durga
On Tue, Apr 13, 2010 at 10:28 AM, Ralph Castain wrote:
> Not right now, but coming later this year...
>
> On Apr 13, 2010, at 7:21 AM, Jürgen Kaiser wrote:
>
>> Hi,
>>
>> Can I force
I believe that is called "checkpoint/restart" - see the FAQ page on that
subject.
On Apr 13, 2010, at 7:30 AM, Hoelzlwimmer Andreas - S0810595005 wrote:
> Hi,
>
> I found in the FAQ that it is possible to suspend/resume MPI jobs. Would it
> also be possible to Hibernate the jobs (free the memo
Not right now, but coming later this year...
On Apr 13, 2010, at 7:21 AM, Jürgen Kaiser wrote:
> Hi,
>
> Can I force MPI to not abort the whole job when a node crashes? I would
> like to let the remaining MPI-processes perform some action in that case
> and then proceed.
>
> Thanks,
> Jürgen
>
OK Jeff,
I understand. Thanks very much for your help!
Regards.
2010/4/13 Jeff Squyres
> On Apr 13, 2010, at 9:17 AM, Gabriele Fatigati wrote:
>
> > My current configuration is:
> >
> > btl = ^tcp
> > btl_tcp_if_exclude = eth0,ib0,ib1
> > oob_tcp_include = eth1,lo
> >
> > But is it right?
On Apr 13, 2010, at 9:17 AM, Gabriele Fatigati wrote:
> My current configuration is:
>
> btl = ^tcp
> btl_tcp_if_exclude = eth0,ib0,ib1
> oob_tcp_include = eth1,lo
>
> But is it right? I have some doubts...
It depends on what "right" is in your environment. :-)
Your default config excludes the B
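To make the effect of the quoted settings easier to see, here is a minimal Python sketch that reads `name = value` lines like the ones above and checks whether a given component is selected. It relies on the Open MPI convention that a leading `^` in a framework selection such as `btl` turns the list into an exclusion list. The helper names `parse_mca_params` and `component_enabled` are hypothetical and are not part of Open MPI; this only illustrates how the values are read, not how Open MPI parses them internally.

```python
def parse_mca_params(text):
    """Parse 'name = value' lines from an MCA parameter file into a dict."""
    params = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or "=" not in line:
            continue
        name, value = line.split("=", 1)
        params[name.strip()] = value.strip()
    return params


def component_enabled(selection, component):
    """Interpret a framework selection string such as '^tcp' or 'self,sm,tcp'.

    A leading '^' makes the list an exclusion list; otherwise the list is
    inclusive. An empty selection is treated here as "no restriction".
    """
    if not selection:
        return True
    if selection.startswith("^"):
        return component not in selection[1:].split(",")
    return component in selection.split(",")


# The configuration quoted in the message above.
config = """
btl = ^tcp
btl_tcp_if_exclude = eth0,ib0,ib1
oob_tcp_include = eth1,lo
"""
params = parse_mca_params(config)
print(component_enabled(params["btl"], "tcp"))     # False
print(component_enabled(params["btl"], "openib"))  # True
```

With `btl = ^tcp`, the TCP BTL itself is excluded, so the `btl_tcp_if_exclude` interface list would only matter if the TCP BTL were re-enabled.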
Hi,
I found in the FAQ that it is possible to suspend/resume MPI jobs. Would it
also be possible to Hibernate the jobs (free the memory, serialize it to the
hard drive) and continue/wake them up later, possibly at different locations?
cheers,
Andreas
Hi,
Can I force MPI to not abort the whole job when a node crashes? I would
like to let the remaining MPI-processes perform some action in that case
and then proceed.
Thanks,
Jürgen
Yes, it's right!
Now I can see the btl_tcp_if_include parameter:
MCA btl: parameter "btl_tcp_if_include" (current value: , data source:
default value)
MCA btl: parameter "btl_tcp_if_exclude" (current value: "eth0,ib0,ib1", data
source: file
[/cineca/prod/opt/compilers/openmpi/1.3.3/intel--11.1--binary/et
On Apr 13, 2010, at 9:03 AM, Gabriele Fatigati wrote:
> ompi_info --param btl tcp
Ah ha... this is revealing:
> MCA btl: parameter "btl" (current value: "^tcp", data
> source: file
>
> [/cineca/prod/opt/compilers/openmpi/1.3.3/intel--11.1--binary/etc/
Ok,
this is my output:
ompi_info --param btl tcp
MCA btl: parameter "btl_base_verbose" (current value: "0", data source:
default value)
Verbosity level of the BTL framework
MCA btl: parameter "btl" (current value: "^tcp", data
source: file
[/cineca/pro
Oops! I neglected to see that you built statically -- hence, all the OMPI
plugins got slurped up into their respective libraries (e.g., libmpi.a).
If you run ompi_info --param btl tcp, do you see anything at all? If not, that
would indicate that the TCP BTL wasn't built. If so, can you send
Hmm,
my OpenMPI installation doesn't have this library.
How can I install it? Is it very important? Or can I use OpenMPI
without this module?
2010/4/13 Jeff Squyres
> Check in your installation directory under $lib/openmpi -- see if
> mca_btl_tcp.* is there. There should be a .so file (and pro
Check in your installation directory under $lib/openmpi -- see if mca_btl_tcp.*
is there. There should be a .so file (and probably a .la file as well). If
the .so is not there, then the BTL TCP plugin is not installed (which would be
darn weird, to be honest...).
On Apr 13, 2010, at 8:23 AM,
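The check described above can be sketched in Python as follows. The function name `find_tcp_btl_plugin` and the example path are hypothetical; the only things taken from the message are the `openmpi` subdirectory of the library directory and the `mca_btl_tcp.*` file pattern. Note that this check only makes sense for builds that install plugins as separate files; as another message in this thread points out, a static build folds the plugins into the libraries themselves.

```python
import glob
import os


def find_tcp_btl_plugin(lib_dir):
    """Return the sorted list of mca_btl_tcp.* files under <lib_dir>/openmpi."""
    pattern = os.path.join(lib_dir, "openmpi", "mca_btl_tcp.*")
    return sorted(glob.glob(pattern))


if __name__ == "__main__":
    # Hypothetical install location; substitute your own library directory.
    matches = find_tcp_btl_plugin("/path/to/openmpi/lib")
    if matches:
        print("TCP BTL plugin files:", matches)
    else:
        print("No mca_btl_tcp.* files found")
```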
Hi Jeff,
thanks for your reply!
If I run your command, the output is empty.
Does this mean I haven't installed the TCP BTL plugin?
How can I check?
These are my build flags:
--disable-ipv6 --disable-dlopen --enable-static --with-openib
--with-memory-manager=none --with-mpi-f90-size=medium
--with
No, that param is still there:
$ ompi_info --param btl tcp --parsable | grep clude:
mca:btl:tcp:param:btl_tcp_if_include:value:
mca:btl:tcp:param:btl_tcp_if_include:data_source:default value
mca:btl:tcp:param:btl_tcp_if_include:status:writable
mca:btl:tcp:param:btl_tcp_if_include:help:Comma-delimi
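For scripting this kind of check, the colon-separated `--parsable` lines shown above can be collected into a dictionary. The sketch below is a minimal Python example; the field layout (`mca:<framework>:<component>:param:<name>:<field>:<value>`) is inferred only from the lines quoted in this message, and `parse_parsable` is a hypothetical helper name, not part of Open MPI.

```python
def parse_parsable(output):
    """Group 'mca:<fw>:<comp>:param:<name>:<field>:<value>' lines by parameter name."""
    params = {}
    for line in output.splitlines():
        # Split on the first six colons only, so the value may contain colons.
        parts = line.strip().split(":", 6)
        if len(parts) < 7 or parts[0] != "mca" or parts[3] != "param":
            continue
        name, field, value = parts[4], parts[5], parts[6]
        params.setdefault(name, {})[field] = value
    return params


# The complete lines from the output quoted above.
sample = """\
mca:btl:tcp:param:btl_tcp_if_include:value:
mca:btl:tcp:param:btl_tcp_if_include:data_source:default value
mca:btl:tcp:param:btl_tcp_if_include:status:writable
"""
info = parse_parsable(sample)
print(info["btl_tcp_if_include"]["data_source"])  # default value
```

An empty `value` field together with a `data_source` of `default value` indicates that the parameter exists but has not been set anywhere.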
Dear OpenMPI users and developers,
I'm trying OpenMPI 1.3.3 and I've noticed that btl_tcp_if_exclude is no
longer supported in the new version.
The output of this command:
ompi_info --param all all | grep btl_tcp_if_exclude
is empty.
Has that parameter been renamed?
Thanks in advance
--
Ing. Gabriel