[OMPI users] engineer position on hwloc+netloc

2014-10-30 Thread Brice Goglin
Hello,

There's an R&D engineer position opening in my research team at Inria
Bordeaux (France) for developing hwloc and netloc software (both Open
MPI subprojects).

All details available at

http://runtime.bordeaux.inria.fr/goglin/201410-Engineer-hwloc+netloc.en.pdf
or French version

http://runtime.bordeaux.inria.fr/goglin/201410-Engineer-hwloc+netloc.fr.pdf

Candidates are expected to have received a master's degree in computer science
less than 5 years ago. If you have any questions, please contact me.

Brice




Re: [OMPI users] Allgather in OpenMPI 1.4.3

2014-10-30 Thread Sebastian Rettenberger
Since I can't upgrade the system packages anyway (due to dependencies), 
I installed version 1.8.3. The bug is fixed in this version.


Thank you
Sebastian

On 29.10.2014 16:03, Jeff Squyres (jsquyres) wrote:

Can you at least upgrade to 1.4.5?  That's the last release in the 1.4.x series.

Note that you can always install Open MPI as a normal/non-root user (e.g., 
install it into your $HOME, or some such).


On Oct 28, 2014, at 12:08 PM, Sebastian Rettenberger  wrote:


Hi,

I know 1.4.3 is really old, but I am currently stuck with it. Unfortunately,
there seems to be a bug in Allgather.

I have attached the source of an example program.
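
Roughly, the attached test does the following (a sketch, not the exact file:
each rank contributes rank*1000 + {0,1,2} to an MPI_Allgather and then prints
the first three entries of the gathered buffer, which should always be rank
0's contribution):

#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char** argv)
{
    int rank, size, i;
    int send[3];
    int* recv;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* each rank contributes rank*1000 + {0,1,2} */
    for (i = 0; i < 3; i++)
        send[i] = rank * 1000 + i;

    recv = malloc(3 * size * sizeof(int));
    MPI_Allgather(send, 3, MPI_INT, recv, 3, MPI_INT, MPI_COMM_WORLD);

    /* the first three gathered entries must be rank 0's values: 0 1 2 */
    printf("%d %d %d %d\n", rank, recv[0], recv[1], recv[2]);

    free(recv);
    MPI_Finalize();
    return 0;
}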

The output I would expect is:

rettenbs@hpcsccs4:/tmp$ mpiexec -np 5 ./a.out
0 0 1 2
1 0 1 2
2 0 1 2
3 0 1 2
4 0 1 2


But I get different results when I run the program multiple times:

rettenbs@hpcsccs4:/tmp$ mpiexec -np 5 ./a.out
0 0 1 2
1 0 1 2
2 0 1 2
3 2000 2001 2002
4 0 1 2
rettenbs@hpcsccs4:/tmp$ mpiexec -np 5 ./a.out
0 0 1 2
1 0 1 2
2 0 1 2
3 2000 2001 2002
4 3000 3001 3002


This bug is probably already fixed. Does anybody know in which version?

Best regards,
Sebastian

--
Sebastian Rettenberger, M.Sc.
Technische Universität München
Department of Informatics
Chair of Scientific Computing
Boltzmannstrasse 3, 85748 Garching, Germany
http://www5.in.tum.de/





--
Sebastian Rettenberger, M.Sc.
Technische Universität München
Department of Informatics
Chair of Scientific Computing
Boltzmannstrasse 3, 85748 Garching, Germany
http://www5.in.tum.de/





Re: [OMPI users] knem in Open MPI 1.8.3

2014-10-30 Thread Nathan Hjelm
I want to close the loop on this issue. 1.8.5 will address it in several
ways:

 - knem support in btl/sm has been fixed. A sanity check was disabling
   knem during component registration. I wrote the sanity check before
   the 1.7 release and didn't intend this side-effect.

 - vader now supports xpmem, cma, and knem. The best available
   single-copy mechanism will be used. If multiple single-copy
   mechanisms are available, you can select which one you want to use at
   runtime.

More about the vader btl can be found here:
http://blogs.cisco.com/performance/the-vader-shared-memory-transport-in-open-mpi-now-featuring-3-flavors-of-zero-copy/
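
For example, with more than one mechanism compiled in, runtime selection
would look something like this (a sketch; check ompi_info for the exact
parameter values your build accepts):

  mpirun -np 2 -mca btl vader,self \
         -mca btl_vader_single_copy_mechanism knem ./a.out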

-Nathan Hjelm
HPC-5, LANL

On Fri, Oct 17, 2014 at 01:02:23PM -0700, Ralph Castain wrote:
>  On Oct 17, 2014, at 12:06 PM, Gus Correa  wrote:
>  Hi Jeff
> 
>  Many thanks for looking into this and filing a bug report at 11:16PM!
> 
>  Thanks to Aurelien, Ralph and Nathan for their help and clarifications
>  also.
> 
>  **
> 
>  Related suggestion:
> 
>  Add a note to the FAQ explaining that in OMPI 1.8
>  the new (default) btl is vader (and what it is).
> 
>  It was a real surprise to me.
>  If Aurelien Bouteiller didn't tell me about vader,
>  I might have never realized it even existed.
> 
>  That could be part of one of the already existent FAQs
>  explaining how to select the btl.
> 
>  **
> 
>  Doubts (btl in OMPI 1.8):
> 
>  I still don't understand clearly the meaning and scope of vader
>  being a "default btl".
> 
>We mean that it has a higher priority than the other shared memory
>implementation, and so it will be used for intra-node messaging by
>default.
> 
>  Which is the scope of this default: intra-node btl only perhaps?
> 
>Yes - strictly intra-node
> 
>  Was there a default btl before vader, and which?
> 
>The "sm" btl was the default shared memory transport before vader
> 
>  Is vader the intra-node default only (i.e. replaces sm  by default),
> 
>Yes
> 
>  or does it somehow extend beyond node boundaries, and replaces (or
>  brings in) network btls (openib,tcp,etc) ?
> 
>Nope - just intra-node
> 
>  If I am running on several nodes, and want to use openib, not tcp,
>  and, say, use vader, what is the right syntax?
> 
>  * nothing (OMPI will figure it out ... but what if you have
>  IB,Ethernet,Myrinet,OpenGM, altogether?)
> 
>If you have higher-speed connections, we will pick the fastest for
>inter-node messaging as the "default" since we expect you would want the
>fastest possible transport.
> 
>  * -mca btl openib (and vader will come along automatically)
> 
>Among the ones you show, this would indeed be the likely choices (openib
>and vader)
> 
>  * -mca btl openib,self (and vader will come along automatically)
> 
>The "self" btl is *always* active as the loopback transport
> 
>  * -mca btl openib,self,vader (because vader is default only for 1-node
>  jobs)
>  * something else (or several alternatives)
> 
>  Whatever happened to the "self" btl in this new context?
>  Gone? Still there?
> 
>  Many thanks,
>  Gus Correa
> 
>  On 10/16/2014 11:16 PM, Jeff Squyres (jsquyres) wrote:
> 
>On Oct 16, 2014, at 1:35 PM, Gus Correa  wrote:
> 
>  and on the MCA parameter file:
> 
>  btl_sm_use_knem = 1
> 
>I think the logic enforcing this MCA param got broken when we revamped
>the MCA param system.  :-(
> 
>  I am scratching my head to understand why a parameter with such a
>  suggestive name ("btl_sm_have_knem_support"),
>  so similar to the OMPI_BTL_SM_HAVE_KNEM cpp macro,
>  somehow vanished from ompi_info in OMPI 1.8.3.
> 
>It looks like this MCA param was also dropped when we revamped the MCA
>system.  Doh!  :-(
> 
>There's some deep mojo going on that is somehow causing knem to not be
>used; I'm too tired to understand the logic right now.  I just opened
>https://github.com/open-mpi/ompi/issues/239 to track this issue --
>feel free to subscribe to the issue to get updates.
> 





[OMPI users] orte-ps and orte-top behavior

2014-10-30 Thread Brock Palen
If I'm on the node hosting mpirun for a job and run:

orte-ps

It finds the job and shows the pids and info for all ranks.

If I use orte-top, though, there is no such default; I have to find the mpirun
pid and then use it.
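
For comparison, the workaround I use looks something like this (option
name from memory, so take it as approximate):

  orte-top -pid <pid of mpirun>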

Why do the two have different behavior?  They show data from the same source,
don't they?

Brock Palen
www.umich.edu/~brockp
CAEN Advanced Computing
XSEDE Campus Champion
bro...@umich.edu
(734)936-1985





Re: [OMPI users] knem in Open MPI 1.8.3

2014-10-30 Thread Gus Correa

Hi Nathan

Thank you very much for addressing this problem.

I read your notes on Jeff's blog about vader,
and that clarified many things that were obscure to me
when I first started this thread
whining that knem was not working in OMPI 1.8.3.
Thank you also for writing that blog post,
and for sending the link to it.
That was very helpful indeed.

As your closing comments on the blog post point out,
and your IMB benchmark graphs of pingpong/latency &
sendrecv/bandwidth show,
vader+xpmem outperforms the other combinations
of btl+memory_copy_mechanism for intra-node communication.

For the benefit of pedestrian OpenMPI users like me:

1) What is the status of xpmem in the Linux world at this point?
[Proprietary (SGI?) / open source, part of the Linux kernel (which),
part of standard distributions (which) ?]

2) Any recommendation for the values of the
various vader btl parameters?
[There are 12 of them in OMPI 1.8.3!
That is a real challenge to get right.]

Which values did you use in your benchmarks?
Defaults?
Other?

In particular, is there an optimal value for the eager/rendezvous
threshold? (btl_vader_eager_limit, default=4kB)
[The INRIA web site suggests 32kB for the sm+knem counterpart 
(btl_sm_eager_limit, default=4kB).]
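
(To be concrete, what I have in mind is something like the following in
openmpi-mca-params.conf; the 32kB value is just the sm+knem suggestion
carried over to vader as a guess, not a verified recommendation:

  # guess: reuse the 32kB eager limit suggested for sm+knem
  btl_vader_eager_limit = 32768
)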


3) Did I understand it right, that the upcoming OpenMPI 1.8.5
can be configured with more than one memory copy mechanism altogether
(e.g. --with-knem and --with-cma and --with-xpmem),
then select one of them at runtime with the 
btl_vader_single_copy_mechanism parameter?

Or must OMPI be configured with only one memory copy mechanism?
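
(Concretely, I mean configuring along these lines, with the paths as
placeholders, and then picking the mechanism at run time via
btl_vader_single_copy_mechanism:

  ./configure --with-knem=/path/to/knem --with-cma --with-xpmem=/path/to/xpmem ...
)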

Many thanks,
Gus Correa


On 10/30/2014 05:44 PM, Nathan Hjelm wrote:

I want to close the loop on this issue. 1.8.5 will address it in several
ways:

  - knem support in btl/sm has been fixed. A sanity check was disabling
knem during component registration. I wrote the sanity check before
the 1.7 release and didn't intend this side-effect.

  - vader now supports xpmem, cma, and knem. The best available
single-copy mechanism will be used. If multiple single-copy
    mechanisms are available, you can select which one you want to use at
    runtime.

More about the vader btl can be found here:
http://blogs.cisco.com/performance/the-vader-shared-memory-transport-in-open-mpi-now-featuring-3-flavors-of-zero-copy/

-Nathan Hjelm
HPC-5, LANL

On Fri, Oct 17, 2014 at 01:02:23PM -0700, Ralph Castain wrote:

  On Oct 17, 2014, at 12:06 PM, Gus Correa  wrote:
  Hi Jeff

  Many thanks for looking into this and filing a bug report at 11:16PM!

  Thanks to Aurelien, Ralph and Nathan for their help and clarifications
  also.

  **

  Related suggestion:

  Add a note to the FAQ explaining that in OMPI 1.8
  the new (default) btl is vader (and what it is).

  It was a real surprise to me.
  If Aurelien Bouteiller didn't tell me about vader,
  I might have never realized it even existed.

  That could be part of one of the already existent FAQs
  explaining how to select the btl.

  **

  Doubts (btl in OMPI 1.8):

  I still don't understand clearly the meaning and scope of vader
  being a "default btl".

We mean that it has a higher priority than the other shared memory
implementation, and so it will be used for intra-node messaging by
default.

  Which is the scope of this default: intra-node btl only perhaps?

Yes - strictly intra-node

  Was there a default btl before vader, and which?

The "sm" btl was the default shared memory transport before vader

  Is vader the intra-node default only (i.e. replaces sm  by default),

Yes

  or does it somehow extend beyond node boundaries, and replaces (or
  brings in) network btls (openib,tcp,etc) ?

Nope - just intra-node

  If I am running on several nodes, and want to use openib, not tcp,
  and, say, use vader, what is the right syntax?

  * nothing (OMPI will figure it out ... but what if you have
  IB,Ethernet,Myrinet,OpenGM, altogether?)

If you have higher-speed connections, we will pick the fastest for
inter-node messaging as the "default" since we expect you would want the
fastest possible transport.

  * -mca btl openib (and vader will come along automatically)

Among the ones you show, this would indeed be the likely choices (openib
and vader)

  * -mca btl openib,self (and vader will come along automatically)

The "self" btl is *always* active as the loopback transport

  * -mca btl openib,self,vader (because vader is default only for 1-node
  jobs)
  * something else (or several alternatives)

  Whatever happened to the "self" btl in this new context?
  Gone? Still there?

  Many thanks,
  Gus Correa

  On 10/16/2014 11:16 PM, Jeff Squyres (jsquyres) wrote:

On Oct 16, 2014, at 1:35 PM, Gus Correa  wrote:

  and on the MCA parameter file:

  btl_sm_use_knem = 1

I think the logic enforcing this MCA param got broken when we revamped the MCA param system.  :-(

Re: [OMPI users] knem in Open MPI 1.8.3

2014-10-30 Thread Ralph Castain
Just for FYI: I believe Nathan misspoke. The new capability is in 1.8.4, which 
I hope to release next Friday (Nov 7th)

> On Oct 30, 2014, at 4:24 PM, Gus Correa  wrote:
> 
> Hi Nathan
> 
> Thank you very much for addressing this problem.
> 
> I read your notes on Jeff's blog about vader,
> and that clarified many things that were obscure to me
> when I first started this thread
> whining that knem was not working in OMPI 1.8.3.
> Thank you also for writing that blog post,
> and for sending the link to it.
> That was very helpful indeed.
> 
> As your closing comments on the blog post point out,
> and your IMB benchmark graphs of pingpong/latency &
> sendrecv/bandwidth show,
> vader+xpmem outperforms the other combinations
> of btl+memory_copy_mechanism for intra-node communication.
> 
> For the benefit of pedestrian OpenMPI users like me:
> 
> 1) What is the status of xpmem in the Linux world at this point?
> [Proprietary (SGI?) / open source, part of the Linux kernel (which),
> part of standard distributions (which) ?]
> 
> 2) Any recommendation for the values of the
> various vader btl parameters?
> [There are 12 of them in OMPI 1.8.3!
> That is a real challenge to get right.]
> 
> Which values did you use in your benchmarks?
> Defaults?
> Other?
> 
> In particular, is there an optimal value for the eager/rendezvous
> threshold? (btl_vader_eager_limit, default=4kB)
> [The INRIA web site suggests 32kB for the sm+knem counterpart 
> (btl_sm_eager_limit, default=4kB).]
> 
> 3) Did I understand it right, that the upcoming OpenMPI 1.8.5
> can be configured with more than one memory copy mechanism altogether
> (e.g. --with-knem and --with-cma and --with-xpmem),
> then select one of them at runtime with the btl_vader_single_copy_mechanism 
> parameter?
> Or must OMPI be configured with only one memory copy mechanism?
> 
> Many thanks,
> Gus Correa
> 
> 
> On 10/30/2014 05:44 PM, Nathan Hjelm wrote:
>> I want to close the loop on this issue. 1.8.5 will address it in several
>> ways:
>> 
>>  - knem support in btl/sm has been fixed. A sanity check was disabling
>>knem during component registration. I wrote the sanity check before
>>the 1.7 release and didn't intend this side-effect.
>> 
>>  - vader now supports xpmem, cma, and knem. The best available
>>single-copy mechanism will be used. If multiple single-copy
>>mechanisms are available, you can select which one you want to use at
>>runtime.
>> 
>> More about the vader btl can be found here:
>> http://blogs.cisco.com/performance/the-vader-shared-memory-transport-in-open-mpi-now-featuring-3-flavors-of-zero-copy/
>> 
>> -Nathan Hjelm
>> HPC-5, LANL
>> 
>> On Fri, Oct 17, 2014 at 01:02:23PM -0700, Ralph Castain wrote:
>>>  On Oct 17, 2014, at 12:06 PM, Gus Correa  
>>> wrote:
>>>  Hi Jeff
>>> 
>>>  Many thanks for looking into this and filing a bug report at 11:16PM!
>>> 
>>>  Thanks to Aurelien, Ralph and Nathan for their help and clarifications
>>>  also.
>>> 
>>>  **
>>> 
>>>  Related suggestion:
>>> 
>>>  Add a note to the FAQ explaining that in OMPI 1.8
>>>  the new (default) btl is vader (and what it is).
>>> 
>>>  It was a real surprise to me.
>>>  If Aurelien Bouteiller didn't tell me about vader,
>>>  I might have never realized it even existed.
>>> 
>>>  That could be part of one of the already existent FAQs
>>>  explaining how to select the btl.
>>> 
>>>  **
>>> 
>>>  Doubts (btl in OMPI 1.8):
>>> 
>>>  I still don't understand clearly the meaning and scope of vader
>>>  being a "default btl".
>>> 
>>>We mean that it has a higher priority than the other shared memory
>>>implementation, and so it will be used for intra-node messaging by
>>>default.
>>> 
>>>  Which is the scope of this default: intra-node btl only perhaps?
>>> 
>>>Yes - strictly intra-node
>>> 
>>>  Was there a default btl before vader, and which?
>>> 
>>>The "sm" btl was the default shared memory transport before vader
>>> 
>>>  Is vader the intra-node default only (i.e. replaces sm  by default),
>>> 
>>>Yes
>>> 
>>>  or does it somehow extend beyond node boundaries, and replaces (or
>>>  brings in) network btls (openib,tcp,etc) ?
>>> 
>>>Nope - just intra-node
>>> 
>>>  If I am running on several nodes, and want to use openib, not tcp,
>>>  and, say, use vader, what is the right syntax?
>>> 
>>>  * nothing (OMPI will figure it out ... but what if you have
>>>  IB,Ethernet,Myrinet,OpenGM, altogether?)
>>> 
>>>If you have higher-speed connections, we will pick the fastest for
>>>inter-node messaging as the "default" since we expect you would want the
>>>fastest possible transport.
>>> 
>>>  * -mca btl openib (and vader will come along automatically)
>>> 
>>>Among the ones you show, this would indeed be the likely choices (openib
>>>and vader)
>>> 
>>>  * -mca btl openib,self (and vader will come along automatically)

Re: [OMPI users] knem in Open MPI 1.8.3

2014-10-30 Thread Gus Correa

On 10/30/2014 07:32 PM, Ralph Castain wrote:

Just for FYI: I believe Nathan misspoke.
The new capability is in 1.8.4, which I hope
to release next Friday (Nov 7th)



Hi Ralph

That is even better!
Looking forward to OMPI 1.8.4.

I still would love to hear from Nathan / OMPI team
about my remaining questions below
(especially the 12 vader parameters).

Many thanks,
Gus Correa


On Oct 30, 2014, at 4:24 PM, Gus Correa  wrote:

Hi Nathan

Thank you very much for addressing this problem.

I read your notes on Jeff's blog about vader,
and that clarified many things that were obscure to me
when I first started this thread
whining that knem was not working in OMPI 1.8.3.
Thank you also for writing that blog post,
and for sending the link to it.
That was very helpful indeed.

As your closing comments on the blog post point out,
and your IMB benchmark graphs of pingpong/latency &
sendrecv/bandwidth show,
vader+xpmem outperforms the other combinations
of btl+memory_copy_mechanism for intra-node communication.

For the benefit of pedestrian OpenMPI users like me:

1) What is the status of xpmem in the Linux world at this point?
[Proprietary (SGI?) / open source, part of the Linux kernel (which),
part of standard distributions (which) ?]

2) Any recommendation for the values of the
various vader btl parameters?
[There are 12 of them in OMPI 1.8.3!
That is a real challenge to get right.]

Which values did you use in your benchmarks?
Defaults?
Other?

In particular, is there an optimal value for the eager/rendezvous
threshold? (btl_vader_eager_limit, default=4kB)
[The INRIA web site suggests 32kB for the sm+knem counterpart 
(btl_sm_eager_limit, default=4kB).]

3) Did I understand it right, that the upcoming OpenMPI 1.8.5
can be configured with more than one memory copy mechanism altogether
(e.g. --with-knem and --with-cma and --with-xpmem),
then select one of them at runtime with the btl_vader_single_copy_mechanism 
parameter?
Or must OMPI be configured with only one memory copy mechanism?

Many thanks,
Gus Correa


On 10/30/2014 05:44 PM, Nathan Hjelm wrote:

I want to close the loop on this issue. 1.8.5 will address it in several
ways:

  - knem support in btl/sm has been fixed. A sanity check was disabling
knem during component registration. I wrote the sanity check before
the 1.7 release and didn't intend this side-effect.

  - vader now supports xpmem, cma, and knem. The best available
single-copy mechanism will be used. If multiple single-copy
    mechanisms are available, you can select which one you want to use at
    runtime.

More about the vader btl can be found here:
http://blogs.cisco.com/performance/the-vader-shared-memory-transport-in-open-mpi-now-featuring-3-flavors-of-zero-copy/

-Nathan Hjelm
HPC-5, LANL

On Fri, Oct 17, 2014 at 01:02:23PM -0700, Ralph Castain wrote:

  On Oct 17, 2014, at 12:06 PM, Gus Correa  wrote:
  Hi Jeff

  Many thanks for looking into this and filing a bug report at 11:16PM!

  Thanks to Aurelien, Ralph and Nathan for their help and clarifications
  also.

  **

  Related suggestion:

  Add a note to the FAQ explaining that in OMPI 1.8
  the new (default) btl is vader (and what it is).

  It was a real surprise to me.
  If Aurelien Bouteiller didn't tell me about vader,
  I might have never realized it even existed.

  That could be part of one of the already existent FAQs
  explaining how to select the btl.

  **

  Doubts (btl in OMPI 1.8):

  I still don't understand clearly the meaning and scope of vader
  being a "default btl".

We mean that it has a higher priority than the other shared memory
implementation, and so it will be used for intra-node messaging by
default.

  Which is the scope of this default: intra-node btl only perhaps?

Yes - strictly intra-node

  Was there a default btl before vader, and which?

The "sm" btl was the default shared memory transport before vader

  Is vader the intra-node default only (i.e. replaces sm  by default),

Yes

  or does it somehow extend beyond node boundaries, and replaces (or
  brings in) network btls (openib,tcp,etc) ?

Nope - just intra-node

  If I am running on several nodes, and want to use openib, not tcp,
  and, say, use vader, what is the right syntax?

  * nothing (OMPI will figure it out ... but what if you have
  IB,Ethernet,Myrinet,OpenGM, altogether?)

If you have higher-speed connections, we will pick the fastest for
inter-node messaging as the "default" since we expect you would want the
fastest possible transport.

  * -mca btl openib (and vader will come along automatically)

Among the ones you show, this would indeed be the likely choices (openib
and vader)

  * -mca btl openib,self (and vader will come along automatically)

The "self" btl is *always* active as the loopback transport

  * -mca btl openib,self,vader (because vader is default only for 1-node jobs)

Re: [OMPI users] large memory usage and hangs when preconnecting beyond 1000 cpus

2014-10-30 Thread Marshall Ward
Hi, I'm just following up on this to say that the problem was not
related to preconnection, but rather very large memory usage for
high-CPU-count jobs.

Preconnecting merely sent off a large number of isend/irecv messages,
which triggered the memory consumption.

I tried experimenting a bit with XRC, mostly just by copying the
values specified here in the faq:

http://www.open-mpi.org/faq/?category=openfabrics#ib-receive-queues

but it seems that I brought down some nodes in the process!

Is this the right way to reduce my memory consumption per node? Is
there some other way to go about it? (Or a safe way that doesn't cause
kernel panics? :) )
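
For reference, the basic test from my original message (quoted below) is
essentially just this:

  #include <mpi.h>

  int main(int argc, char** argv)
  {
      MPI_Init(&argc, &argv);
      MPI_Finalize();
      return 0;
  }

compiled as `mpi_init` and launched with
`mpirun -mca mpi_preconnect_mpi 1 mpi_init`.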

On Wed, Oct 22, 2014 at 1:40 AM, Nathan Hjelm  wrote:
>
> At those sizes it is possible you are running into resource
> exhaustion issues. Some of the resource exhaustion code paths still lead
> to hangs. If the code does not need to be fully connected I would
> suggest not using mpi_preconnect_mpi but instead track down why the
> initial MPI_Allreduce hangs. I would suggest the stack trace analysis
> tool (STAT). It might help you narrow down where the problem is
> occurring.
>
> -Nathan Hjelm
> HPC-5, LANL
>
> On Tue, Oct 21, 2014 at 01:12:21PM +1100, Marshall Ward wrote:
>> Thanks, it's at least good to know that the behaviour isn't normal!
>>
>> Could it be some sort of memory leak in the call? The code in
>>
>> ompi/runtime/ompi_mpi_preconnect.c
>>
>> looks reasonably safe, though maybe doing thousands of of isend/irecv
>> pairs is causing problems with the buffer used in ptp messages?
>>
>> I'm trying to see if valgrind can see anything, but nothing from
>> ompi_init_preconnect_mpi is coming up (although there are some other
>> warnings).
>>
>>
>> On Sun, Oct 19, 2014 at 2:37 AM, Ralph Castain  wrote:
>> >
>> >> On Oct 17, 2014, at 3:37 AM, Marshall Ward  
>> >> wrote:
>> >>
>> >> I currently have a numerical model that, for reasons unknown, requires
>> >> preconnection to avoid hanging on an initial MPI_Allreduce call.
>> >
>> > That is indeed odd - it might take a while for all the connections to 
>> > form, but it shouldn’t hang
>> >
>> >> But
>> >> when we try to scale out beyond around 1000 cores, we are unable to
>> >> get past MPI_Init's preconnection phase.
>> >>
>> >> To test this, I have a basic C program containing only MPI_Init() and
>> >> MPI_Finalize() named `mpi_init`, which I compile and run using `mpirun
>> >> -mca mpi_preconnect_mpi 1 mpi_init`.
>> >
>> > I doubt preconnect has been tested in a rather long time as I’m unaware of 
>> > anyone still using it (we originally provided it for some legacy code that 
>> > otherwise took a long time to initialize). However, I could give it a try 
>> > and see what happens. FWIW: because it was so targeted and hasn’t been 
>> > used in a long time, the preconnect algo is really not very efficient. 
>> > Still, shouldn’t have anything to do with memory footprint.
>> >
>> >>
>> >> This preconnection seems to consume a large amount of memory, and is
>> >> exceeding the available memory on our nodes (~2GiB/core) as the number
>> >> gets into the thousands (~4000 or so). If we try to preconnect to
>> >> around ~6000, we start to see hangs and crashes.
>> >>
>> >> A failed 5600 core preconnection gave this warning (~10k times) while
>> >> hanging for 30 minutes:
>> >>
>> >>[warn] opal_libevent2021_event_base_loop: reentrant invocation.
>> >> Only one event_base_loop can run on each event_base at once.
>> >>
>> >> A failed 6000-core preconnection job crashed almost immediately with
>> >> the following error.
>> >>
>> >>[r104:18459] [[32743,0],0] ORTE_ERROR_LOG: File open failure in
>> >> file ras_tm_module.c at line 159
>> >>[r104:18459] [[32743,0],0] ORTE_ERROR_LOG: File open failure in
>> >> file ras_tm_module.c at line 85
>> >>[r104:18459] [[32743,0],0] ORTE_ERROR_LOG: File open failure in
>> >> file base/ras_base_allocate.c at line 187
>> >
>> > This doesn’t have anything to do with preconnect - it indicates that 
>> > mpirun was unable to open the Torque allocation file. However, it 
>> > shouldn’t have “crashed”, but instead simply exited with an error message.
>> >
>> >>
>> >> Should we expect to use very large amounts of memory for
>> >> preconnections of thousands of CPUs? And can these
>> >>
>> >> I am using Open MPI 1.8.2 on Linux 2.6.32 (centOS) and FDR infiniband
>> >> network. This is probably not enough information, but I'll try to
>> >> provide more if necessary. My knowledge of implementation is
>> >> unfortunately very limited.