I don't see anything in the code that limits the number of procs in a rankfile. 
Are the attached rankfiles the ones you are trying to use? I'm wondering if 
there is a syntax error that is causing the problem. It would help if you could 
provide the complete error message output.
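
One quick way to rule out a syntax problem is to check each rankfile line against the form shown in the mpirun man page (`rank <N>=<host> slot=<spec>`). Below is a minimal Python sketch of such a check. The function name `check_rankfile` and the regular expression are illustrative assumptions, not part of Open MPI, and the pattern covers only the simple numeric slot forms (digits, `:`, `,`, `-`, `*`), so a line it rejects is not necessarily invalid for mpirun.

```python
import re

# Hypothetical checker, not part of Open MPI. It assumes the rankfile line
# form "rank <N>=<host> slot=<spec>", where <spec> is built only from
# digits, ':', ',', '-' and '*'. Other forms are reported as unrecognized.
RANK_LINE = re.compile(r"^rank\s+(\d+)\s*=\s*(\S+)\s+slot\s*=\s*([\d:,\-*]+)$")


def check_rankfile(text):
    """Return a list of (line_number, message) for lines that look wrong."""
    problems = []
    seen = set()
    for number, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        match = RANK_LINE.match(line)
        if match is None:
            problems.append((number, "unrecognized syntax"))
            continue
        rank = int(match.group(1))
        if rank in seen:
            problems.append((number, "duplicate rank %d" % rank))
        seen.add(rank)
    return problems
```

Running `check_rankfile(open(path).read())` on each attached rankfile (with `path` being wherever the file lives) would at least show whether a malformed or duplicated line is involved before looking at the node size.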

At one time, there was a limit on the number of procs on a node, which had 
nothing to do with rankfile. That was fixed, though, and there is no real limit 
any more. I don't recall the precise release in the 1.5 series where it 
changed; you might try updating to 1.5.4, as I'm sure the limit doesn't exist 
there.


On Jan 20, 2012, at 12:43 PM, Paul Kapinos wrote:

> Hello, Open MPI developer!
> 
> Now, we have a really nice toy: 2 TB RAM, 16 sockets, 128 cores.
> (4x smaller Bull S6010 coupled by BCS chips to a single image machine)
> 
> On such a big box, process pinning is vital.
> 
> So we tried to use the Open MPI capabilities to pin the processes. But it 
> seems that the rankfile infrastructure does not work properly: we always get 
> an "Error: Invalid argument" message on the 128-core node, even if the 
> rankfile was OK.
> On a smaller node (up to 32 cores/ 64 threads) the very same rankfile (with 
> changed node name of course) works well.
> 
> I believe this machine is a bit too big for the pinning infrastructure at 
> present. A bug?
> 
> Best wishes,
> 
> Paul Kapinos
> 
> P.S. see the attached .tgz for some logzz
> 
> ------------------------------------------------------------------------------
>   Rankfiles
>       Rankfiles provide a means for specifying detailed information about how 
> process ranks should  be  mapped  to nodes and how they should be bound.  
> Consider the following:
> ....
> ------------------------------------------------------------------------------
>                Open RTE: 1.5.3
>   Open RTE SVN revision: r24532
>   Open RTE release date: Mar 16, 2011
>                    OPAL: 1.5.3
>       OPAL SVN revision: r24532
>       OPAL release date: Mar 16, 2011
>            Ident string: 1.5.3
> 
> 
> 
> -- 
> Dipl.-Inform. Paul Kapinos   -   High Performance Computing,
> RWTH Aachen University, Center for Computing and Communication
> Seffenter Weg 23,  D 52074  Aachen (Germany)
> Tel: +49 241/80-24915
> <rankfiles128.tgz>

