In my previous mail I said that slot=0-3 would be a solution. Unfortunately, it gives me exactly the same segfault as in the case with slot=*:*
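For reference, here is a minimal sketch of how a rankfile in the slot=0-(N-1) form could be generated once the number of cores on a host is known. The helper name make_rankfile and its parameters are hypothetical and not part of Open MPI; the output simply follows the "rank N=host slot=..." syntax used in the rankfiles quoted below. It only produces the explicit-range form and does nothing to avoid the segfault.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical helper (not part of Open MPI): builds rankfile text that
// assigns `nranks` consecutive ranks, starting at `first_rank`, to `host`,
// letting each rank use any of the cores 0..cores-1 on that host.
std::string make_rankfile(const std::string& host, int first_rank,
                          int nranks, int cores) {
    // The core count has to be supplied by the caller; this sketch does not
    // try to discover it on a remote host.
    assert(nranks >= 0 && cores > 0);
    std::ostringstream out;
    for (int i = 0; i < nranks; ++i) {
        out << "rank " << (first_rank + i) << "=" << host << " slot=0";
        if (cores > 1) {
            out << "-" << (cores - 1);  // e.g. "0-3" for four cores
        }
        out << "\n";
    }
    return out.str();
}
```

Calling make_rankfile("host01", 0, 4, 4) yields four lines of the form "rank 0=host01 slot=0-3", which is the rankfile I tested. The open problem from the original mail remains: the core count for each host still has to come from somewhere.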

2010/6/9 Grzegorz Maj <ma...@wp.pl>:
> Hi,
> I'd like mpirun to run tasks with specific ranks on specific hosts,
> but I don't want to provide any particular sockets/slots/cores.
> The following example uses just one host, but generally I'll use more.
> In my hostfile I just have:
>
> root@host01 slots=4
>
> I was playing with my rankfile to achieve what I've mentioned, but I
> always get some problems.
>
> 1) With rankfile like:
> rank 0=host01 slot=*
> rank 1=host01 slot=*
> rank 2=host01 slot=*
> rank 3=host01 slot=*
>
> I get:
>
> --------------------------------------------------------------------------
> We were unable to successfully process/set the requested processor
> affinity settings:
>
> Specified slot list: *
> Error: Error
>
> This could mean that a non-existent processor was specified, or
> that the specification had improper syntax.
> --------------------------------------------------------------------------
> --------------------------------------------------------------------------
> mpirun was unable to start the specified application as it encountered an
> error:
>
> Error name: Error
> Node: host01
>
> when attempting to start process rank 0.
> --------------------------------------------------------------------------
> [host01:13715] Rank 0: PAFFINITY cannot get physical processor id for
> logical processor 4
>
>
> I think it tries to find processor #4, but there are only 0-3
>
> 2) With rankfile like:
> rank 0=host01 slot=*:*
> rank 1=host01 slot=*:*
> rank 2=host01 slot=*:*
> rank 3=host01 slot=*:*
>
> Everything looks well, i.e. my programs are spread across 4 processors.
> But when running MPI program as follows:
>
> MPI::Init(argc, argv);
> fprintf(stderr, "after init %d\n", MPI::Is_initialized());
> nprocs_mpi = MPI::COMM_WORLD.Get_size();
> fprintf(stderr, "won't get here\n");
>
> I get:
>
> after init 1
> [host01:14348] *** Process received signal ***
> [host01:14348] Signal: Segmentation fault (11)
> [host01:14348] Signal code: Address not mapped (1)
> [host01:14348] Failing at address: 0x8
> [host01:14348] [ 0] [0xffffe410]
> [host01:14348] [ 1] p(_ZNK3MPI4Comm8Get_sizeEv+0x19) [0x8051299]
> [host01:14348] [ 2] p(main+0x86) [0x804ee4e]
> [host01:14348] [ 3] /lib/libc.so.6(__libc_start_main+0xe5) [0x4180b5c5]
> [host01:14348] [ 4] p(__gxx_personality_v0+0x125) [0x804ecc1]
> [host01:14348] *** End of error message ***
>
> I'm using OPEN MPI v. 1.4.2 (downloaded yesterday).
> In my rankfile I really want to write something like slot=*. I know
> slot=0-3 would be a solution, but when generating rankfile I may not
> be sure how many processors are there available on a particular host.
>
> Any help would be appreciated.
>
> Regards,
> Grzegorz Maj