Could you add --display-allocation to your command line? That will tell us whether it 
found/read the default hostfile, or whether the problem is with the mapper.
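
For example, something along these lines (using the mpihello test program from the 
transcript below; the exact output depends on your setup):

    mpiexec --display-allocation -np 4 ./mpihello

It should print the node/slot allocation actually being used before the procs launch, 
which will show whether the default hostfile was picked up.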


On Feb 1, 2012, at 7:58 AM, Reuti wrote:

> Am 01.02.2012 um 15:38 schrieb Ralph Castain:
> 
>> On Feb 1, 2012, at 3:49 AM, Reuti wrote:
>> 
>>> Am 31.01.2012 um 21:25 schrieb Ralph Castain:
>>> 
>>>> On Jan 31, 2012, at 12:58 PM, Reuti wrote:
>>> 
>>> BTW: is there any default hostfile for Open MPI - I mean one in my home 
>>> directory or /etc? When I check `man orte_hosts` and all possible options 
>>> are unset (like in a singleton run), it will only run locally (the job is 
>>> co-located with mpirun).
>> 
>> Yep - it is <prefix>/etc/openmpi-default-hostfile
> 
> Thanks for replying, Ralph.
> 
> I spotted it too, but it is not working for me - neither for mpiexec from the 
> command line, nor for a singleton. I also tried a plain /etc as the location 
> of this file.
> 
> reuti@pc15370:~> which mpicc
> /home/reuti/local/openmpi-1.4.4-thread/bin/mpicc
> reuti@pc15370:~> cat /home/reuti/local/openmpi-1.4.4-thread/etc/openmpi-default-hostfile
> pc15370 slots=2
> pc15381 slots=2
> reuti@pc15370:~> mpicc -o mpihello mpihello.c
> reuti@pc15370:~> mpiexec -np 4 ./mpihello
> Hello World from Node 0.
> Hello World from Node 1.
> Hello World from Node 2.
> Hello World from Node 3.
> 
> But everything runs locally (no spawn here, just the traditional mpihello):
> 
> 19503 ?        Ss     0:00 /usr/sbin/sshd -o PidFile=/var/run/sshd.init.pid
> 11583 ?        Ss     0:00  \_ sshd: reuti [priv]
> 11585 ?        S      0:00  |   \_ sshd: reuti@pts/6
> 11587 pts/6    Ss     0:00  |       \_ -bash
> 13470 pts/6    S+     0:00  |           \_ mpiexec -np 4 ./mpihello
> 13471 pts/6    R+     0:00  |               \_ ./mpihello
> 13472 pts/6    R+     0:00  |               \_ ./mpihello
> 13473 pts/6    R+     0:00  |               \_ ./mpihello
> 13474 pts/6    R+     0:00  |               \_ ./mpihello
> 
> -- Reuti
> 
> 
>>>> We probably aren't correctly marking the original singleton on that node, 
>>>> and so the mapper thinks there are still two slots available on the 
>>>> original node.
>>> 
>>> Okay. There is something to discuss/fix. BTW: if started as a singleton, I 
>>> get an error at the end with the program the OP provided:
>>> 
>>> [pc15381:25502] [[12435,0],1] routed:binomial: Connection to lifeline [[12435,0],0] lost
>> 
>> Okay, I'll take a look at it - but it may take a while before I can address 
>> either issue, as other priorities loom.
>> 
>>> 
>>> It's not the case if run by mpiexec.
>>> 
>>> -- Reuti

