On 2/6/12 8:14 AM, "Reuti" <re...@staff.uni-marburg.de> wrote:
>> If I need MPI_THREAD_MULTIPLE, and openmpi is compiled with thread support,
>> it's not clear to me whether MPI::Init_Thread() and
>> MPI::Inint_Thread(MPI::THREAD_MULTIPLE) would give me the same behavior from
>> Open MPI.
>
> If you need thread support, you will need MPI::Init_Thread and it needs one
> argument (or three).

Sorry, typo on my side. I meant to compare MPI::Init_thread(MPI::THREAD_MULTIPLE) against MPI::Init(). I think that your first reply mentioned replacing MPI::Init_thread with MPI::Init.
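To make the comparison concrete, here's a minimal sketch (not our actual mpitest, just an illustration) of the difference: per the standard, MPI::Init() is equivalent to requesting only MPI_THREAD_SINGLE, while MPI::Init_thread() lets you request a level and returns what the library actually granted, which depends on how Open MPI was built:

#include <mpi.h>
#include <iostream>

int main(int argc, char* argv[])
{
    // Request full thread support; the return value is the level that
    // Open MPI actually provides (it may be lower than requested).
    int provided = MPI::Init_thread(argc, argv, MPI::THREAD_MULTIPLE);

    if (provided < MPI::THREAD_MULTIPLE) {
        std::cerr << "MPI_THREAD_MULTIPLE not available; got level "
                  << provided << std::endl;
        MPI::Finalize();
        return 1;
    }

    // ... safe to make MPI calls from multiple threads here ...

    MPI::Finalize();
    return 0;
}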
> I suggest using a stable version 1.4.4 for your experiments. As you said you
> are new to MPI, you could get misled between error messages caused by bugs
> and error messages caused by a programming error on your side.

OK. I'll certainly set it up so that I can validate what's supposed to work. I'll have to check with our main MPI developers to see whether there's anything in 1.5.x that they need.

>> 1. I'm still surprised that the SGE behavior is so different when I
>> configure my SGE queue differently. See test "a" in the .tgz. When I just
>> run mpitest in mpi.sh and ask for exactly 5 slots (-pe orte 5-5), it works
>> if the queue is configured to use a single host. I see 1 MASTER and 4
>> SLAVES in qstat -g t, and I get the correct output.
>
> Fine. ("job_is_first_task true" in the PE according to this.)

Yes, I believe that job_is_first_task will need to be true for our environment.

>> If the queue is set to use multiple hosts, the jobs hang in spawn/init,
>> and I get errors like:
>>
>> [grid-03.cisco.com][[19159,2],0][btl_tcp_endpoint.c:638:mca_btl_tcp_endpoint_complete_connect] connect() to 192.168.122.1 failed: Connection refused (111)
>
> What is the setting in SGE for:
>
> $ qconf -sconf
> ...
> qlogin_command               builtin
> qlogin_daemon                builtin
> rlogin_command               builtin
> rlogin_daemon                builtin
> rsh_command                  builtin
> rsh_daemon                   builtin
>
> If it's set to use ssh, ...

Nope. My output is the same as yours:

qlogin_command               builtin
qlogin_daemon                builtin
rlogin_command               builtin
rlogin_daemon                builtin
rsh_command                  builtin
rsh_daemon                   builtin

> But I wonder why it's working for some nodes?

I don't think that it's working on some nodes. In my other cases where it hangs, I don't always get those "connection refused" errors. I'm not sure, but the "connection refused" errors might be a red herring. The machines' primary NICs are on a different private network (172.28.*.*). The 192.168.122.1 address is actually the machine's own virbr0 device, which the documentation says is a "xen interface used by Virtualization guest and host oses for network communication."

> Are there custom configurations per node, and some are faulty?

I did a `qconf -sconf <machine>` for each host in my grid. I get identical output like this for each machine:

$ qconf -sconf grid-03
#grid-03.cisco.com:
mailer                       /bin/mail
xterm                        /usr/bin/xterm

So, I think that the SGE config is the same across those machines.

>> 2. I guess I'm not sure how SGE is supposed to behave. Experiments "a" and
>> "b" were identical except that I changed -pe orte 5-5 to -pe orte 5-. The
>> single-host case works like before, and the multiple exec host case fails
>> as before. The difference is that qstat -g t shows additional SLAVEs that
>> don't seem to correspond to any jobs on the exec hosts. Are these SLAVEs
>> just slots that are reserved for my job but that I'm not using? If my job
>> will only use 5 slots, then I should set the SGE qsub job to ask for
>> exactly 5 with "-pe orte 5-5", right?
>
> Correct. The remaining ones are just unused. You could adjust your
> application of course to check how many slots were granted, and start slaves
> according to the information you got to use all granted slots.

OK. That makes sense. In our intended uses, I believe that we'll know exactly how many slots the application will need, and it will use the same number of slots throughout the entire job. (A sketch of the check you describe is at the end of this message.)

>> 3. Experiment "d" was similar to "b", but mpi.sh uses "mpiexec -np 1
>> mpitest" instead of running mpitest directly. Now both the single-machine
>> queue and the multiple-machine queue work. So, mpiexec seems to make my
>> multi-machine configuration happier. In this case, I'm still using "-pe
>> orte 5-", and I'm still seeing the extra SLAVE slots granted in qstat -g t.
>
> Then case a) could show a bug in 1.5.4. For me both were working, but the
> allocation was different. The correct allocation I only got with "mpiexec
> -np 1".

OK. That helps to explain my confusion. Our previous experiments (where I was told that case (a) was working) were with Open MPI 1.4.x. Should I open a bug for this issue?

> In your case 4 were routed to one remote machine: the machine where the
> jobscript runs is usually the first entry in the machinefile, but on grid-03
> you got only one slot by accident, and so the 4 additional ones were routed
> to the next machine it found in the machinefile.

FYI, I think that this particular allocation was a misconfiguration on my side. It looks like SGE thinks that grid-03 only has 1 slot available.

>> 4. Based on "d", I thought that I could follow the approach in "a". That
>> is, for experiment "e", I used mpiexec -np 1, but I also used -pe orte 5-5.
>> I thought that this would make the multi-machine queue reserve only the 5
>> slots that I needed. The single-machine queue works correctly, but now the
>> multi-machine case hangs with no errors. The output from qstat and pstree
>> is what I'd expect, but it seems to hang in Spawn_multiple and Init_thread.
>> I really expected this to work.
>
> Yes, this should work across multiple machines. And it's using `qrsh -inherit
> ...` so it's failing somewhere in Open MPI - is it working with 1.4.4?

I'm not sure. We no longer have our 1.4 test environment, so I'm in the process of rebuilding it now. I'll let you know once I have a chance to run that experiment.

Thanks,
---Tom
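P.S. Here's the sketch mentioned above — an illustration (not our actual code) of the suggestion to check how many slots were granted and start slaves accordingly. It assumes the setup discussed in this thread (job_is_first_task true, so the master's own process occupies one slot) and uses the standard MPI_UNIVERSE_SIZE attribute, which under a tight SGE integration should reflect the granted slot count; the slave binary name is made up:

#include <mpi.h>
#include <iostream>

int main(int argc, char* argv[])
{
    MPI::Init_thread(argc, argv, MPI::THREAD_MULTIPLE);

    // MPI_UNIVERSE_SIZE is how many processes the runtime believes it can
    // run in total; with tight SGE integration this should match the number
    // of slots SGE granted to the job.
    int* universe_size = 0;
    bool have_attr = MPI::COMM_WORLD.Get_attr(MPI::UNIVERSE_SIZE,
                                              &universe_size);

    if (have_attr && universe_size != 0 && *universe_size > 1) {
        // The master already occupies one slot (job_is_first_task true),
        // so spawn one slave per remaining granted slot.
        int nslaves = *universe_size - 1;
        MPI::Intercomm children = MPI::COMM_WORLD.Spawn(
            "./mpitest_slave", MPI::ARGV_NULL, nslaves, MPI::INFO_NULL, 0);

        // ... work with the slaves over the 'children' intercommunicator ...

        children.Disconnect();
    } else {
        std::cerr << "No usable MPI_UNIVERSE_SIZE; not spawning."
                  << std::endl;
    }

    MPI::Finalize();
    return 0;
}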