Hi Todd,

I truly appreciate your patience. If the rate was the same with that switch set, that would indicate to me that we aren't having trouble getting through the slapd - the problem probably isn't how hard we are driving it, but rather the total number of connections being created. Basically, we need to establish one ssh connection per node to launch the orteds (the application processes are just fork/exec'd by the orteds, so they shouldn't touch the slapd at all).

The issue may have to do with limits on the total number of LDAP authentication connections allowed for one user. I believe that is settable, but I will have to look it up and/or ask a few friends who might know. I have not seen an LDAP-based cluster before (though authentication onto the head node of a cluster is frequently handled that way), but that doesn't mean someone hasn't done it.
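In the meantime, a rough way to put numbers on the connection-count theory (assuming your slapd listens on the standard port 389; adjust the port if not) is to watch the established connections on the head node while a job is launching:

    # count established connections to the LDAP port, once per second
    while true; do
        netstat -tan | grep ':389 ' | grep -c ESTABLISHED
        sleep 1
    done

If that count climbs toward the number of nodes in the job and then stalls, that fits the picture of a connection ceiling rather than a rate problem.

On the server side, I have not verified these against your OpenLDAP version, so treat them purely as a starting point rather than a recommendation, but the slapd.conf knobs I would look at first are the thread and pending-connection limits, along the lines of:

    # slapd.conf (global section) - illustrative values only
    threads                 32
    conn_max_pending        100
    conn_max_pending_auth   1000
    idletimeout             30

The other hard ceiling worth checking is the file descriptor limit the slapd process itself runs under (the FD_SETSIZE issue you mention below), since every open authentication connection consumes a descriptor.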
Again, appreciate the patience.

Ralph


On 2/7/07 10:28 AM, "Heywood, Todd" <heyw...@cshl.edu> wrote:

> Hi Ralph,
>
> Unfortunately, adding "-mca pls_rsh_num_concurrent 50" to mpirun (with just -np and -hostfile) has no effect. The number of established connections for slapd grows to the same number at the same rate as without it.
>
> BTW, I upgraded from 1.2b2 to 1.2b3.
>
> Thanks,
> Todd
>
> -----Original Message-----
> From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On Behalf Of Ralph Castain
> Sent: Tuesday, February 06, 2007 6:48 PM
> To: Open MPI Users
> Subject: Re: [OMPI users] large jobs hang on startup (deadlock?)
>
> Hi Todd
>
> Just as a thought - you could try not using --debug-daemons or -d, and instead setting "-mca pls_rsh_num_concurrent 50" or some such small number. This will tell the system to launch 50 ssh calls at a time, waiting for each group to complete before launching the next. You can't use it with --debug-daemons, as that option prevents the ssh calls from "closing" so that you can get the output from the daemons.
>
> You can still launch as big a job as you like - we'll just do it 50 ssh calls at a time. If we are truly overwhelming the slapd, then this should alleviate the problem.
>
> Let me know if you get to try it...
> Ralph
>
> On 2/6/07 4:05 PM, "Heywood, Todd" <heyw...@cshl.edu> wrote:
>
> > Hi Ralph,
> >
> > It looks that way. I created a user local to each node, with local authentication via /etc/passwd and /etc/shadow, and OpenMPI scales up just fine for that.
> >
> > I know this is an OpenMPI list, but does anyone know how common or uncommon LDAP-based clusters are? I would have thought this issue would have arisen elsewhere, but Googling MPI+LDAP (and similar) doesn't turn up much.
> >
> > I'd certainly be willing to test any patch.
> >
> > Thanks,
> > Todd
> >
> > -----Original Message-----
> > From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On Behalf Of Ralph H Castain
> > Sent: Tuesday, February 06, 2007 9:54 AM
> > To: Open MPI Users <us...@open-mpi.org>
> > Subject: Re: [OMPI users] large jobs hang on startup (deadlock?)
> >
> > It sounds to me like we are probably overwhelming your slapd - your test would seem to indicate that slowing down the slapd makes us fail even with smaller jobs, which tends to support that idea. We frankly haven't encountered that before, since our rsh tests have all been done using non-LDAP authentication (basically, we ask that you set up rsh to auto-authenticate on each node).
> >
> > It sounds like we need to add an ability to slow down so that the daemon doesn't "fail" due to authentication timeout and/or slapd rejection due to the queue being full. This may take a little time to fix due to other priorities, and will almost certainly have to be released in a subsequent 1.2.x version. Meantime, I'll let you know when I get something to test - would you be willing to give it a shot if I provide a patch? I don't have access to an LDAP-based system.
> >
> > Ralph
> >
> > On 2/6/07 7:44 AM, "Heywood, Todd" <heyw...@cshl.edu> wrote:
> >
> > > Hi Ralph,
> > >
> > > Thanks for the reply. This is a tough one. It is OpenLDAP.
> > > I had thought that I might be hitting a file descriptor limit for slapd (the LDAP daemon), which ulimit -n does not affect (you have to rebuild LDAP with a different FD_SETSIZE variable). However, I simply turned on more expressive logging to /var/log/slapd, and that resulted in smaller jobs (which successfully ran before) hanging. Go figure.
> > >
> > > It appears that daemons are up and running (from ps), and everything hangs in MPI_Init. Ctrl-C gives:
> > >
> > > [blade1:04524] ERROR: A daemon on node blade26 failed to start as expected.
> > > [blade1:04524] ERROR: There may be more information available from
> > > [blade1:04524] ERROR: the remote shell (see above).
> > > [blade1:04524] ERROR: The daemon exited unexpectedly with status 255.
> > >
> > > I'm interested in any suggestions, semi-fixes, etc. which might help get to the bottom of this. Right now: whether the daemons are indeed up and running, or if there are some that are not (causing MPI_Init to hang).
> > >
> > > Thanks,
> > > Todd
> > >
> > > -----Original Message-----
> > > From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On Behalf Of Ralph H Castain
> > > Sent: Tuesday, February 06, 2007 8:52 AM
> > > To: Open MPI Users <us...@open-mpi.org>
> > > Subject: Re: [OMPI users] large jobs hang on startup (deadlock?)
> > >
> > > Well, I can't say for sure about LDAP. I did a quick search and found two things: 1. there are limits imposed in LDAP that may apply to your situation, and 2. that statement varies tremendously depending upon the specific LDAP implementation you are using. I would suggest you see which LDAP you are using and contact the respective organization to ask if they do have such a limit, and if so, how to adjust it. It sounds like maybe we are hitting the LDAP server with too many requests too rapidly. Usually, the issue is not starting fast enough, so this is a new one!
> > >
> > > We don't currently check to see if everything started up okay, so that is why the processes might hang - we hope to fix that soon. I'll have to see if there is something we can do to help alleviate such problems - it might not be in time for the 1.2 release, but perhaps it will make a subsequent "fix" release, or, if you are willing/interested, I could provide it to you as a "patch" you could use until a later official release.
> > >
> > > Meantime, you might try upgrading to 1.2b3 or even a nightly release from the trunk. There are known problems with 1.2b2 (which is why there is a b3 and soon to be an rc1), though I don't think that will be the problem here. At the least, the nightly trunk has a much better response to ctrl-c in it.
> > >
> > > Ralph
> > >
> > > On 2/5/07 9:50 AM, "Heywood, Todd" <heyw...@cshl.edu> wrote:
> > >
> > > > Hi Ralph,
> > > >
> > > > Thanks for the reply. The OpenMPI version is 1.2b2 (because I would like to integrate it with SGE).
> > > >
> > > > Here is what is happening:
> > > >
> > > > (1) When I run with --debug-daemons (but WITHOUT -d), I get "Daemon [0,0,27] checking in as pid 7620 on host blade28" (for example) messages for most but not all of the daemons that should be started up, and then it hangs. I also notice "reconnecting to LDAP server" messages in various /var/log/secure files, and I cannot log in while things are hung (with "su: pam_ldap: ldap_result Can't contact LDAP server" in /var/log/messages). So apparently LDAP hits some limit to opening ssh sessions, and I'm not sure how to address this.
> > > > (2) When I run with --debug-daemons AND the debug option -d, all daemons start up and check in, albeit slowly (debug must slow things down so LDAP can handle all the requests??). Then, apparently, the cpi process is started for each task, but it then hangs:
> > > >
> > > > [blade1:23816] spawn: in job_state_callback(jobid = 1, state = 0x4)
> > > > [blade1:23816] Info: Setting up debugger process table for applications
> > > >   MPIR_being_debugged = 0
> > > >   MPIR_debug_gate = 0
> > > >   MPIR_debug_state = 1
> > > >   MPIR_acquired_pre_main = 0
> > > >   MPIR_i_am_starter = 0
> > > >   MPIR_proctable_size = 800
> > > >   MPIR_proctable:
> > > >     (i, host, exe, pid) = (0, blade1, /home4/itstaff/heywood/ompi/cpi, 24193)
> > > >     ...
> > > >     (i, host, exe, pid) = (799, blade213, /home4/itstaff/heywood/ompi/cpi, 4762)
> > > >
> > > > A "ps" on the head node shows 200 open ssh sessions, and 4 cpi processes doing nothing. A ^C gives this:
> > > >
> > > > mpirun: killing job...
> > > > --------------------------------------------------------------------------
> > > > WARNING: A process refused to die!
> > > > Host: blade1
> > > > PID: 24193
> > > > This process may still be running and/or consuming resources.
> > > >
> > > > Still got a ways to go, but any ideas/suggestions are welcome!
> > > >
> > > > Thanks,
> > > > Todd
> > > >
> > > > From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On Behalf Of Ralph Castain
> > > > Sent: Friday, February 02, 2007 5:20 PM
> > > > To: Open MPI Users
> > > > Subject: Re: [OMPI users] large jobs hang on startup (deadlock?)
> > > >
> > > > Hi Todd
> > > >
> > > > To help us provide advice, could you tell us what version of OpenMPI you are using?
> > > >
> > > > Meantime, try adding "-mca pls_rsh_num_concurrent 200" to your mpirun command line. You can up the number of concurrent daemons we launch to anything your system will support - basically, we limit the number only because some systems have limits on the number of ssh calls we can have active at any one time. Because we hold stdio open when running with --debug-daemons, the number of concurrent daemons must match or exceed the number of nodes you are trying to launch on.
> > > >
> > > > I have a "fix" in the works that will help relieve some of that restriction, but that won't come out until a later release.
> > > >
> > > > Hopefully, that will allow you to obtain more debug info about why/where things are hanging.
> > > >
> > > > Ralph
> > > >
> > > > On 2/2/07 11:41 AM, "Heywood, Todd" <heyw...@cshl.edu> wrote:
> > > >
> > > > > I have OpenMPI running fine for a small/medium number of tasks (simple hello or cpi program). But when I try 700 or 800 tasks, it hangs, apparently on startup. I think this might be related to LDAP, since if I try to log into my account while the job is hung, I get told my username doesn't exist. However, I tried adding debug to the mpirun, and got the same sequence of output as for successful smaller runs, until it hung again. So I added -debug-daemons and got this (with an exit, i.e. no hanging):
> > > > >
> > > > > ...
> > > > > [blade1:31733] [0,0,0] wrote setup file
> > > > > --------------------------------------------------------------------------
> > > > > The rsh launcher has been given a number of 128 concurrent daemons to
> > > > > launch and is in a debug-daemons option. However, the total number of
> > > > > daemons to launch (200) is greater than this value. This is a scenario
> > > > > that will cause the system to deadlock.
> > > > > To avoid deadlock, either increase the number of concurrent daemons, or
> > > > > remove the debug-daemons flag.
> > > > > --------------------------------------------------------------------------
> > > > > [blade1:31733] [0,0,0] ORTE_ERROR_LOG: Fatal in file
> > > > > ../../../../../orte/mca/rmgr/urm/rmgr_urm.c at line 455
> > > > > [blade1:31733] mpirun: spawn failed with errno=-6
> > > > > [blade1:31733] sess_dir_finalize: proc session dir not empty - leaving
> > > > >
> > > > > Any ideas or suggestions appreciated.
> > > > >
> > > > > Todd Heywood

_______________________________________________
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users