[OMPI users] problem with data transfer in a heterogeneous environment
Hi,

some weeks ago I reported a problem with my matrix multiplication program in a heterogeneous environment (little-endian and big-endian machines). The problem occurs in openmpi-1.6.x, openmpi-1.7, and openmpi-1.9. I have now implemented a small program that only scatters the columns of an integer matrix, so that it is easier to see what goes wrong. I configured for a heterogeneous environment; adding "-hetero-nodes" and/or "-hetero-apps" on the command line doesn't change much, as you can see at the end of this email. Everything works fine if I use only little-endian or only big-endian machines. Is it possible to fix the problem? Do you know in which file(s) I would have to look, or which debug switches would provide more information to help solve it?

I used the following command to configure the package on my "Solaris 10 Sparc" system (the commands for my other systems are similar). Next time I will also add "-without-sctp" to get rid of the failures on my Linux machines (openSUSE 12.1).

../openmpi-1.9a1r27668/configure --prefix=/usr/local/openmpi-1.9_64_cc \
  --libdir=/usr/local/openmpi-1.9_64_cc/lib64 \
  --with-jdk-bindir=/usr/local/jdk1.7.0_07/bin/sparcv9 \
  --with-jdk-headers=/usr/local/jdk1.7.0_07/include \
  JAVA_HOME=/usr/local/jdk1.7.0_07 \
  LDFLAGS="-m64" \
  CC="cc" CXX="CC" FC="f95" \
  CFLAGS="-m64" CXXFLAGS="-m64 -library=stlport4" FCFLAGS="-m64" \
  CPP="cpp" CXXCPP="cpp" \
  CPPFLAGS="" CXXCPPFLAGS="" \
  C_INCL_PATH="" C_INCLUDE_PATH="" CPLUS_INCLUDE_PATH="" \
  OBJC_INCLUDE_PATH="" OPENMPI_HOME="" \
  --enable-cxx-exceptions \
  --enable-mpi-java \
  --enable-heterogeneous \
  --enable-opal-multi-threads \
  --enable-mpi-thread-multiple \
  --with-threads=posix \
  --with-hwloc=internal \
  --without-verbs \
  --without-udapl \
  --with-wrapper-cflags=-m64 \
  --enable-debug \
  |& tee log.configure.$SYSTEM_ENV.$MACHINE_ENV.64_cc


tyr small_prog 501 ompi_info | grep -e Ident -e Hetero -e "Built on"
           Ident string: 1.9a1r27668
               Built on: Wed Dec 12 09:00:13 CET 2012
  Heterogeneous support: yes
tyr small_prog 502


tyr small_prog 488 mpiexec -np 6 -host sunpc0,rs0 column_int

matrix:

0x12345678 0x12345678 0x12345678 0x12345678 0x12345678 0x12345678
0x12345678 0x12345678 0x12345678 0x12345678 0x12345678 0x12345678
0x12345678 0x12345678 0x12345678 0x12345678 0x12345678 0x12345678
0x12345678 0x12345678 0x12345678 0x12345678 0x12345678 0x12345678


Column of process 1:
0x12345678 0x12345678 0x12345678 0x12345678

Column of process 2:
0x12345678 0x12345678 0x12345678 0x12345678

Column of process 3:
0x5678 0x1234 0x5678 0x1234ce71

Column of process 4:
0x5678 0x1234 0x5678 0x1234ce71

Column of process 0:
0x12345678 0x12345678 0x12345678 0x12345678

Column of process 5:
0x5678 0x1234 0x5678 0x1234ce71
tyr small_prog 489


tyr small_prog 489 mpiexec -np 6 -host rs0,sunpc0 column_int

matrix:

0x12345678 0x12345678 0x12345678 0x12345678 0x12345678 0x12345678
0x12345678 0x12345678 0x12345678 0x12345678 0x12345678 0x12345678
0x12345678 0x12345678 0x12345678 0x12345678 0x12345678 0x12345678
0x12345678 0x12345678 0x12345678 0x12345678 0x12345678 0x12345678


Column of process 1:

Column of process 2:
0x12345678 0x12345678 0x12345678 0x12345678
0x12345678 0x12345678 0x12345678 0x12345678

Column of process 3:
0xffdf1234 0x5678 0x401234 0x5678

Column of process 4:
0xffdf1234 0x5678 0x401234 0x5678

Column of process 0:
0x12345678 0x12345678 0x12345678 0x12345678

Column of process 5:
0xffdf1234 0x5678 0x401234 0x5678
tyr small_prog 490


tyr small_prog 491 mpiexec -np 6 -mca btl ^sctp -host rs0,linpc0 column_int

matrix:

0x12345678 0x12345678 0x12345678 0x12345678 0x12345678 0x12345678
0x12345678 0x12345678 0x12345678 0x12345678 0x12345678 0x12345678
0x12345678 0x12345678 0x12345678 0x12345678 0x12345678 0x12345678
0x12345678 0x12345678 0x12345678 0x12345678 0x12345678 0x12345678


Column of process 1:

Column of process 2:
0x12345678 0x12345678 0x12345678 0x12345678
0x12345678 0x12345678 0x12345678 0x12345678

Column of process 3:
0x1234 0x5678 0xf71c1234 0x5678

Column of process 4:
0x1234 0x5678 0xc6011234 0x5678

Column of process 0:
0x12345678 0x12345678 0x12345678 0x12345678

Column of process 5:
0x1234 0x5678 0x426f1234 0x5678
tyr small_prog 492


tyr small_prog 492 mpiexec -np 6 -mca btl ^sctp -host linpc0,rs0 column_int

matrix:

0x12345678 0x12345678 0x12345678 0x12345678 0x12345678 0x12345678
0x12345678 0x12345678 0x12345678 0x12345678 0x12345678 0x12345678
0x12345678 0x12345678 0x12345678 0x12345678 0x12345678 0x12345678
0x12345678 0x12345678 0x12345678 0x12345678 0x12345678 0x12345678


Column of process 2:
0x12345678 0x12345678 0x12345678 0x12345678

Column
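[Editor's note: the test program itself is not attached to this digest. Purely as an illustration of the kind of column scatter being described, here is a minimal C sketch; it is not Siegmar's column_int, and the derived-datatype approach and the names column_scatter.c, col_vec, and column_t are assumptions, not taken from the original post.]

/* column_scatter.c - minimal sketch (not the original column_int):
 * scatter one column of a ROWS x COLS int matrix to each of COLS ranks. */
#include <stdio.h>
#include <mpi.h>

#define ROWS 4
#define COLS 6

int main(int argc, char **argv)
{
    int rank, matrix[ROWS][COLS], column[ROWS];
    MPI_Datatype col_vec, column_t;          /* illustrative names */

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0)                           /* test pattern from the post */
        for (int i = 0; i < ROWS; ++i)
            for (int j = 0; j < COLS; ++j)
                matrix[i][j] = 0x12345678;

    /* one column of a row-major matrix: ROWS blocks of 1 int, stride COLS;
     * resize the extent so consecutive columns start one int apart */
    MPI_Type_vector(ROWS, 1, COLS, MPI_INT, &col_vec);
    MPI_Type_create_resized(col_vec, 0, sizeof(int), &column_t);
    MPI_Type_commit(&column_t);

    MPI_Scatter(matrix, 1, column_t, column, ROWS, MPI_INT, 0, MPI_COMM_WORLD);

    printf("Column of process %d: 0x%x 0x%x 0x%x 0x%x\n",
           rank, column[0], column[1], column[2], column[3]);

    MPI_Type_free(&column_t);
    MPI_Type_free(&col_vec);
    MPI_Finalize();
    return 0;
}

Run with one rank per column (mpiexec -np 6 ... column_scatter) so the scatter distributes exactly one column to each process; whether it reproduces the endian corruption shown above depends on the heterogeneous support under test.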
[OMPI users] questions to some open problems
Hi,

some weeks ago (mainly in the beginning of October) I reported several problems, and I would be grateful if you could tell me whether, and roughly when, somebody will try to solve them.

1) I don't get the expected results when I try to send or scatter the columns of a matrix in Java. The received column values have nothing to do with the original values if I use a homogeneous environment, and the program breaks with "An error occurred in MPI_Comm_dup" and "MPI_ERR_INTERN: internal error" if I use a heterogeneous environment. I would like to use the Java API.

2) I don't get the expected result when I try to scatter an object in Java.
   https://svn.open-mpi.org/trac/ompi/ticket/3351

3) I still get only a message that all nodes are already filled up when I use a "rankfile", and nothing else happens. I would like to use a rankfile. You filed a bug fix for it.

4) I would like "-cpus-per-proc", "-npersocket", etc. to apply to every set of machines/applications, and not globally to all machines/applications, when I specify several colon-separated sets of machines or applications on the command line. You told me that it could be done.

5) By the way, it seems that the option "-cpus-per-proc" is no longer supported in openmpi-1.7 and openmpi-1.9. How can I bind a multi-threaded process to more than one core in these versions?

I can provide my small programs once more if you need them. Thank you very much for any answer in advance.

Kind regards

Siegmar
Re: [OMPI users] problem with data transfer in a heterogeneous environment
Disturbing, but I don't know if/when someone will address it. The problem really is that few, if any, of the developers have access to hetero systems. So developing and testing hetero support is difficult to impossible.

I'll file a ticket about it and direct it to the attention of the person who developed the datatype support - he might be able to look at it, or at least provide some direction.

On Dec 14, 2012, at 5:52 AM, Siegmar Gross wrote:

> Hi,
>
> some weeks ago I reported a problem with my matrix multiplication
> program in a heterogeneous environment (little endian and big endian
> machines). The problem occurs in openmpi-1.6.x, openmpi-1.7, and
> openmpi-1.9. Now I implemented a small program which only scatters
> the columns of an integer matrix so that it is easier to see what
> goes wrong. I configured for a heterogeneous environment. Adding
> "-hetero-nodes" and/or "-hetero-apps" on the command line doesn't
> change much as you can see at the end of this email. Everything
> works fine, if I use only little endian or only big endian machines.
> Is it possible to fix the problem or do you know in which file(s)
> I would have to look to find the problem or do you know debug
> switches which would provide more information to solve the problem?
> I used the following command to configure the package on my "Solaris
> 10 Sparc" system (the commands for my other systems are similar).
> Next time I will also add "-without-sctp" to get rid of the failures
> on my Linux machines (Open SuSE 12.1).
>
> ../openmpi-1.9a1r27668/configure --prefix=/usr/local/openmpi-1.9_64_cc \
>   --libdir=/usr/local/openmpi-1.9_64_cc/lib64 \
>   --with-jdk-bindir=/usr/local/jdk1.7.0_07/bin/sparcv9 \
>   --with-jdk-headers=/usr/local/jdk1.7.0_07/include \
>   JAVA_HOME=/usr/local/jdk1.7.0_07 \
>   LDFLAGS="-m64" \
>   CC="cc" CXX="CC" FC="f95" \
>   CFLAGS="-m64" CXXFLAGS="-m64 -library=stlport4" FCFLAGS="-m64" \
>   CPP="cpp" CXXCPP="cpp" \
>   CPPFLAGS="" CXXCPPFLAGS="" \
>   C_INCL_PATH="" C_INCLUDE_PATH="" CPLUS_INCLUDE_PATH="" \
>   OBJC_INCLUDE_PATH="" OPENMPI_HOME="" \
>   --enable-cxx-exceptions \
>   --enable-mpi-java \
>   --enable-heterogeneous \
>   --enable-opal-multi-threads \
>   --enable-mpi-thread-multiple \
>   --with-threads=posix \
>   --with-hwloc=internal \
>   --without-verbs \
>   --without-udapl \
>   --with-wrapper-cflags=-m64 \
>   --enable-debug \
>   |& tee log.configure.$SYSTEM_ENV.$MACHINE_ENV.64_cc
>
>
> tyr small_prog 501 ompi_info | grep -e Ident -e Hetero -e "Built on"
>            Ident string: 1.9a1r27668
>                Built on: Wed Dec 12 09:00:13 CET 2012
>   Heterogeneous support: yes
> tyr small_prog 502
>
>
> tyr small_prog 488 mpiexec -np 6 -host sunpc0,rs0 column_int
>
> matrix:
>
> 0x12345678 0x12345678 0x12345678 0x12345678 0x12345678 0x12345678
> 0x12345678 0x12345678 0x12345678 0x12345678 0x12345678 0x12345678
> 0x12345678 0x12345678 0x12345678 0x12345678 0x12345678 0x12345678
> 0x12345678 0x12345678 0x12345678 0x12345678 0x12345678 0x12345678
>
>
> Column of process 1:
> 0x12345678 0x12345678 0x12345678 0x12345678
>
> Column of process 2:
> 0x12345678 0x12345678 0x12345678 0x12345678
>
> Column of process 3:
> 0x5678 0x1234 0x5678 0x1234ce71
>
> Column of process 4:
> 0x5678 0x1234 0x5678 0x1234ce71
>
> Column of process 0:
> 0x12345678 0x12345678 0x12345678 0x12345678
>
> Column of process 5:
> 0x5678 0x1234 0x5678 0x1234ce71
> tyr small_prog 489
>
>
> tyr small_prog 489 mpiexec -np 6 -host rs0,sunpc0 column_int
>
> matrix:
>
> 0x12345678 0x12345678 0x12345678 0x12345678 0x12345678 0x12345678
> 0x12345678 0x12345678 0x12345678 0x12345678 0x12345678 0x12345678
> 0x12345678 0x12345678 0x12345678 0x12345678 0x12345678 0x12345678
> 0x12345678 0x12345678 0x12345678 0x12345678 0x12345678 0x12345678
>
>
> Column of process 1:
>
> Column of process 2:
> 0x12345678 0x12345678 0x12345678 0x12345678
> 0x12345678 0x12345678 0x12345678 0x12345678
>
> Column of process 3:
> 0xffdf1234 0x5678 0x401234 0x5678
>
> Column of process 4:
> 0xffdf1234 0x5678 0x401234 0x5678
>
> Column of process 0:
> 0x12345678 0x12345678 0x12345678 0x12345678
>
> Column of process 5:
> 0xffdf1234 0x5678 0x401234 0x5678
> tyr small_prog 490
>
>
> tyr small_prog 491 mpiexec -np 6 -mca btl ^sctp -host rs0,linpc0 column_int
>
> matrix:
>
> 0x12345678 0x12345678 0x12345678 0x12345678 0x12345678 0x12345678
> 0x12345678 0x12345678 0x12345678 0x12345678 0x12345678 0x12345678
> 0x12345678 0x12345678 0x12345678 0x12345678 0x12345678 0x12345678
> 0x12345678 0x12345678 0x12345678 0x12345678 0x12345678 0x12345678
>
>
> Column of process 1:
>
> Column of process 2:
> 0x12345678 0x12345678 0x12345678 0x12345678
> 0x12345678 0x12345678 0x12345678 0x12345678
>
> Column of process 3:
> 0x123
Re: [OMPI users] questions to some open problems
Hi Siegmar

On Dec 14, 2012, at 5:54 AM, Siegmar Gross wrote:

> Hi,
>
> some weeks ago (mainly in the beginning of October) I reported
> several problems and I would be grateful if you can tell me if
> and probably when somebody will try to solve them.
>
> 1) I don't get the expected results, when I try to send or scatter
> the columns of a matrix in Java. The received column values have
> nothing to do with the original values, if I use a homogeneous
> environment and the program breaks with "An error occurred in
> MPI_Comm_dup" and "MPI_ERR_INTERN: internal error", if I use
> a heterogeneous environment. I would like to use the Java API.
>
> 2) I don't get the expected result, when I try to scatter an object
> in Java.
> https://svn.open-mpi.org/trac/ompi/ticket/3351

Nothing has happened on these yet.

> 3) I still get only a message that all nodes are already filled up
> when I use a "rankfile" and nothing else happens. I would like
> to use a rankfile. You filed a bug fix for it.

I believe rankfile was fixed, at least on the trunk - not sure if it was moved to 1.7. I assume that's the release you are talking about?

> 4) I would like to have "-cpus-per-proc", "-npersocket", etc for
> every set of machines/applications and not globally for all
> machines/applications if I specify several colon-separated sets
> of machines or applications on the command line. You told me that
> it could be done.
>
> 5) By the way, it seems that the option "-cpus-per-proc" isn't any
> longer supported in openmpi-1.7 and openmpi-1.9. How can I bind a
> multi-threaded process to more than one core in these versions?

I'm afraid I haven't gotten around to working on cpus-per-proc, though I believe npersocket was fixed.

> I can provide my small programs once more if you need them. Thank
> you very much for any answer in advance.
>
> Kind regards
>
> Siegmar
>
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
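[Editor's note: for readers following the rankfile discussion, a minimal rankfile sketch in the rank/slot syntax documented in the mpirun man page of that era is shown below. The hosts node0/node1 and the socket:core numbers are hypothetical and must be adapted to the actual hardware.]

# my_rankfile - illustrative only
rank 0=node0 slot=0:0
rank 1=node0 slot=0:1-2
rank 2=node1 slot=1:0
rank 3=node1 slot=1:1-2

# launched with something like:
mpiexec -np 4 -rf my_rankfile ./a.out

Giving a rank a core range (e.g. slot=0:1-2) binds that rank to several cores, which may also serve as a workaround for binding a multi-threaded process to more than one core while "-cpus-per-proc" is unavailable.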
[OMPI users] Problems with shared libraries while launching jobs
I am having a weird problem launching cases with OpenMPI 1.4.3. It is most likely a problem with a particular node of our cluster, as the jobs will run fine on some submissions but not others; it seems to depend on the node list. I am just having trouble diagnosing which node it is, and what the nature of its problem is.

One or perhaps more of the orted are indicating they cannot find an Intel math library. The error is:

/release/cfd/openmpi-intel/bin/orted: error while loading shared libraries: libimf.so: cannot open shared object file: No such file or directory

I've checked the environment just before launching mpirun, and LD_LIBRARY_PATH includes the necessary component to point to where the Intel shared libraries are located. Furthermore, my mpirun command line says to export the LD_LIBRARY_PATH variable:

Executing ['/release/cfd/openmpi-intel/bin/mpirun', '--machinefile /var/spool/PBS/aux/20761.maruhpc4-mgt', '-np 160', '-x LD_LIBRARY_PATH', '-x MPI_ENVIRONMENT=1', '/tmp/fv420761.maruhpc4-mgt/falconv4_openmpi_jsgl', '-v', '-cycles', '1', '-ri', 'restart.1', '-ro', '/tmp/fv420761.maruhpc4-mgt/restart.1']

My shell-initialization script (.bashrc) does not overwrite LD_LIBRARY_PATH. OpenMPI is built explicitly --without-torque and should be using ssh to launch the orted.

What options can I add to get more debugging of problems launching orted?

Thanks,

Ed
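[Editor's note: one hedged way to narrow down which node cannot resolve the library, assuming passwordless ssh to the compute nodes and that the PBS machinefile lists one host per line, is to query each host's non-interactive shell directly, since that is the environment an ssh-launched orted actually sees.]

# Hypothetical spot check - not from the original post.
for h in $(sort -u /var/spool/PBS/aux/20761.maruhpc4-mgt); do
    echo "== $h =="
    ssh "$h" 'echo LD_LIBRARY_PATH=$LD_LIBRARY_PATH; ldd /release/cfd/openmpi-intel/bin/orted | grep libimf'
done

A host where the grep reports "not found" (or where LD_LIBRARY_PATH comes back empty) would be the likely culprit.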
Re: [OMPI users] Problems with shared libraries while launching jobs
Add -mca plm_base_verbose 5 --leave-session-attached to the cmd line - that will show the ssh command being used to start each orted.

On Dec 14, 2012, at 12:17 PM, "Blosch, Edwin L" wrote:

> I am having a weird problem launching cases with OpenMPI 1.4.3. It is most
> likely a problem with a particular node of our cluster, as the jobs will run
> fine on some submissions, but not other submissions. It seems to depend on
> the node list. I just am having trouble diagnosing which node, and what is
> the nature of the problem it has.
>
> One or perhaps more of the orted are indicating they cannot find an Intel
> Math library. The error is:
> /release/cfd/openmpi-intel/bin/orted: error while loading shared libraries:
> libimf.so: cannot open shared object file: No such file or directory
>
> I've checked the environment just before launching mpirun, and
> LD_LIBRARY_PATH includes the necessary component to point to where the Intel
> shared libraries are located. Furthermore, my mpirun command line says to
> export the LD_LIBRARY_PATH variable:
> Executing ['/release/cfd/openmpi-intel/bin/mpirun', '--machinefile
> /var/spool/PBS/aux/20761.maruhpc4-mgt', '-np 160', '-x LD_LIBRARY_PATH', '-x
> MPI_ENVIRONMENT=1', '/tmp/fv420761.maruhpc4-mgt/falconv4_openmpi_jsgl', '-v',
> '-cycles', '1', '-ri', 'restart.1', '-ro',
> '/tmp/fv420761.maruhpc4-mgt/restart.1']
>
> My shell-initialization script (.bashrc) does not overwrite LD_LIBRARY_PATH.
> OpenMPI is built explicitly --without-torque and should be using ssh to
> launch the orted.
>
> What options can I add to get more debugging of problems launching orted?
>
> Thanks,
>
> Ed
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
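[Editor's note: to make the suggestion concrete, the two debug options are simply appended to the existing launch. The line below is a hedged recombination of the command Ed posted, written as a single shell command rather than the Python argument list shown above.]

/release/cfd/openmpi-intel/bin/mpirun --machinefile /var/spool/PBS/aux/20761.maruhpc4-mgt \
    -np 160 -x LD_LIBRARY_PATH -x MPI_ENVIRONMENT=1 \
    -mca plm_base_verbose 5 --leave-session-attached \
    /tmp/fv420761.maruhpc4-mgt/falconv4_openmpi_jsgl -v -cycles 1 -ri restart.1 \
    -ro /tmp/fv420761.maruhpc4-mgt/restart.1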
[OMPI users] mpi problems/many cpus per node
I have had to cobble together two machines in our rocks cluster without using the standard installation; they have EFI-only BIOS on them and rocks doesn't like that, so this is the only workaround.

Everything works great now, except for one thing. MPI jobs (openmpi or mpich) fail when started from one of these nodes (via qsub or by logging in and running the command) if 24 or more processors are needed on another system. However, if the originator of the MPI job is the headnode or any of the preexisting compute nodes, it works fine. Right now I am guessing ssh client or ulimit problems, but I cannot find any difference. Any help would be greatly appreciated.

compute-2-1 and compute-2-0 are the new nodes

Examples:

This works, prints 23 hostnames from each machine:
[root@compute-2-1 ~]# /home/apps/openmpi-1.6.3/bin/mpirun -host compute-2-0,compute-2-1 -np 46 hostname

This does not work, prints 24 hostnames for compute-2-1:
[root@compute-2-1 ~]# /home/apps/openmpi-1.6.3/bin/mpirun -host compute-2-0,compute-2-1 -np 48 hostname

These both work, print 64 hostnames from each node:
[root@biocluster ~]# /home/apps/openmpi-1.6.3/bin/mpirun -host compute-2-0,compute-2-1 -np 128 hostname
[root@compute-0-2 ~]# /home/apps/openmpi-1.6.3/bin/mpirun -host compute-2-0,compute-2-1 -np 128 hostname

[root@compute-2-1 ~]# ulimit -a
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 16410016
max locked memory       (kbytes, -l) unlimited
max memory size         (kbytes, -m) unlimited
open files                      (-n) 4096
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) unlimited
cpu time               (seconds, -t) unlimited
max user processes              (-u) 1024
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited

[root@compute-2-1 ~]# more /etc/ssh/ssh_config
Host *
        CheckHostIP             no
        ForwardX11              yes
        ForwardAgent            yes
        StrictHostKeyChecking   no
        UsePrivilegedPort       no
        Protocol                2,1
Re: [OMPI users] mpi problems/many cpus per node
It wouldn't be ssh - in both cases, only one ssh is being done to each node (to start the local daemon). The only difference is the number of fork/exec's being done on each node, and the number of file descriptors being opened to support those fork/exec's. It certainly looks like your limits are high enough.

When you say it "fails", what do you mean - what error does it report? Try adding:

--leave-session-attached -mca odls_base_verbose 5

to your cmd line - this will report all the local proc launch debug and hopefully show you a more detailed error report.

On Dec 14, 2012, at 12:29 PM, Daniel Davidson wrote:

> I have had to cobble together two machines in our rocks cluster without using
> the standard installation, they have efi only bios on them and rocks doesnt
> like that, so it is the only workaround.
>
> Everything works great now, except for one thing. MPI jobs (openmpi or
> mpich) fail when started from one of these nodes (via qsub or by logging in
> and running the command) if 24 or more processors are needed on another
> system. However if the originator of the MPI job is the headnode or any of
> the preexisting compute nodes, it works fine. Right now I am guessing ssh
> client or ulimit problems, but I cannot find any difference. Any help would
> be greatly appreciated.
>
> compute-2-1 and compute-2-0 are the new nodes
>
> Examples:
>
> This works, prints 23 hostnames from each machine:
> [root@compute-2-1 ~]# /home/apps/openmpi-1.6.3/bin/mpirun -host
> compute-2-0,compute-2-1 -np 46 hostname
>
> This does not work, prints 24 hostnames for compute-2-1
> [root@compute-2-1 ~]# /home/apps/openmpi-1.6.3/bin/mpirun -host
> compute-2-0,compute-2-1 -np 48 hostname
>
> These both work, print 64 hostnames from each node
> [root@biocluster ~]# /home/apps/openmpi-1.6.3/bin/mpirun -host
> compute-2-0,compute-2-1 -np 128 hostname
> [root@compute-0-2 ~]# /home/apps/openmpi-1.6.3/bin/mpirun -host
> compute-2-0,compute-2-1 -np 128 hostname
>
> [root@compute-2-1 ~]# ulimit -a
> core file size          (blocks, -c) 0
> data seg size           (kbytes, -d) unlimited
> scheduling priority             (-e) 0
> file size               (blocks, -f) unlimited
> pending signals                 (-i) 16410016
> max locked memory       (kbytes, -l) unlimited
> max memory size         (kbytes, -m) unlimited
> open files                      (-n) 4096
> pipe size            (512 bytes, -p) 8
> POSIX message queues     (bytes, -q) 819200
> real-time priority              (-r) 0
> stack size              (kbytes, -s) unlimited
> cpu time               (seconds, -t) unlimited
> max user processes              (-u) 1024
> virtual memory          (kbytes, -v) unlimited
> file locks                      (-x) unlimited
>
> [root@compute-2-1 ~]# more /etc/ssh/ssh_config
> Host *
>         CheckHostIP             no
>         ForwardX11              yes
>         ForwardAgent            yes
>         StrictHostKeyChecking   no
>         UsePrivilegedPort       no
>         Protocol                2,1
>
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
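[Editor's note: if the per-user limits on the new nodes do turn out to be the issue, a hedged sketch of checking and raising them is shown below. The values are illustrative only, not a recommendation, and the limits.d file name is an assumption about a typical RHEL/CentOS-based Rocks install.]

# Check the current soft limits on the node:
ulimit -Sn   # open files
ulimit -Su   # max user processes

# Raise them persistently, e.g. in /etc/security/limits.conf on
# compute-2-0 and compute-2-1 (example values):
#   *    soft    nproc     16384
#   *    hard    nproc     16384
#   *    soft    nofile    16384
#   *    hard    nofile    16384
#
# On CentOS-based nodes, /etc/security/limits.d/90-nproc.conf (if present)
# can silently cap nproc at 1024 and would need the same change.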
[OMPI users] Possible memory error
Folks,

I'm trying to track down an instance of openMPI writing to a freed block of memory. This occurs with the most recent release (1.6.3) as well as 1.6, on a 64-bit Intel architecture, Fedora 14. It occurs with a very simple reduction (allreduce minimum) over a single int value. Has anyone had any recent problems like this? It may be showing up as an intermittent error (i.e. there's no problem as long as the allocated block hasn't been re-allocated, which depends upon malloc), so you may not know about it unless you've been debugging malloc with valgrind or dmalloc or the like.

I'm wondering if the openMPI developers use power tools such as valgrind / dmalloc / etc. on the releases to try to catch these things via exhaustive testing - but I understand memory problems in C are of the nature that a mistake made anywhere can propagate, so I haven't ruled out problems in our own code.

Also, I'm wondering if anyone has suggestions on how to track this down further. I'm using Allinea DDT and its builtin dmalloc, which catches the error; it appears in the second memcpy in opal_convertor_pack(), but I don't have more details than that at the moment. All I know so far is that one of those values has been freed. Obviously, I haven't seen anything in earlier parts of the code which might have triggered memory corruption, although both openMPI and Intel IPP do things with uninitialized values before this (according to Valgrind).

Steve H.
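[Editor's note: one common way to dig further, sketched here rather than taken from Steve's setup, is to run each rank under valgrind via mpirun, optionally with the suppression file Open MPI installs to hide its known benign reports. The install prefix and test program name below are assumptions.]

# Hypothetical paths - adjust the prefix and the test program.
mpirun -np 2 valgrind --track-origins=yes --leak-check=full \
       --suppressions=/usr/local/openmpi-1.6.3/share/openmpi/openmpi-valgrind.supp \
       ./allreduce_min_test

Running the smallest reproducer (a bare MPI_Allreduce of one int) this way helps separate a genuine use-after-free inside the library from corruption carried in from the application or from IPP.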
Re: [OMPI users] mpi problems/many cpus per node
Oddly enough, adding this debugging info lowered the number of processes that can be used down to 42 from 46. When I run the MPI job, it fails, giving only the information that follows:

[root@compute-2-1 ssh]# /home/apps/openmpi-1.6.3/bin/mpirun -host compute-2-0,compute-2-1 -v -np 44 --leave-session-attached -mca odls_base_verbose 5 hostname
[compute-2-1.local:44374] mca:base:select:( odls) Querying component [default]
[compute-2-1.local:44374] mca:base:select:( odls) Query of component [default] set priority to 1
[compute-2-1.local:44374] mca:base:select:( odls) Selected component [default]
[compute-2-0.local:28950] mca:base:select:( odls) Querying component [default]
[compute-2-0.local:28950] mca:base:select:( odls) Query of component [default] set priority to 1
[compute-2-0.local:28950] mca:base:select:( odls) Selected component [default]
compute-2-1.local
compute-2-1.local
compute-2-1.local
compute-2-1.local
compute-2-1.local
compute-2-1.local
compute-2-1.local
compute-2-1.local
compute-2-1.local
compute-2-1.local
compute-2-1.local
compute-2-1.local
compute-2-1.local
compute-2-1.local
compute-2-1.local
compute-2-1.local
compute-2-1.local
compute-2-1.local
compute-2-1.local
compute-2-1.local
compute-2-1.local
compute-2-1.local

On 12/14/2012 03:18 PM, Ralph Castain wrote:

It wouldn't be ssh - in both cases, only one ssh is being done to each node (to start the local daemon). The only difference is the number of fork/exec's being done on each node, and the number of file descriptors being opened to support those fork/exec's.

It certainly looks like your limits are high enough. When you say it "fails", what do you mean - what error does it report? Try adding:

--leave-session-attached -mca odls_base_verbose 5

to your cmd line - this will report all the local proc launch debug and hopefully show you a more detailed error report.

On Dec 14, 2012, at 12:29 PM, Daniel Davidson wrote:

I have had to cobble together two machines in our rocks cluster without using the standard installation, they have efi only bios on them and rocks doesnt like that, so it is the only workaround.

Everything works great now, except for one thing. MPI jobs (openmpi or mpich) fail when started from one of these nodes (via qsub or by logging in and running the command) if 24 or more processors are needed on another system. However if the originator of the MPI job is the headnode or any of the preexisting compute nodes, it works fine. Right now I am guessing ssh client or ulimit problems, but I cannot find any difference. Any help would be greatly appreciated.

compute-2-1 and compute-2-0 are the new nodes

Examples:

This works, prints 23 hostnames from each machine:
[root@compute-2-1 ~]# /home/apps/openmpi-1.6.3/bin/mpirun -host compute-2-0,compute-2-1 -np 46 hostname

This does not work, prints 24 hostnames for compute-2-1
[root@compute-2-1 ~]# /home/apps/openmpi-1.6.3/bin/mpirun -host compute-2-0,compute-2-1 -np 48 hostname

These both work, print 64 hostnames from each node
[root@biocluster ~]# /home/apps/openmpi-1.6.3/bin/mpirun -host compute-2-0,compute-2-1 -np 128 hostname
[root@compute-0-2 ~]# /home/apps/openmpi-1.6.3/bin/mpirun -host compute-2-0,compute-2-1 -np 128 hostname

[root@compute-2-1 ~]# ulimit -a
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 16410016
max locked memory       (kbytes, -l) unlimited
max memory size         (kbytes, -m) unlimited
open files                      (-n) 4096
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) unlimited
cpu time               (seconds, -t) unlimited
max user processes              (-u) 1024
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited

[root@compute-2-1 ~]# more /etc/ssh/ssh_config
Host *
        CheckHostIP             no
        ForwardX11              yes
        ForwardAgent            yes
        StrictHostKeyChecking   no
        UsePrivilegedPort       no
        Protocol                2,1

___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users

___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users
Re: [OMPI users] mpi problems/many cpus per node
Sorry - I forgot that you built from a tarball, and so debug isn't enabled by default. You need to configure --enable-debug.

On Dec 14, 2012, at 1:52 PM, Daniel Davidson wrote:

> Oddly enough, adding this debugging info, lowered the number of processes
> that can be used down to 42 from 46. When I run the MPI, it fails giving
> only the information that follows:
>
> [root@compute-2-1 ssh]# /home/apps/openmpi-1.6.3/bin/mpirun -host
> compute-2-0,compute-2-1 -v -np 44 --leave-session-attached -mca
> odls_base_verbose 5 hostname
> [compute-2-1.local:44374] mca:base:select:( odls) Querying component [default]
> [compute-2-1.local:44374] mca:base:select:( odls) Query of component
> [default] set priority to 1
> [compute-2-1.local:44374] mca:base:select:( odls) Selected component [default]
> [compute-2-0.local:28950] mca:base:select:( odls) Querying component [default]
> [compute-2-0.local:28950] mca:base:select:( odls) Query of component
> [default] set priority to 1
> [compute-2-0.local:28950] mca:base:select:( odls) Selected component [default]
> compute-2-1.local
> compute-2-1.local
> compute-2-1.local
> compute-2-1.local
> compute-2-1.local
> compute-2-1.local
> compute-2-1.local
> compute-2-1.local
> compute-2-1.local
> compute-2-1.local
> compute-2-1.local
> compute-2-1.local
> compute-2-1.local
> compute-2-1.local
> compute-2-1.local
> compute-2-1.local
> compute-2-1.local
> compute-2-1.local
> compute-2-1.local
> compute-2-1.local
> compute-2-1.local
> compute-2-1.local
>
>
> On 12/14/2012 03:18 PM, Ralph Castain wrote:
>> It wouldn't be ssh - in both cases, only one ssh is being done to each node
>> (to start the local daemon). The only difference is the number of
>> fork/exec's being done on each node, and the number of file descriptors
>> being opened to support those fork/exec's.
>>
>> It certainly looks like your limits are high enough. When you say it
>> "fails", what do you mean - what error does it report? Try adding:
>>
>> --leave-session-attached -mca odls_base_verbose 5
>>
>> to your cmd line - this will report all the local proc launch debug and
>> hopefully show you a more detailed error report.
>>
>>
>> On Dec 14, 2012, at 12:29 PM, Daniel Davidson wrote:
>>
>>> I have had to cobble together two machines in our rocks cluster without
>>> using the standard installation, they have efi only bios on them and rocks
>>> doesnt like that, so it is the only workaround.
>>>
>>> Everything works great now, except for one thing. MPI jobs (openmpi or
>>> mpich) fail when started from one of these nodes (via qsub or by logging in
>>> and running the command) if 24 or more processors are needed on another
>>> system. However if the originator of the MPI job is the headnode or any of
>>> the preexisting compute nodes, it works fine. Right now I am guessing ssh
>>> client or ulimit problems, but I cannot find any difference. Any help
>>> would be greatly appreciated.
>>>
>>> compute-2-1 and compute-2-0 are the new nodes
>>>
>>> Examples:
>>>
>>> This works, prints 23 hostnames from each machine:
>>> [root@compute-2-1 ~]# /home/apps/openmpi-1.6.3/bin/mpirun -host
>>> compute-2-0,compute-2-1 -np 46 hostname
>>>
>>> This does not work, prints 24 hostnames for compute-2-1
>>> [root@compute-2-1 ~]# /home/apps/openmpi-1.6.3/bin/mpirun -host
>>> compute-2-0,compute-2-1 -np 48 hostname
>>>
>>> These both work, print 64 hostnames from each node
>>> [root@biocluster ~]# /home/apps/openmpi-1.6.3/bin/mpirun -host
>>> compute-2-0,compute-2-1 -np 128 hostname
>>> [root@compute-0-2 ~]# /home/apps/openmpi-1.6.3/bin/mpirun -host
>>> compute-2-0,compute-2-1 -np 128 hostname
>>>
>>> [root@compute-2-1 ~]# ulimit -a
>>> core file size          (blocks, -c) 0
>>> data seg size           (kbytes, -d) unlimited
>>> scheduling priority             (-e) 0
>>> file size               (blocks, -f) unlimited
>>> pending signals                 (-i) 16410016
>>> max locked memory       (kbytes, -l) unlimited
>>> max memory size         (kbytes, -m) unlimited
>>> open files                      (-n) 4096
>>> pipe size            (512 bytes, -p) 8
>>> POSIX message queues     (bytes, -q) 819200
>>> real-time priority              (-r) 0
>>> stack size              (kbytes, -s) unlimited
>>> cpu time               (seconds, -t) unlimited
>>> max user processes              (-u) 1024
>>> virtual memory          (kbytes, -v) unlimited
>>> file locks                      (-x) unlimited
>>>
>>> [root@compute-2-1 ~]# more /etc/ssh/ssh_config
>>> Host *
>>>         CheckHostIP             no
>>>         ForwardX11              yes
>>>         ForwardAgent            yes
>>>         StrictHostKeyChecking   no
>>>         UsePrivilegedPort       no
>>>         Protocol                2,1
>>>
>>> ___
>>> users mailing list
>>> us...@open-mpi.org
>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>
>> ___
>> u
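[Editor's note: for completeness, rebuilding the tarball with debug support as suggested would look roughly like the sketch below. The install prefix is an assumption based on the path used in the commands above, and any site-specific configure options should of course be kept.]

# Hedged sketch - re-run from a fresh openmpi-1.6.3 build directory.
./configure --prefix=/home/apps/openmpi-1.6.3 --enable-debug
make -j4 all
make install

With a debug build, the odls_base_verbose output becomes considerably more detailed, which is what the follow-up message below relies on.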
Re: [OMPI users] mpi problems/many cpus per node
Thank you for the help so far. Here is the information that the debugging gives me. It looks like the daemon on the non-local node never makes contact. If I step -np back by two, though, it does.

Dan

[root@compute-2-1 etc]# /home/apps/openmpi-1.6.3/bin/mpirun -host compute-2-0,compute-2-1 -v -np 34 --leave-session-attached -mca odls_base_verbose 5 hostname
[compute-2-1.local:44855] mca:base:select:( odls) Querying component [default]
[compute-2-1.local:44855] mca:base:select:( odls) Query of component [default] set priority to 1
[compute-2-1.local:44855] mca:base:select:( odls) Selected component [default]
[compute-2-0.local:29282] mca:base:select:( odls) Querying component [default]
[compute-2-0.local:29282] mca:base:select:( odls) Query of component [default] set priority to 1
[compute-2-0.local:29282] mca:base:select:( odls) Selected component [default]
[compute-2-1.local:44855] [[49524,0],0] odls:update:daemon:info updating nidmap
[compute-2-1.local:44855] [[49524,0],0] odls:constructing child list
[compute-2-1.local:44855] [[49524,0],0] odls:construct_child_list unpacking data to launch job [49524,1]
[compute-2-1.local:44855] [[49524,0],0] odls:construct_child_list adding new jobdat for job [49524,1]
[compute-2-1.local:44855] [[49524,0],0] odls:construct_child_list unpacking 1 app_contexts
[compute-2-1.local:44855] [[49524,0],0] odls:constructing child list - checking proc [[49524,1],0] on daemon 1
[compute-2-1.local:44855] [[49524,0],0] odls:constructing child list - checking proc [[49524,1],1] on daemon 0
[compute-2-1.local:44855] [[49524,0],0] odls:constructing child list - found proc [[49524,1],1] for me!
[compute-2-1.local:44855] adding proc [[49524,1],1] (1) to my local list
[compute-2-1.local:44855] [[49524,0],0] odls:constructing child list - checking proc [[49524,1],2] on daemon 1
[compute-2-1.local:44855] [[49524,0],0] odls:constructing child list - checking proc [[49524,1],3] on daemon 0
[compute-2-1.local:44855] [[49524,0],0] odls:constructing child list - found proc [[49524,1],3] for me!
[compute-2-1.local:44855] adding proc [[49524,1],3] (3) to my local list
[compute-2-1.local:44855] [[49524,0],0] odls:constructing child list - checking proc [[49524,1],4] on daemon 1
[compute-2-1.local:44855] [[49524,0],0] odls:constructing child list - checking proc [[49524,1],5] on daemon 0
[compute-2-1.local:44855] [[49524,0],0] odls:constructing child list - found proc [[49524,1],5] for me!
[compute-2-1.local:44855] adding proc [[49524,1],5] (5) to my local list
[compute-2-1.local:44855] [[49524,0],0] odls:constructing child list - checking proc [[49524,1],6] on daemon 1
[compute-2-1.local:44855] [[49524,0],0] odls:constructing child list - checking proc [[49524,1],7] on daemon 0
[compute-2-1.local:44855] [[49524,0],0] odls:constructing child list - found proc [[49524,1],7] for me!
[compute-2-1.local:44855] adding proc [[49524,1],7] (7) to my local list
[compute-2-1.local:44855] [[49524,0],0] odls:constructing child list - checking proc [[49524,1],8] on daemon 1
[compute-2-1.local:44855] [[49524,0],0] odls:constructing child list - checking proc [[49524,1],9] on daemon 0
[compute-2-1.local:44855] [[49524,0],0] odls:constructing child list - found proc [[49524,1],9] for me!
[compute-2-1.local:44855] adding proc [[49524,1],9] (9) to my local list
[compute-2-1.local:44855] [[49524,0],0] odls:constructing child list - checking proc [[49524,1],10] on daemon 1
[compute-2-1.local:44855] [[49524,0],0] odls:constructing child list - checking proc [[49524,1],11] on daemon 0
[compute-2-1.local:44855] [[49524,0],0] odls:constructing child list - found proc [[49524,1],11] for me!
[compute-2-1.local:44855] adding proc [[49524,1],11] (11) to my local list
[compute-2-1.local:44855] [[49524,0],0] odls:constructing child list - checking proc [[49524,1],12] on daemon 1
[compute-2-1.local:44855] [[49524,0],0] odls:constructing child list - checking proc [[49524,1],13] on daemon 0
[compute-2-1.local:44855] [[49524,0],0] odls:constructing child list - found proc [[49524,1],13] for me!
[compute-2-1.local:44855] adding proc [[49524,1],13] (13) to my local list
[compute-2-1.local:44855] [[49524,0],0] odls:constructing child list - checking proc [[49524,1],14] on daemon 1
[compute-2-1.local:44855] [[49524,0],0] odls:constructing child list - checking proc [[49524,1],15] on daemon 0
[compute-2-1.local:44855] [[49524,0],0] odls:constructing child list - found proc [[49524,1],15] for me!
[compute-2-1.local:44855] adding proc [[49524,1],15] (15) to my local list
[compute-2-1.local:44855] [[49524,0],0] odls:constructing child list - checking proc [[49524,1],16] on daemon 1
[compute-2-1.local:44855] [[49524,0],0] odls:constructing child list - checking proc [[49524,1],17] on daemon 0
[compute-2-1.local:44855] [[49524,0],0] odls:constructing child list - found proc [[49524,1],17] for me!
[compute-2-1.local:44855] adding proc [[49524,1],17] (17) to my local list
[compute-2-1.local: