Re: [OMPI users] Problem in remote nodes
I've been investigating and there is no firewall that could stop TCP traffic in the cluster. With the option --mca plm_base_verbose 30 I get the following output:

[itanium1] /home/otro > mpirun --mca plm_base_verbose 30 --host itanium2 helloworld.out
[itanium1:08311] mca: base: components_open: Looking for plm components
[itanium1:08311] mca: base: components_open: opening plm components
[itanium1:08311] mca: base: components_open: found loaded component rsh
[itanium1:08311] mca: base: components_open: component rsh has no register function
[itanium1:08311] mca: base: components_open: component rsh open function successful
[itanium1:08311] mca: base: components_open: found loaded component slurm
[itanium1:08311] mca: base: components_open: component slurm has no register function
[itanium1:08311] mca: base: components_open: component slurm open function successful
[itanium1:08311] mca:base:select: Auto-selecting plm components
[itanium1:08311] mca:base:select:( plm) Querying component [rsh]
[itanium1:08311] mca:base:select:( plm) Query of component [rsh] set priority to 10
[itanium1:08311] mca:base:select:( plm) Querying component [slurm]
[itanium1:08311] mca:base:select:( plm) Skipping component [slurm]. Query failed to return a module
[itanium1:08311] mca:base:select:( plm) Selected component [rsh]
[itanium1:08311] mca: base: close: component slurm closed
[itanium1:08311] mca: base: close: unloading component slurm
-- Hangs here

Could it be a slurm problem?

Thanks for any ideas.

On Fri, 19 Mar 2010, at 17:57, Ralph Castain wrote:
> Did you configure OMPI with --enable-debug? You should do this so that
> more diagnostic output is available.
>
> You can also add the following to your cmd line to get more info:
>
> --debug --debug-daemons --leave-session-attached
>
> Something is likely blocking proper launch of the daemons and processes so
> you aren't getting to the btl's at all.
>
> On Mar 19, 2010, at 9:42 AM, uriz.49...@e.unavarra.es wrote:
>
>> The processes are running on the remote nodes but they don't give the
>> response to the origin node. I don't know why.
>> With the option --mca btl_base_verbose 30, I have the same problems and
>> it doesn't show any message.
>>
>> Thanks
>>
>>> On Wed, Mar 17, 2010 at 1:41 PM, Jeff Squyres wrote:
>>>> On Mar 17, 2010, at 4:39 AM, wrote:
>>>>
>>>>> Hi everyone. I'm a new Open MPI user and I have just installed Open MPI
>>>>> on a 6-node cluster with Scientific Linux. When I execute it locally it
>>>>> works perfectly, but when I try to execute it on the remote nodes with
>>>>> the --host option it hangs and gives no message. I think that the
>>>>> problem could be with the shared libraries, but I'm not sure. In my
>>>>> opinion the problem is not ssh, because I can access the nodes with no
>>>>> password.
>>>>
>>>> You might want to check that Open MPI processes are actually running on
>>>> the remote nodes -- check with ps whether you see any "orted" or other
>>>> MPI-related processes (e.g., your processes).
>>>>
>>>> Do you have any TCP firewall software running between the nodes? If so,
>>>> you'll need to disable it (at least for Open MPI jobs).
>>>
>>> I also recommend running mpirun with the option --mca btl_base_verbose 30
>>> to troubleshoot TCP issues.
>>>
>>> In some environments, you need to explicitly tell mpirun what network
>>> interfaces it can use to reach the hosts. Read the following FAQ section
>>> for more information:
>>>
>>> http://www.open-mpi.org/faq/?category=tcp
>>>
>>> Item 7 of the FAQ might be of special interest.
>>>
>>> Regards,
Re: [OMPI users] Problem in remote nodes
Looks to me like you have an error in your cmd line - you aren't specifying the number of procs to run. My guess is that the system is hanging trying to resolve the process map as a result. Try adding "-np 1" to the cmd line.

The output indicates it is dropping slurm because it doesn't see a slurm allocation. So it is defaulting to use of rsh/ssh to launch.

On Mar 30, 2010, at 4:27 AM, uriz.49...@e.unavarra.es wrote:

> I've been investigating and there is no firewall that could stop TCP
> traffic in the cluster. With the option --mca plm_base_verbose 30 I get
> the following output:
>
> [...]
>
> -- Hangs here
>
> Could it be a slurm problem?
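For reference, a command line along the lines suggested here, combining the explicit process count with the debugging flags mentioned earlier in the thread (hostname and executable are just the ones from the original report), would look something like this sketch:

mpirun -np 1 --host itanium2 --mca plm_base_verbose 30 helloworld.out
mpirun -np 1 --host itanium2 --debug-daemons --leave-session-attached helloworld.out

The --debug-daemons and --leave-session-attached options keep the remote daemon output attached to the terminal, which should help show where the launch stalls.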
[OMPI users] kernel 2.6.23 vs 2.6.24 - communication/wait times
Hello List,

I hope you can help us out on this one, as we have been trying to figure it out for weeks.

The situation: We have a program capable of splitting into several processes to be shared across nodes of a cluster network using Open MPI. We were running that system on "older" cluster hardware (Intel Core2 Duo based, 2 GB RAM) using an "older" kernel (2.6.18.6). All nodes are diskless, network booting. Recently we upgraded the hardware (Intel i5, 8 GB RAM), which also required an upgrade to a recent kernel version (2.6.26+).

Here is the problem: We experience an overall performance loss on the new hardware and think we can break it down to a communication issue between the processes. We also found that the issue arises in the transition from kernel 2.6.23 to 2.6.24 (tested on the Core2 Duo system). Here is an output from our program:

2.6.23.17 (64 bit), MPI 1.2.7, 5 iterations (Core2 Duo), 6 CPU: 93.33 seconds per iteration.
Node 0 communication/computation time:   6.83 / 647.64 seconds.
Node 1 communication/computation time:  10.09 / 644.36 seconds.
Node 2 communication/computation time:   7.27 / 645.03 seconds.
Node 3 communication/computation time: 165.02 / 485.52 seconds.
Node 4 communication/computation time:   6.50 / 643.82 seconds.
Node 5 communication/computation time:   7.80 / 627.63 seconds.
Computation time: 897.00 seconds.

2.6.24.7 (64 bit), re-evaluated, MPI 1.2.7, 5 iterations (Core2 Duo), 6 CPU: 131.33 seconds per iteration.
Node 0 communication/computation time: 364.15 / 645.24 seconds.
Node 1 communication/computation time: 362.83 / 645.26 seconds.
Node 2 communication/computation time: 349.39 / 645.07 seconds.
Node 3 communication/computation time: 508.34 / 485.53 seconds.
Node 4 communication/computation time: 349.94 / 643.81 seconds.
Node 5 communication/computation time: 349.07 / 627.47 seconds.
Computation time: 1251.00 seconds.

The program is 32-bit software, but it doesn't make any difference whether the kernel is 64 or 32 bit. Open MPI version 1.4.1 was also tested; it cut communication times in half (which is still too high), but the improvement decreased with increasing kernel version number.

The communication time is meant to be the time the master process spends distributing the data portions for calculation and collecting the results from the slave processes. The value also contains the time a slave has to wait to communicate with the master while he is occupied. This explains the extended communication time of node #3, as its calculation time is reduced (based on the nature of the data).

The command to start the calculation:

mpirun -np 2 -host cluster-17 invert-master -b -s -p inv_grav.inp : -np 4 -host cluster-18,cluster-19

Using top (with 'f' and 'j' to show the P column) we could track which process runs on which core. We found that processes stayed on their initial core with kernel 2.6.23, but started to flip around with 2.6.24. Using the --bind-to-core option of Open MPI 1.4.1 kept the processes on their cores again, but that didn't influence the overall outcome; it didn't fix the issue.

We found top showing ~25% CPU wait time, and processes in state 'D', also on slave-only nodes. According to our programmer, communication happens only between the master process and its slaves, not among slaves. On kernel 2.6.23 and lower, CPU usage is 100% user, with no wait or system percentage.

Example from top:

Cpu(s): 75.3%us, 0.6%sy, 0.0%ni, 0.0%id, 23.1%wa, 0.7%hi, 0.3%si, 0.0%st
Mem:  8181236k total, 131224k used, 8050012k free, 0k buffers
Swap: 0k total, 0k used, 0k free, 49868k cached

 PID USER  PR  NI  VIRT  RES  SHR S %CPU %MEM   TIME+  P COMMAND
3386 oli   20   0 90512  20m 3988 R   74  0.3 12:31.80 0 invert-
3387 oli   20   0 85072  15m 3780 D   67  0.2 11:59.30 1 invert-
3388 oli   20   0 85064  14m 3588 D   77  0.2 12:56.90 2 invert-
3389 oli   20   0 84936  14m 3436 R   85  0.2 13:28.30 3 invert-

Some system information that might be helpful:

Node hardware:
1. "older": Intel Core2 Duo, (2x1) GB RAM
2. "newer": Intel(R) Core(TM) i5 CPU, ASUS RS100-E6 mainboard, (4x2) GB RAM

Debian stable (lenny) distribution with
ii libc6           2.7-18lenny2
ii libopenmpi1     1.2.7~rc2-2
ii openmpi-bin     1.2.7~rc2-2
ii openmpi-common  1.2.7~rc2-2

Nodes boot diskless with an NFS root and a kernel with all needed drivers compiled in.

Information on the program using Open MPI and the tools used to compile it:

mpirun --version: mpirun (Open MPI) 1.2.7rc2
libopenmpi-dev 1.2.7~rc2-2, which depends on: libc6 (2.7-18lenny2), libopenmpi1 (1.2.7~rc2-2), openmpi-common (1.2.7~rc2-2)
Compilation command: mpif90
FORTRAN compiler (FC): gfortran --version: GNU Fortran (Debian 4.3.2-1.1) 4.3.2
Called Open MPI functions (Fortran bindings): mpi_comm_rank mpi_comm_si
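As a side note on the core-hopping observation above, the two ways of asking Open MPI to pin ranks are sketched below. The first is the --bind-to-core flag already tried in this thread (1.4 series); the second uses the mpi_paffinity_alone MCA parameter, which is the 1.2-series way of requesting processor affinity and is not mentioned above, so treat it as a suggestion to verify. Host lists and program arguments are placeholders, not the full command used here:

mpirun -np 6 --bind-to-core -host cluster-17,cluster-18,cluster-19 ./invert-master ...            (Open MPI 1.4.x)
mpirun -np 6 --mca mpi_paffinity_alone 1 -host cluster-17,cluster-18,cluster-19 ./invert-master ...   (Open MPI 1.2.x)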
Re: [OMPI users] Problem in remote nodes
I've been having similar problems using Fedora Core 9. I believe the issue may be with SELinux, but this is just an educated guess.

In my setup, shortly after a login via MPI, there is a notation in /var/log/messages on the compute node as follows:

Mar 30 12:39:45 kernel: type=1400 audit(1269970785.534:588): avc: denied { read } for pid=8047 comm="unix_chkpwd" name="hosts" dev=dm-0 ino=24579 scontext=system_u:system_r:system_chkpwd_t:s0-s0:c0.c1023 tcontext=unconfined_u:object_r:etc_runtime_t:s0 tclass=file

which says SELinux denied unix_chkpwd read access to hosts. Are you getting anything like this?

In the meantime, I'll check if allowing unix_chkpwd read access to hosts eliminates the problem on my system, and if it works, I'll post the steps involved.

uriz.49...@e.unavarra.es wrote:
> I've been investigating and there is no firewall that could stop TCP
> traffic in the cluster. With the option --mca plm_base_verbose 30 I get
> the following output:
>
> [...]
>
> -- Hangs here
>
> Could it be a slurm problem?
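For what it's worth, a typical way to test the SELinux theory and, if it holds up, to generate a local policy module for the denied access is sketched below. The module name is made up and the audit log path can differ between distributions, so treat this as a rough outline rather than the promised step-by-step:

# temporarily switch SELinux to permissive mode to see whether the hang goes away
setenforce 0
# if it does, build and load a local policy module from the logged denials
grep unix_chkpwd /var/log/audit/audit.log | audit2allow -M local_unix_chkpwd
semodule -i local_unix_chkpwd.pp
# then switch back to enforcing
setenforce 1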
Re: [OMPI users] openMPI on Xgrid
I looked at Torque and it looks good, indeed. I will give it a try for testing. I just have some questions: Torque requires Moab, but from what I've read on the site you have to buy Moab, right? I'm looking for a 100% free solution.

Cristobal

On Mon, Mar 29, 2010 at 3:48 PM, Jody Klymak wrote:
>
> On Mar 29, 2010, at 12:39 PM, Ralph Castain wrote:
>
>> On Mar 29, 2010, at 1:34 PM, Cristobal Navarro wrote:
>>
>>> thanks for the information, but is it possible to make it work with Xgrid,
>>> or does the 1.4.1 version just not support it?
>
> FWIW, I've had excellent success with Torque and Open MPI on OS X 10.5
> Server.
>
> http://www.clusterresources.com/products/torque-resource-manager.php
>
> It doesn't have a nice dashboard, but the queue tools are more than adequate
> for my needs.
>
> Open MPI had a funny port issue on my setup that folks helped with.
> From my notes:
>
> Edited /Network/Xgrid/openmpi/etc/openmpi-mca-params.conf to make sure
> that the right ports are used:
>
> # set ports so that they are more valid than the default ones (see email
> # from Ralph Castain)
> btl_tcp_port_min_v4 = 36900
> btl_tcp_port_range = 32
>
> Cheers, Jody
>
> --
> Jody Klymak
> http://web.uvic.ca/~jklymak/
Re: [OMPI users] openMPI on Xgrid
On Mar 30, 2010, at 11:12 AM, Cristobal Navarro wrote:

> I just have some questions: Torque requires Moab, but from what I've read
> on the site you have to buy Moab, right?

I am pretty sure you can download Torque w/o Moab. I do not use Moab, which I think is a higher-level scheduling layer on top of PBS. However, there are folks here who would know far more than I do about these sorts of things.

Cheers, Jody

--
Jody Klymak
http://web.uvic.ca/~jklymak/
Re: [OMPI users] openMPI on Xgrid
Jody Klymak wrote:
>
> On Mar 30, 2010, at 11:12 AM, Cristobal Navarro wrote:
>
>> I just have some questions: Torque requires Moab, but from what I've read
>> on the site you have to buy Moab, right?
>
> I am pretty sure you can download Torque w/o Moab. I do not use Moab, which
> I think is a higher-level scheduling layer on top of PBS. However, there
> are folks here who would know far more than I do about these sorts of
> things.
>
> Cheers, Jody

Moab is a scheduler, which works with Torque and several other products. Torque comes with a basic scheduler, and Moab is not required. If you want more features but don't want to pay for Moab, you can look at Maui.

Craig
Re: [OMPI users] openMPI on Xgrid
Craig Tierney wrote:
> [...]
> Moab is a scheduler, which works with Torque and several other products.
> Torque comes with a basic scheduler, and Moab is not required. If you want
> more features but don't want to pay for Moab, you can look at Maui.

Hi

Just adding to what Craig and Jody said. Moab is not required for Torque. A small cluster with a few users can work well with the basic Torque/PBS scheduler (pbs_sched) and its first-in-first-out job policy. An alternative is to replace pbs_sched with the free Maui scheduler, if you need fine-grained job control.

You can install both Torque and Maui from source code (available here: http://www.clusterresources.com/), but it takes some work. Some Linux distributions have Torque and Maui available as packages through yum, apt-get, etc. I would guess for the Mac you can get at least Torque through fink, or not?

Gus Correa
-
Gustavo Correa
Lamont-Doherty Earth Observatory - Columbia University
Palisades, NY, 10964-8000 - USA
-
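To make the "Torque without Moab" route concrete, a minimal job script for an Open MPI program might look like the sketch below. It assumes Open MPI was built with Torque (tm) support, so mpirun picks up the allocated nodes by itself; the job name, resource request, and executable are placeholders:

#!/bin/bash
#PBS -N mpi_test
#PBS -l nodes=2:ppn=2
#PBS -l walltime=00:10:00

cd $PBS_O_WORKDIR
# With tm support, mpirun reads the allocation directly; without it,
# point mpirun at the hostfile Torque provides:
#   mpirun -np 4 -hostfile $PBS_NODEFILE ./helloworld
mpirun -np 4 ./helloworld

Submit with "qsub job.sh" and the basic pbs_sched (or Maui) takes care of the scheduling.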
Re: [OMPI users] MPI_Init never returns on IA64
Hi Jeff,

I tested 1.4.2a1r22893, and it does not hang in ompi_free_list_grow.

I hadn't noticed that the 1.4.1 installation I was using was configured with --enable-mpi-threads. Could that have been related to this problem?

Cheers,
Shaun

On Mon, 2010-03-29 at 17:00 -0700, Jeff Squyres wrote:
> Could you try one of the 1.4.2 nightly tarballs and see if that makes the
> issue better?
>
> http://www.open-mpi.org/nightly/v1.4/
>
> On Mar 29, 2010, at 7:47 PM, Shaun Jackman wrote:
>
>> Hi,
>>
>> On an IA64 platform, MPI_Init never returns. I fired up GDB and it seems
>> that ompi_free_list_grow never returns. My test program does nothing but
>> call MPI_Init. Here's the backtrace:
>>
>> (gdb) bt
>> #0  0x20075620 in ompi_free_list_grow () from /home/aubjtl/openmpi/lib/libmpi.so.0
>> #1  0x20078e50 in ompi_rb_tree_init () from /home/aubjtl/openmpi/lib/libmpi.so.0
>> #2  0x20160840 in mca_mpool_base_tree_init () from /home/aubjtl/openmpi/lib/libmpi.so.0
>> #3  0x2015dac0 in mca_mpool_base_open () from /home/aubjtl/openmpi/lib/libmpi.so.0
>> #4  0x200bfd30 in ompi_mpi_init () from /home/aubjtl/openmpi/lib/libmpi.so.0
>> #5  0x2010efb0 in PMPI_Init () from /home/aubjtl/openmpi/lib/libmpi.so.0
>> #6  0x4b70 in main ()
>>
>> Any suggestion how I can troubleshoot?
>>
>> $ mpirun --version
>> mpirun (Open MPI) 1.4.1
>> $ ./config.guess
>> ia64-unknown-linux-gnu
>>
>> Thanks,
>> Shaun
Re: [OMPI users] MPI_Init never returns on IA64
On Mar 30, 2010, at 3:15 PM, Shaun Jackman wrote:

> Hi Jeff,
>
> I tested 1.4.2a1r22893, and it does not hang in ompi_free_list_grow.
>
> I hadn't noticed that the 1.4.1 installation I was using was configured
> with --enable-mpi-threads. Could that have been related to this problem?

Yes, very definitely. IBM has been making some good progress in fixing thread-related things in 1.4.2 (and beyond).

--
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/
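For reference, a test program of the kind described above (one that does nothing but initialize MPI) is roughly the following sketch in C; the MPI_Finalize call is added here only so the program exits cleanly if MPI_Init ever returns:

#include <mpi.h>

int main(int argc, char **argv)
{
    /* the reported hang on IA64 was inside this call, in ompi_free_list_grow() */
    MPI_Init(&argc, &argv);
    MPI_Finalize();
    return 0;
}

Build and run with, e.g., "mpicc init_only.c -o init_only" and "mpirun -np 1 ./init_only".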
Re: [OMPI users] Problem in remote nodes
I changed the SELinux config to permissive (log only), and it didn't change anything. Back to the drawing board.

Robert Collyer wrote:
> I've been having similar problems using Fedora Core 9. I believe the issue
> may be with SELinux, but this is just an educated guess. In my setup,
> shortly after a login via MPI, there is a notation in /var/log/messages on
> the compute node as follows:
>
> [...]
>
> which says SELinux denied unix_chkpwd read access to hosts. Are you getting
> anything like this?
>
> In the meantime, I'll check if allowing unix_chkpwd read access to hosts
> eliminates the problem on my system, and if it works, I'll post the steps
> involved.
[OMPI users] Best way to reduce 3D array
Hi all,

I posted before about doing a domain decomposition on a 3D array in C, and this is sort of a follow-up to that. I was able to get the calculations working correctly by performing them on XZ sub-domains for all Y dimensions of the space. I think someone referred to this as a "book." Being that I now have an X starting and ending point, a Z starting and ending point, and a total number of X and Z points to visit in each direction during the computation, I am now at another hanging point.

First, some background. I am working on modifying a code that was originally written to be run serially. That being said, there is a massive amount of object-oriented crap that is making this a total nightmare to work on. All of the properties that are computed for each point in the 3D mesh are stored in structures, and those structures are stored in structures, blah blah; it looks very gross. In order to speed this code up, I was able to pull out the most computationally sensitive property (potential) and get it set up in this 3D array that is allocated nicely, etc. The problem is, this code eventually outputs, after all the iterations, to a Tecplot format. The code to do this is very, very contrived.

My idea was to, for the sake of wanting to move on, stuff all of these XZ subdomains that I have calculated back into a single array on the first processor, so it can go about its way and do the file output on the WHOLE domain. I seem to be having problems, though, extracting these SubX * SubZ * Y sized portions of the original that can be sent to the first processor. Does anyone have any examples anywhere of code that does something like that? It appears that my 3D mesh is in X-major format in memory, so I tried to create some loops to extract Y, SubZ-sized columns of X to send back to the zeroth processor, but I haven't had much luck yet. Any tips are appreciated... thanks!
Re: [OMPI users] Best way to reduce 3D array
Hi Derek

Great to read that you parallelized the code. Sorry to hear about the OO problems, although I enjoyed reading your characterization of them. :) We also have plenty of that, mostly with some Fortran 90 codes that go OOverboard.

I think I suggested "YZ-books", i.e., decompose the domain across X, which I guess would take advantage of the C array "row major order" and obviate the need for creating MPI vector types. However, I guess your choice really depends on how your data is laid out in memory.

I am not sure I understood the I/O (output) problem you described. However, here is a suggestion. I think I sent it in a previous email. It assumes the global array fits in the rank 0/master process memory:

A) To input data (at the beginning), rank 0 can read all the data from a file into a big buffer/global array, then all processes call MPI_Scatter[v], which distributes the subarrays to all ranks/slave processes.

B) To output data (at the end), all processes call MPI_Gather[v], which allows rank 0/master to collect the final results in a big buffer/global array, and then rank 0 does the output to a file (and in your case, also converts to "Tecplot", I suppose).

If your domain decomposition took advantage of the array layout in memory, each process can do a single call to MPI_Scatter[v] and/or MPI_Gather[v] to do the job. All you need to know is the pointer to the first element of the (sub)array and its size (and likewise for the global array on rank 0/master). If the domain decomposition cuts across the array memory layout, you may need to define an MPI vector type, with strides, etc., and use it in the MPI functions above, which, again, need to be called only once. With MPI type vector it is a bit more work and bookkeeping, but not too hard.

This master/slave I/O pattern is quite common, and admittedly old-fashioned, since it doesn't take advantage of MPI-IO. However, it is a reliable workhorse, particularly if you have a plain NFS-mounted file system (as opposed to a parallel file system).

I hope this helps.

Gus Correa
-
Gustavo Correa
Lamont-Doherty Earth Observatory - Columbia University
Palisades, NY, 10964-8000 - USA
-

Cole, Derek E wrote:
> Hi all,
>
> I posted before about doing a domain decomposition on a 3D array in C, and
> this is sort of a follow-up to that. I was able to get the calculations
> working correctly by performing them on XZ sub-domains for all Y dimensions
> of the space. I think someone referred to this as a "book."
> [...]
> Does anyone have any examples anywhere of code that does something like
> that?
> [...]
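To make the MPI_Gather[v] suggestion concrete, here is a minimal sketch in C (the language of the original code). It assumes each rank's sub-domain is a single contiguous block of the global array, i.e., the decomposition follows the slowest-varying index; if the decomposition cuts across the memory layout, an MPI vector type would be needed instead, as noted above. The function and variable names, and the use of double, are illustrative, not taken from the actual code:

#include <mpi.h>
#include <stdlib.h>

void gather_subdomains(double *local, int local_count,  /* this rank's contiguous block and its size */
                       double *global,                  /* full array, significant on rank 0 only    */
                       MPI_Comm comm)
{
    int rank, nprocs, p;
    int *counts = NULL, *displs = NULL;

    MPI_Comm_rank(comm, &rank);
    MPI_Comm_size(comm, &nprocs);

    if (rank == 0) {
        counts = malloc(nprocs * sizeof(int));
        displs = malloc(nprocs * sizeof(int));
    }

    /* Collect every rank's element count on rank 0, then build the displacements. */
    MPI_Gather(&local_count, 1, MPI_INT, counts, 1, MPI_INT, 0, comm);
    if (rank == 0) {
        displs[0] = 0;
        for (p = 1; p < nprocs; p++)
            displs[p] = displs[p - 1] + counts[p - 1];
    }

    /* Gather the sub-domain data into the global array on rank 0. */
    MPI_Gatherv(local, local_count, MPI_DOUBLE,
                global, counts, displs, MPI_DOUBLE, 0, comm);

    if (rank == 0) {
        free(counts);
        free(displs);
    }
}

Rank 0 can then hand the assembled global array to the existing Tecplot output routine unchanged.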
Re: [OMPI users] Best way to reduce 3D array
If using the master/slave IO model, would it be better to cycle through all the processes, and each one would write its part of the array into the file? This file would be opened in "stream" mode... like

do p=0,nprocs-1
   if(my_rank.eq.p)then
      openfile (append mode)
      write_to_file
      closefile
   endif
   call MPI_Barrier(world,ierr)
enddo

cheers,

Ricardo Reis

'Non Serviam'

PhD candidate @ Lasef
Computational Fluid Dynamics, High Performance Computing, Turbulence
http://www.lasef.ist.utl.pt

Cultural Instigator @ Rádio Zero
http://www.radiozero.pt

Keep them Flying! Ajude a/help Aero Fénix!
http://www.aeronauta.com/aero.fenix
http://www.flickr.com/photos/rreis/

< sent with alpine 2.00 >
Re: [OMPI users] Best way to reduce 3D array
Hello Ricardo Reis! How is Rádio Zero doing?

Doesn't this serialize the I/O operation across the processors, whereas MPI_Gather followed by rank 0 I/O may perhaps move the data faster to rank 0, and eventually to disk (particularly when the number of processes is large)?

I never thought of your solution, hence I never tried/tested/compared it against my common-wisdom suggestion to Derek either. So, I really don't know the answer.

Cheers,
Gus

Ricardo Reis wrote:
> If using the master/slave IO model, would it be better to cycle through all
> the processes, and each one would write its part of the array into the
> file? This file would be opened in "stream" mode...
> [...]