Re: [OMPI users] busy waiting and oversubscriptions

2014-03-26 Thread Gus Correa
ueue and a first-in-first-out job policy, then make it more complex as the workload increases. Queue systems do support interactive jobs (even with X-windows GUIs, if needed). You submit the interactive job, the queue system puts you in a free node, and you work normally there. I hope this helps, Gus Correa

Re: [OMPI users] busy waiting and oversubscriptions

2014-03-27 Thread Gus Correa
bing is bad, hence my mindset. Gus Correa

Re: [OMPI users] How to replace --cpus-per-proc by --map-by

2014-03-27 Thread Gus Correa
hardly learn things like that from the mpiexec man page alone, although it has very good examples. Thank you, Gus Correa <\end hijacking of this thread> On 03/27/2014 11:38 AM, Saliya Ekanayake wrote: Thank you, this is really helpful. Saliya On Thu, Mar 27, 2014 at 5:11 AM, mailto:tmish.

Re: [OMPI users] Mapping ranks to hosts (from MPI error messages)

2014-03-27 Thread Gus Correa
friend! I hope this helps, Gus Correa On 03/27/2014 01:53 PM, Sasso, John (GE Power & Water, Non-GE) wrote: When a piece of software built against OpenMPI fails, I will see an error referring to the rank of the MPI task which incurred the failure. For example: MPI_ABORT was invoked o

Re: [OMPI users] Mapping ranks to hosts (from MPI error messages)

2014-03-27 Thread Gus Correa
this FAQ: http://www.open-mpi.org/faq/?category=tuning#setting-mca-params Again, the OMPI FAQ page is your friend! :) http://www.open-mpi.org/faq/ I hope this helps, Gus Correa On 03/27/2014 02:06 PM, Gus Correa wrote: Hi John Take a look at the mpiexec/mpirun options: -report-bindings (th

Re: [OMPI users] Mapping ranks to hosts (from MPI error messages)

2014-03-27 Thread Gus Correa
-core would cause an unexpected problem I did not account for. --john Well, testing and failing is part of this game! Would the GE manager buy that? :) I hope this helps, Gus Correa -Original Message- From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Gus Correa Sent: Th

Re: [OMPI users] Mapping ranks to hosts (from MPI error messages)

2014-03-27 Thread Gus Correa
splay-map goes to stdout, whereas -report-bindings goes to stderr, right?) Thanks, Ralph! Gus Correa Sent from my iPhone On Mar 27, 2014, at 11:47 AM, Gus Correa wrote: PS - The (OMPI 1.6.5) mpiexec default is -bind-to-none, in which case -report-bindings won't report anything. So,

Re: [OMPI users] Mapping ranks to hosts (from MPI error messages)

2014-03-27 Thread Gus Correa
On 03/27/2014 04:10 PM, Reuti wrote: Hi, On 27.03.2014 at 20:15, Gus Correa wrote: Awesome, but now here is my concern. If we have OpenMPI-based applications launched as batch jobs via a batch scheduler like SLURM, PBS, LSF, etc. (which decides the placement of the app and dispatches it

Re: [OMPI users] Mapping ranks to hosts (from MPI error messages)

2014-03-27 Thread Gus Correa
: man pages, README file, FAQ, and this rocking mailing list rhythm, (Who can ask for anything more?), I think I found what seems to be the corresponding mca parameter: rmaps_base_display_map which defaults to 0, but should be set to 1 to produce the same effect as mpiexec --display-map. Right? Cheers, Gus Correa

Re: [OMPI users] openmpi query

2014-04-08 Thread Gus Correa
On 04/08/2014 06:37 AM, Jeff Squyres (jsquyres) wrote: You should ping the Rocks maintainers and ask them to upgrade. Open MPI 1.4.3 was released in September of 2010. On Rocks, you can install OpenMPI from source (and any other software application by the way) on their standard NFS shared d

Re: [OMPI users] mpirun runs in serial even I set np to several processors

2014-04-14 Thread Gus Correa
lementation? I hope this helps, Gus Correa

Re: [OMPI users] mpirun runs in serial even I set np to several processors

2014-04-14 Thread Gus Correa
plementation of this mailing list (distributed memory, so to speak, although intra-node it is shared mem). My guess is that your intent is to compile with MPI, right? And actually with OpenMPI, i.e., with this implementation of MPI, right? What is the output of "ldd ./wrf.exe"? This may show

Re: [OMPI users] mpirun runs in serial even I set np to several processors

2014-04-14 Thread Gus Correa
e command that generates the configure script that you sent before? Maybe the full command line will shed some light on the problem. I hope this helps, Gus Correa On 04/14/2014 03:11 PM, Djordje Romanic wrote: to get help :) On Mon, Apr 14, 2014 at 3:11 PM, Djordje Romanic mailto:djord...@gma

Re: [OMPI users] mpirun runs in serial even I set np to several processors

2014-04-14 Thread Gus Correa
you have access to? If it is a cluster, do you have access to a filesystem that is shared across the cluster? On clusters typically /home is shared, often via NFS. Gus Correa On 04/14/2014 05:15 PM, Jeff Squyres (jsquyres) wrote: Maybe we should rename OpenMP to be something less confusing -- per

Re: [OMPI users] mpirun runs in serial even I set np to several processors

2014-04-15 Thread Gus Correa
nt to items in /home/djordje/sw/openmpi/1.8/bin 5. Rebuild WRF from scratch. 6. Check if WRF got the libraries right: ldd wrf.exe This should show mpi libraries in /home/djordje/sw/openmpi/1.8/lib 7. Run WRF mpirun -np 4 wrf.exe I hope this helps, Gus Correa On 04/14/2014 08:21 PM, Djordje Ro

Re: [OMPI users] mpirun runs in serial even I set np to several processors

2014-04-15 Thread Gus Correa
just for fun! :) Best, Gus Correa *** PS: Parallel computing, MPI, and OpenMP, tutorials at LLNL: https://computing.llnl.gov/tutorials/parallel_comp/ https://computing.llnl.gov/tutorials/mpi/ https://computing.llnl.gov/tutorials/openMP/ Ch. 5 in the first tutorial gives an outline of the

Re: [OMPI users] Where is the error? (MPI program in fortran)

2014-04-15 Thread Gus Correa
Or just compiling with -g or -traceback (depending on the compiler) will give you more information about the point of failure in the error message. On 04/15/2014 04:25 PM, Ralph Castain wrote: Have you tried using a debugger to look at the resulting core file? It will probably point you right at

Re: [OMPI users] Where is the error? (MPI program in fortran)

2014-04-16 Thread Gus Correa
is normally available on Linux (or you can install it with yum, apt-get, etc). An alternative is ddd, with a GUI (can also be installed from yum, etc). If you use a commercial compiler you may have a debugger with a GUI. Sent from my iPad On 15/04/2014, at 18:20, "Gus Correa"

Re: [OMPI users] Where is the error? (MPI program in fortran)

2014-04-16 Thread Gus Correa
d to ask your system administrator to do it. I hope this helps, Gus Correa On 04/16/2014 11:24 AM, Gus Correa wrote: On 04/16/2014 08:30 AM, Oscar Mojica wrote: How would be the command line to compile with the option -g ? What debugger can I use? Thanks Replace any optimization flags (-O2, or similar)

Re: [OMPI users] Where is the error? (MPI program in fortran)

2014-04-17 Thread Gus Correa
n the program code and replace them by explicit declarations (and add "implicit none" to all program units, to play safe). Implicit variable declarations are a big source of bugs. I hope this helps, Gus Correa PS - If you are at UFBA, send my hello to Milton Porsani, please. On 04/1

Re: [OMPI users] users Digest, Vol 2881, Issue 4

2014-05-07 Thread Gus Correa
On 05/06/2014 09:49 PM, Ralph Castain wrote: On May 6, 2014, at 6:24 PM, Clay Kirkland <clay.kirkl...@versityinc.com> wrote: Got it to work finally. The longer line doesn't work. 192.168.0.0/1 But if I take off the -mca oob_tcp_if_include 192.168.0.0/16 part th

Re: [OMPI users] Question about scheduler support

2014-05-14 Thread Gus Correa
it took to configure it with Torque support was to point configure to the Torque installation directory (which is non-standard in my case): --with-tm=/opt/torque/bla/bla My two cents, Gus Correa

Re: [OMPI users] Question about scheduler support

2014-05-15 Thread Gus Correa
says "no" is the default. This covers pretty much all free/open source schedulers, correct me if I am wrong, please. LSF seems not to have a clearly documented default also. But LSF is for the rich. I am out. My 2 cents, 2nd edition, out of print. Bye, thanks, regards. Gus Correa On 0

Re: [OMPI users] Question about scheduler support

2014-05-16 Thread Gus Correa
venient to the final user? Quite frankly this is the first time I see so much fuss about OMPI's build system. Gus Correa On 5/16/2014 3:00 PM, Martin Siegert wrote: +1 even if cmake would make life easier for the developers, you may want to consider those sysadmins/users who actually

Re: [OMPI users] openmpi configuration error?

2014-05-16 Thread Gus Correa
ed/1.4.4-intel/lib:$LD_LIBRARY_PATH if csh to .cshrc setenv PATH /opt/apps/openmpi/retired/1.4.4-intel/bin:$PATH setenv LD_LIBRARY_PATH /opt/apps/openmpi/retired/1.4.4-intel/lib:$LD_LIBRARY_PATH I hope this helps, Gus Correa On 05/16/2014 05:39 PM, Ben Lash wrote: My cluster has just upgrad
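For sh/bash users, the PATH and LD_LIBRARY_PATH settings discussed in this reply could be sketched roughly as below. This is only an illustration, not the configuration from the thread: prepend_path is a hypothetical helper name, and the only thing taken from the message is the install prefix /opt/apps/openmpi/retired/1.4.4-intel. The helper avoids leaving a stray ":" when the variable starts out empty, which can happen with LD_LIBRARY_PATH.

```shell
# Hypothetical helper (not part of Open MPI): prepend a directory to a
# colon-separated search-path variable and export the result.
prepend_path() {
    var_name=$1    # name of the variable, e.g. PATH
    new_dir=$2     # directory to put in front
    eval "old_value=\${$var_name:-}"
    if [ -n "$old_value" ]; then
        eval "$var_name=\$new_dir:\$old_value"
    else
        # Variable was empty or unset: do not add a trailing ":"
        eval "$var_name=\$new_dir"
    fi
    export "$var_name"
}

# Install prefix quoted in the message above; adjust to the local site.
OMPI_ROOT=/opt/apps/openmpi/retired/1.4.4-intel
prepend_path PATH "$OMPI_ROOT/bin"
prepend_path LD_LIBRARY_PATH "$OMPI_ROOT/lib"
```

These lines would typically go into .bashrc (or .profile) so that non-interactive shells on the compute nodes pick them up too.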

Re: [OMPI users] openmpi configuration error?

2014-05-16 Thread Gus Correa
red" directory, then it is probably out of date. Why don't you try to recompile the code with the current Open MPI installed in the cluster? module avail will show everything, and you can pick the latest, load it, and try to recompile the program with that. Gus Correa On Fri, M

Re: [OMPI users] openmpi configuration error?

2014-05-16 Thread Gus Correa
he code. (Probably just module swap openmpi/1.4.4-intel openmpi/1.6.5-intel) You may need to tweak the Makefile, if it hardwires the MPI wrappers/binary location, or the library and include paths. Some do, some don't. Gus Correa [bl10@login2 ~]$ echo $PATH /home/bl10/rlib/deps/bin:/

Re: [OMPI users] openmpi configuration error?

2014-05-21 Thread Gus Correa
enmpi/1.6.5 should have been marked to conflict with 1.4.4. Is it? Anyway, you may want to do a 'which mpiexec' to see which one is taking precedence in your environment (1.6.5 or 1.4.4) Probably 1.6.5. Does the code work now, or does it continue to fail? I hope this helps, Gus Correa

Re: [OMPI users] openmpi configuration error?

2014-05-21 Thread Gus Correa
nknown Unknown CCTM_V5g_Linux2_x 007FD3A0 Unknown Unknown Unknown CCTM_V5g_Linux2_x 007BA9A2 Unknown Unknown Unknown CCTM_V5g_Linux2_x 00759288 Unknown Unknown Unknown ... On Wed, May 21, 2014 at 2:08 PM, Gus Correa mailto:g...@ldeo.c

Re: [OMPI users] intermittent segfaults with openib on ring_c.c

2014-06-04 Thread Gus Correa
ferred transport layer for intra-node communication. Gus Correa On 06/04/2014 11:13 AM, Ralph Castain wrote: Thanks!! Really appreciate your help - I'll try to figure out what went wrong and get back to you On Jun 4, 2014, at 8:07 AM, Fischer, Greg A. <fisch...@westinghouse.com> wrote

Re: [OMPI users] Determining what parameters a scheduler passes to OpenMPI

2014-06-06 Thread Gus Correa
r" in Torque parlance). This mostly matters if there is more than one job running on a node. However, Torque doesn't bind processes/MPI_ranks to cores or sockets or whatever. As Ralph said, Open MPI does that. I believe Open MPI doesn't use the cpuset info from Torque. (Ralph, pl

Re: [OMPI users] openib segfaults with Torque

2014-06-11 Thread Gus Correa
iexec, etc), to inherit those limits. Or not? Gus Correa On 06/11/2014 06:20 PM, Jeff Squyres (jsquyres) wrote: +1 On Jun 11, 2014, at 6:01 PM, Ralph Castain wrote: Yeah, I think we've seen that somewhere before too... On Jun 11, 2014, at 2:59 PM, Joshua Ladd wrote: Agreed. The

Re: [OMPI users] Problem moving from 1.4 to 1.6

2014-06-27 Thread Gus Correa
he only cause of the problem. If you want to use openib, switch to --mca btl openib,sm,self Another thing to check is whether there is a mixup of environment variables, PATH and LD_LIBRARY_PATH perhaps pointing to the old OMPI version you may have installed. My two cents, Gus Correa On 06/

Re: [OMPI users] Problem moving from 1.4 to 1.6

2014-06-27 Thread Gus Correa
ewise, env |grep PATH and env |grep LD_LIBRARY_PATH may hint if you have a mixed environment and mixed MPI implementations and versions. I hope this helps, Gus Correa PS - BTW, unless your company's policies forbid, you can install OpenMPI on a user directory, say, your /home directory.
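As a rough complement to the "env | grep PATH" check suggested above, the sketch below lists every copy of a launcher that is reachable through PATH, in search order. More than one hit is a hint that several MPI installations are mixed in the environment. find_on_path is a hypothetical helper name, not an Open MPI tool, and the sketch assumes a POSIX sh or bash.

```shell
# Hypothetical helper: print every executable called $1 found on a
# colon-separated search path ($2, defaulting to $PATH), in search order.
find_on_path() {
    cmd=$1
    search_path=${2:-$PATH}
    old_ifs=$IFS
    IFS=:
    for dir in $search_path; do
        # An empty PATH entry means the current directory
        [ -n "$dir" ] || dir=.
        if [ -x "$dir/$cmd" ] && [ ! -d "$dir/$cmd" ]; then
            printf '%s\n' "$dir/$cmd"
        fi
    done
    IFS=$old_ifs
}

# The first line printed (if any) is the mpirun the shell would pick.
find_on_path mpirun
```

The same call with mpiexec, mpicc or mpif90 as the argument shows whether the compiler wrappers and the launcher come from the same installation.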

Re: [OMPI users] configure fails to detect missing libcrypto

2014-07-24 Thread Gus Correa
lem. Could your libcrypto be in an unusual location? Maybe you need to load a Torque environment module to add it to your LD_LIBRARY_PATH before you build OMPI? Gus Correa On 07/24/2014 05:18 PM, Jeff Hammond wrote: That could be the case. I've reported the missing libcrypto issue to NERSC

Re: [OMPI users] Trying to use openmpi with MOM getting a compile error

2014-07-25 Thread Gus Correa
(e.g. to MPICH libraries and include files). Then rebuild the Makefile and compile MOM again. I hope this helps. Gus Correa On 07/25/2014 12:37 PM, Dan Shell wrote: OpenMOM-mpi I am trying to compile MOM and have installed openmpi 1.8.1 getting an installation error below Looking for some help

Re: [OMPI users] Trying to use openmpi with MOM getting a compile error

2014-07-25 Thread Gus Correa
On 07/25/2014 03:02 PM, Jeff Squyres (jsquyres) wrote: On Jul 25, 2014, at 1:14 PM, Gus Correa wrote: Change the mkmf.template file and replace the Fortran compiler name (gfortran) by the Open MPI (OMPI) Fortran compiler wrapper: mpifortran (or mpif90 if it still exists in OMPI 1.8.1

Re: [OMPI users] mpifort wrapper.txt

2014-07-29 Thread Gus Correa
No underlying compiler was specified in the wrapper compiler data file (e.g., mpicc-wrapper-data.txt) The error message is complaining about mpicc, not mpifort. I wonder if this may be due to a Makefile misconfiguration again. My two cents, Gus Correa

Re: [OMPI users] Configuring openib on openmpi 1.8.1

2014-07-30 Thread Gus Correa
not be mixed. The OMPI implementations should be the same on all machines as well. Running "which mpirun" on those machines may help. These user environment problems often cause confusion. My two cents, Gus Correa On 07/30/2014 09:56 AM, Ralph Castain wrote: Does "polaris" ha

Re: [OMPI users] openmpi 1.8.1 gfortran not working

2014-08-04 Thread Gus Correa
ix? (CC, CXX, FC) Then "make distclean; configure; make; make install". Gus Correa On 08/04/2014 04:10 PM, Dan Shell wrote: Ralph Ok I will give that a try Thanks Dan Shell -Original Message- From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Ralph Castain Sent: Mo

Re: [OMPI users] How to keep multiple installations at same time

2014-08-05 Thread Gus Correa
stall from each of these directories, using the appropriate compilers, and pointing to two distinct *installation directories* (with configure -prefix). My two cents, Gus Correa On 08/04/2014 11:54 PM, Andrew Caird wrote: Hi Ahsan, We, and I think many people, use the Environment Modules sof

Re: [OMPI users] How to keep multiple installations at same time

2014-08-05 Thread Gus Correa
e the same exact thing that they currently have, and in the end gain little if any relevant/useful/new functionality. My two cents of opinion Gus Correa On 08/05/2014 12:54 PM, Ralph Castain wrote: Check the repo - hasn't been touched in a very long time On Aug 5, 2014, at 9:42 AM, Fabric

Re: [OMPI users] How to keep multiple installations at same time

2014-08-05 Thread Gus Correa
eed/want, is a pain. Anyway, this is the OMPI list, not a place for advocacy of either package, so I am going to stop here. I just wanted to set the record straight that: - the Environment Modules package is not dead, - it has a large user base, and - it is sooo good that among other things it opened

Re: [OMPI users] Newbie query - mpirun will not run if it's previously been killed with Control-C

2014-08-07 Thread Gus Correa
I guess Control-C will kill only the mpirun process. You may need to kill the (two) jules.exe processes separately, say, with kill -9. ps -u "yourname" will show what you have running. On 08/07/2014 11:16 AM, Jane Lewis wrote: Hi all, This is a really simple problem (I hope) where I’ve introdu

Re: [OMPI users] Newbie query - mpirun will not run if it's previously been killed with Control-C

2014-08-07 Thread Gus Correa
On 08/07/2014 11:28 AM, Gus Correa wrote: I guess Control-C will kill only the mpirun process. You may need to kill the (two) jules.exe processes separately, say, with kill -9. ps -u "yourname" will show what you have running. Something may have been left behind by Control-C, a

Re: [OMPI users] Newbie query - mpirun will not run if it's previously been killed with Control-C

2014-08-07 Thread Gus Correa
On 08/07/2014 11:49 AM, Ralph Castain wrote: On Aug 7, 2014, at 8:47 AM, Reuti <re...@staff.uni-marburg.de> wrote: On 07.08.2014 at 17:28, Gus Correa wrote: I guess Control-C will kill only the mpirun process. You may need to kill the (two) jules.exe processes separately, say

Re: [OMPI users] building openmpi 1.8.1 with intel 14.0.1

2014-08-21 Thread Gus Correa
Hi Peter If I remember right from my compilation of OMPI on a Mac years ago, you need to have Xcode installed, in case you don't. If vampir-trace is the only problem, you can disable it when you configure OMPI (--disable-vt). My two cents, Gus Correa On 08/21/2014 03:35 PM, Bosler,

Re: [OMPI users] compilation problem with ifort

2014-09-03 Thread Gus Correa
Was the error that you listed the *first* error? Apparently various object files are missing from the ../../Modules/ directory, and were not compiled, suggesting something is amiss even before the compilation of the executable (epw.x). On 09/03/2014 05:20 PM, Elio Physics wrote: Dear all, I am

Re: [OMPI users] compilation problem with ifort

2014-09-03 Thread Gus Correa
? Do they have a mailing list or bulletin board where you could get specific help for their software? (Either on EPW or on Quantum ESPRESSO (which seems to be required): http://www.quantum-espresso.org/) That would probably be the right forum to ask your questions. My two cents, Gus Correa On

Re: [OMPI users] compilation problem with ifort

2014-09-03 Thread Gus Correa
the recipe on the EPW web site? http://epw.org.uk/Main/DownloadAndInstall ** I hope this helps, Gus Correa On 09/03/2014 06:48 PM, Elio Physics wrote: I have already done all of the steps you mentioned. I have installed the older version of quantum espresso, configured it and followed all the

Re: [OMPI users] compilation problem with ifort

2014-09-03 Thread Gus Correa
nd top EPW directory (which per the recipe is right below the top QE) plays a role. Anyway, phonons are not my playground, just trying to help two-cent-wise, although this is not really an MPI or OpenMPI issue, more of a Makefile/configure issue specific to QE and EPW. Thanks, Gus Correa On 09/03/201

Re: [OMPI users] compilation problem with ifort

2014-09-04 Thread Gus Correa
arts of QE that it needs. And this is *exactly what the error message in your first email showed*, a bunch of object files that were not found. *** Sorry, but I cannot do any better than this. I hope this helps, Gus Correa On 09/03/2014 08:59 PM, Elio Physics wrote: Ray and Gus, Thanks a lot for

Re: [OMPI users] compilation problem with ifort

2014-09-04 Thread Gus Correa
libraries (blas, lapack, fft) and to build them. At least that is what seems to have happened on my computer. So, I don't think you need any other libraries. Good luck, Gus Correa On 09/04/2014 04:17 PM, Elio Physics wrote: Dear Gus, Firstly I really need to thank you for the effort you are

Re: [OMPI users] About debugging and asynchronous communication

2014-09-18 Thread Gus Correa
There is no guarantee that the messages will be received in the same order that they were sent. Use tags or another mechanism to match the messages on send and recv ends. On 09/18/2014 10:42 AM, XingFENG wrote: I have found some thing strange. Basically, in my codes, processes send and receive

Re: [OMPI users] General question about running single-node jobs.

2014-10-02 Thread Gus Correa
H and $OMPI/lib to LD_LIBRARY_PATH and are these environment variables propagated to the job execution nodes (specially those that are failing)? Anyway, just a bunch of guesses ... Gus Correa * QCSCRATCH Defines the directory in which Q-Chem will

[OMPI users] Open MPI 1.8.3 openmpi-mca-params.conf: old and new parameters

2014-10-14 Thread Gus Correa
old (1.6) OMPI runtime parameters, and/or any additional documentation about the new style of OMPI 1.8 runtime parameters? Since there seems to have been a major revamping of the OMPI runtime parameters, that would be a great help. Thank you, Gus Correa

Re: [OMPI users] Open MPI 1.8.3 openmpi-mca-params.conf: old and new parameters

2014-10-15 Thread Gus Correa
any thanks, Gus Correa On 10/15/2014 11:12 AM, Jeff Squyres (jsquyres) wrote: We talked off-list -- fixed this on master and just filed https://github.com/open-mpi/ompi-release/pull/33 to get this into the v1.8 branch. On Oct 14, 2014, at 7:39 PM, Ralph Castain wrote: On Oct 14, 2014,

Re: [OMPI users] Hybrid OpenMPI/OpenMP leading to deadlocks?

2014-10-16 Thread Gus Correa
codes + short job queue time policy is very common out there. Here most problems with long runs (we have some non-restartable serial code die-hards), happen due to NFS issues (busy, slow response, etc), and code with poorly designed IO. My two cents, Gus Correa On 10/16/2014 10:16 AM, McGrattan, Kevin

Re: [OMPI users] Open MPI 1.8.3 openmpi-mca-params.conf: old and new parameters

2014-10-16 Thread Gus Correa
and mpiexec options: -bind-to-core, rmaps_base_schedule_policy, orte_process_binding, etc. Thank you, Gus Correa On 10/15/2014 11:10 PM, Ralph Castain wrote: On Oct 15, 2014, at 11:46 AM, Gus Correa <g...@ldeo.columbia.edu> wrote: Thank you Ralph and Jeff for the help! Glad to hear t

[OMPI users] knem in Open MPI 1.8.3

2014-10-16 Thread Gus Correa
t there was no trace of knem in stderr/stdout of either 1.6.5 or 1.8.3. So, the evidence I have that knem is active in 1.6.5 but not in 1.8.3 comes only from the statistics in /dev/knem. *** Thank you, Gus Correa *** PS - As an aside, I also have some questions on the knem setup, which I mos

Re: [OMPI users] knem in Open MPI 1.8.3

2014-10-16 Thread Gus Correa
"btl = ^tcp,^vader" ? I am on CentOS 6.5, stock kernel 2.6.32, no 3.1, no CMA Linux, so I believe I need knem for now. I tried '-mca btl_base_verbose 30' but no knem information came out. Many thanks, Gus Correa On 10/16/2014 04:40 PM, Aurélien Bouteiller wrote: Are you sure you are

Re: [OMPI users] knem in Open MPI 1.8.3

2014-10-16 Thread Gus Correa
openib, etc)? How does it affect knem? What are vader's pros/cons w.r.t. using the other btls? In which conditions is it good or bad to use it vs. the other btls? What do I gain/lose if I do "btl = sm,self,openib" (which presumably will knock off tcp and "vader"), or maybe "

Re: [OMPI users] knem in Open MPI 1.8.3

2014-10-16 Thread Gus Correa
ed to keep their MPI applications running in production mode, hopefully with Open MPI 1.8, can somebody explain more clearly what "vader" is about? Thank you, Gus Correa On Thu, Oct 16, 2014 at 01:49:09PM -0700, Ralph Castain wrote: FWIW: vader is the default in 1.8 On Oct 16, 2014,

Re: [OMPI users] knem in Open MPI 1.8.3

2014-10-16 Thread Gus Correa
ers like these don't give me any incentive to upgrade our production codes to OMPI 1.8. Will this be fixed in the next Open MPI 1.8 release? Thank you, Gus Correa PS - Many thanks to Aurelien Bouteiller for pointing out the existence of the vader btl. Without his tip I would still be in the d

Re: [OMPI users] knem in Open MPI 1.8.3

2014-10-16 Thread Gus Correa
On 10/16/2014 05:38 PM, Nathan Hjelm wrote: On Thu, Oct 16, 2014 at 05:27:54PM -0400, Gus Correa wrote: Thank you, Aurelien! Aha, "vader btl", that is new to me! I thought Vader was that man dressed in black in Star Wars, Obi-Wan Kenobi's nemesis. That was a while ago, my kid

Re: [OMPI users] knem in Open MPI 1.8.3

2014-10-16 Thread Gus Correa
On Oct 16, 2014, at 4:06 PM, Gus Correa wrote: Hi All Back to the original issue of knem in Open MPI 1.8.3. It really seems to be broken. I launched the Intel MPI benchmarks (IMB) job both with '-mca btl ^vader,tcp', and with '-mca btl sm,self,openib'. Both syntaxes seem

Re: [OMPI users] knem in Open MPI 1.8.3

2014-10-16 Thread Gus Correa
ix=${MYINSTALLDIR} \ --with-tm=/opt/torque/4.2.5/gnu-4.4.7 \ --with-verbs=/usr \ --with-knem=/opt/knem-1.1.1 \ 2>&1 | tee configure_${build_id}.log Many thanks, Gus On Oct 16, 2014, at 4:24 PM, Gus Correa wrote: On 10/16/2014 05:38 PM, Nathan Hjelm wrote: On Thu, Oct 16, 2014 at 05:2

Re: [OMPI users] Open MPI 1.8.3 openmpi-mca-params.conf: old and new parameters

2014-10-17 Thread Gus Correa
1.8 process placement conceptual model, along with its syntax and examples. Thank you, Gus Correa On 10/17/2014 12:10 AM, Ralph Castain wrote: I know this commit could be a little hard to parse, but I have updated the mpirun man page on the trunk and will port the change over to the 1.8 series

Re: [OMPI users] knem in Open MPI 1.8.3

2014-10-17 Thread Gus Correa
along automatically) * -mca btl openib,self (and vader will come along automatically) * -mca btl openib,self,vader (because vader is default only for 1-node jobs) * something else (or several alternatives) Whatever happened to the "self" btl in this new context? Gone? Still there? Many thanks

Re: [OMPI users] New ib locked pages behavior?

2014-10-21 Thread Gus Correa
Hi Bill Maybe you're missing these settings in /etc/modprobe.d/mlx4_core.conf ? http://www.open-mpi.org/faq/?category=openfabrics#ib-low-reg-mem I hope this helps, Gus Correa On 10/21/2014 06:36 PM, Bill Broadley wrote: I've setup several clusters over the years with OpenMPI. I

Re: [OMPI users] New ib locked pages behavior?

2014-10-21 Thread Gus Correa
apparently no solution): http://www.open-mpi.org/community/lists/users/2013/02/21430.php Maybe Mellanox has more information about this? Gus Correa On 10/21/2014 08:15 PM, Bill Broadley wrote: On 10/21/2014 04:18 PM, Gus Correa wrote: Hi Bill Maybe you're missing these settings in

Re: [OMPI users] knem in Open MPI 1.8.3

2014-10-30 Thread Gus Correa
with the btl_vader_single_copy_mechanism parameter? Or must OMPI be configured with only one memory copy mechanism? Many thanks, Gus Correa On 10/30/2014 05:44 PM, Nathan Hjelm wrote: I want to close the loop on this issue. 1.8.5 will address it in several ways: - knem support in btl/sm has been fixed. A san

Re: [OMPI users] knem in Open MPI 1.8.3

2014-10-30 Thread Gus Correa
questions below (specially the 12 vader parameters). Many thanks, Gus Correa On Oct 30, 2014, at 4:24 PM, Gus Correa wrote: Hi Nathan Thank you very much for addressing this problem. I read your notes on Jeff's blog about vader, and that clarified many things that were obscure to me w

Re: [OMPI users] How OMPI picks ethernet interfaces

2014-11-12 Thread Gus Correa
east I think they are sensible. :) Cheers, Gus Correa It tries so independent from the internal or external name of the headnode given in the machinefile - I hit ^C then. I attached the output of Open MPI 1.8.1 for this setup too. -- Reuti

Re: [OMPI users] How OMPI picks ethernet interfaces

2014-11-13 Thread Gus Correa
Hi Reuti See below, please. On 11/13/2014 07:19 AM, Reuti wrote: Gus, On 13.11.2014 at 02:59, Gus Correa wrote: On 11/12/2014 05:45 PM, Reuti wrote: On 12.11.2014 at 17:27, Reuti wrote: On 11.11.2014 at 02:25, Ralph Castain wrote: Another thing you can do is (a) ensure you built

Re: [OMPI users] How OMPI picks ethernet interfaces

2014-11-13 Thread Gus Correa
:) ) My vote (... well, I don't have voting rights on that, but I'll vote anyway ...) is to keep the current approach. It is wise and flexible, and easy to adjust and configure to specific machines with their own oddities, via MCA parameters, as I tried to explain in previous postings.

Re: [OMPI users] Open mpi based program runs as root and gives SIGSEGV under unprivileged user

2014-12-10 Thread Gus Correa
number of open files is yet another hurdle. And if you're using InfiniBand, the max locked memory size should be unlimited. Check /etc/security/limits.conf and "ulimit -a". I hope this helps, Gus Correa On 12/10/2014 08:28 AM, Gilles Gouaillardet wrote: Luca, your email mention
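The limits mentioned in this reply can be inspected from the shell that launches the job. The sketch below is only a convenience wrapper around the ulimit builtin: check_memlock is a hypothetical helper name, and the assumption that "ulimit -l" reports kbytes holds for bash but may differ in other shells.

```shell
# Hypothetical helper: interpret a value as printed by "ulimit -l".
check_memlock() {
    if [ "$1" = "unlimited" ]; then
        echo "locked memory: ok (unlimited)"
    else
        # Assumption: the shell reports this limit in kbytes, as bash does
        echo "locked memory: limited to $1 kbytes"
    fi
}

# Soft limits of the current shell; run this on each compute node,
# ideally from inside a batch job, since daemons may impose other limits.
check_memlock "$(ulimit -l)"
echo "open files: $(ulimit -n)"
```

If the value is not "unlimited" on some node, the entries in /etc/security/limits.conf (and the limits inherited by the scheduler daemon) are the places to look.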

Re: [OMPI users] Icreasing OFED registerable memory

2014-12-30 Thread Gus Correa
egory=openfabrics#ib-locked-pages-more http://www.open-mpi.org/faq/?category=openfabrics#ib-low-reg-mem *** Having said that, a question remains unanswered: Why is Infiniband such a nightmare? *** I hope this helps, Gus Correa On 12/30/2014 09:16 AM, Waleed Lotfy wrote: Thank Devendar for your response.

Re: [OMPI users] Icreasing OFED registerable memory

2015-01-06 Thread Gus Correa
t I sent you before for more details. I hope this helps, Gus Correa On 01/06/2015 01:37 PM, Deva wrote: Hi Waleed, -- Memlock limit: 65536 -- such a low limit should be due to per-user lock memory limit. Can you make sure it is set to "unlimited" on all nodes ("

Re: [OMPI users] libpsm_infinipath issues?

2015-01-08 Thread Gus Correa
Hi Michael, Andrew, list knem doesn't work in OMPI 1.8.3. See this thread: http://www.open-mpi.org/community/lists/users/2014/10/25511.php A fix was promised on OMPI 1.8.4: http://www.open-mpi.org/software/ompi/v1.8/ Have you tried it? I hope this helps, Gus Correa On 01/08/2015 04:

Re: [OMPI users] MPI_Type_Create_Struct + MPI_TYPE_CREATE_RESIZED

2015-01-08 Thread Gus Correa
uggested a while back. I hope this helps, Gus Correa Thanks again Diego On 8 January 2015 at 23:24, George Bosilca <bosi...@icl.utk.edu> wrote: Diego, Please find below the corrected example. There were several issues but the most important one, which is certainly

Re: [OMPI users] MPI_Type_Create_Struct + MPI_TYPE_CREATE_RESIZED

2015-01-08 Thread Gus Correa
Hi Diego *EITHER* declare your QQ and PR (?) structure components as DOUBLE PRECISION *OR* keep them REAL(dp) but *fix* your "dp" definition, as George Bosilca suggested. Gus Correa On 01/08/2015 06:36 PM, Diego Avesani wrote: Dear Gus, Dear All, so are you suggesting to

Re: [OMPI users] MPI_type_create_struct + MPI_Type_vector + MPI_Type_contiguous

2015-01-13 Thread Gus Correa
(as you did in your previous code, with all the surprises regarding alignment, etc), not array sections. Also, MPI type vector should be more easy going (and probably more efficient) than MPI type struct, with less memory alignment problems. I hope this helps, Gus Correa PS - These books have a

Re: [OMPI users] MPI_type_create_struct + MPI_Type_vector + MPI_Type_contiguous

2015-01-15 Thread Gus Correa
al/MPI/content6.html Gus Correa On 01/15/2015 06:53 PM, Diego Avesani wrote: dear George, dear Gus, dear all, Could you please tell me where I can find a good example? I am sorry but I can not understand the 3D array. Really Thanks Diego On 15 January 2015 at 20:13, George Bosilca mailto:bosi

[OMPI users] How to handle strides in MPI_Create_type_subarray - Re: MPI_type_create_struct + MPI_Type_vector + MPI_Type_contiguous

2015-01-16 Thread Gus Correa
Is there any simple example of how to achieve stride effect with MPI_Create_type_subarray in a multi-dimensional array? BTW, when are you gentlemen going to write an updated version of the "MPI - The Complete Reference"? :) Thank you, Gus Correa (Hijacking Diego Avesani's thread, a

Re: [OMPI users] How to handle strides in MPI_Create_type_subarray - Re: MPI_type_create_struct + MPI_Type_vector + MPI_Type_contiguous

2015-01-16 Thread Gus Correa
Hi George Many thanks for your answer and interest in my questions. ... so ... more questions inline ... On 01/16/2015 03:41 PM, George Bosilca wrote: Gus, Please see my answers inline. On Jan 16, 2015, at 14:24 , Gus Correa wrote: Hi George It is still not clear to me how to deal with

Re: [OMPI users] mpirun fails across cluster

2015-02-27 Thread Gus Correa
s obscure about this, not making clear the difference between /export/apps and /share/apps. Issuing the Rocks commands: "tentakel 'ls -d /export/apps'" "tentakel 'ls -d /share/apps'" may show something useful. I hope this helps, Gus Correa On 02/27/2015 11:47

Re: [OMPI users] mpirun fails across cluster

2015-02-27 Thread Gus Correa
Hi Syed Ahsan Ali To avoid any leftovers and further confusion, I suggest that you completely delete the old installation directory. Then start fresh from the configure step with --prefix=/share/apps/openmpi-1.8.4_gcc-4.9.2 I hope this helps, Gus Correa On 02/27/2015 12

Re: [OMPI users] mpirun fails across cluster

2015-02-27 Thread Gus Correa
a common cause of trouble. OpenMPI needs PATH and LD_LIBRARY_PATH at runtime also. I hope this helps, Gus Correa On Fri, Feb 27, 2015 at 10:44 PM, Syed Ahsan Ali wrote: Dear Gus Thanks once again for suggestion. Yes I did that before installation to new path. I am getting error now about some
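The runtime environment mentioned above can be sketched as follows. This is a minimal sketch, not the poster's actual setup: it assumes the install prefix is the one suggested for the configure step in this thread, and that the libraries live under lib (some builds use lib64). The variable name OMPI_PREFIX is made up for illustration.

```shell
# Assumed install prefix, taken from the configure suggestion in this thread.
OMPI_PREFIX=/share/apps/openmpi-1.8.4_gcc-4.9.2

# Prepend the Open MPI bin and lib directories.
# ${VAR:+...} avoids leaving a trailing ':' when the variable was empty.
export PATH="${OMPI_PREFIX}/bin${PATH:+:${PATH}}"
export LD_LIBRARY_PATH="${OMPI_PREFIX}/lib${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}"

# Sanity check: show which mpirun the shell now resolves, if any.
command -v mpirun || true
```

For jobs that span several nodes, these settings would typically go into a shell startup file that non-interactive logins also read, so that the remote nodes see the same PATH and LD_LIBRARY_PATH.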

Re: [OMPI users] No core dump in some cases

2016-05-09 Thread Gus Correa
I do this in the pbs_mom daemon init script (I am still before the systemd era, that lovely POS). I also set the hard/soft limits in /etc/security/limits.conf. I hope this helps, Gus Correa On 05/07/2016 12:27 PM, Jeff Squyres (jsquyres) wrote: I'm afraid I don't know what a .btr

Re: [OMPI users] No core dump in some cases

2016-05-10 Thread Gus Correa
carnation of an OpenMPI 1.6.5 question similar to yours (where .btr stands for backtrace): http://stackoverflow.com/questions/25275450/cause-all-processes-running-under-openmpi-to-dump-core Could this be due to a (unlikely) mix of OpenMPI 1.10 with 1.6.5? Gus Correa On Mon, May 9, 2016 at 12:04

Re: [OMPI users] [slightly off topic] hardware solutions with monetary cost in mind

2016-05-20 Thread Gus Correa
info/beowulf I hope this helps, Gus Correa

Re: [OMPI users] "failed to create queue pair" problem, but settings appear OK

2016-06-15 Thread Gus Correa
ed 3) See also this FAQ related to registered memory. I set these parameters in /etc/modprobe.d/mlx4_core.conf, but where they're set may depend on the Linux distro/release and the OFED you're using. https://www.open-mpi.org/faq/?category=openfabrics#ib-low-reg-mem I hope this helps,

Re: [OMPI users] "failed to create queue pair" problem, but settings appear OK

2016-06-15 Thread Gus Correa
(#18 in tuning runtime MPI to OpenFabrics) concerns the OFED kernel module parameters log_num_mtt and log_mtts_per_seg, not the openib btl mca parameters. They may default to a less-than-optimal value. https://www.open-mpi.org/faq/?category=openfabrics#ib-low-reg-mem Gus Correa (not Chuck
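To see why these two parameters matter, here is a rough worked example. As far as I recall, the FAQ entry linked above gives the maximum registerable memory as (2^log_num_mtt) * (2^log_mtts_per_seg) * PAGE_SIZE. The values below are assumptions chosen only for illustration, not recommendations, and the page size is fixed at 4096 bytes (check with getconf PAGE_SIZE on the actual node).

```shell
# Assumed example values; the right ones depend on the node's RAM and the OFED release.
log_num_mtt=24
log_mtts_per_seg=1
page_size=4096   # assumed; typical on x86_64

# max_reg_mem = 2^log_num_mtt * 2^log_mtts_per_seg * PAGE_SIZE (in bytes)
max_reg_mem=$(( (1 << log_num_mtt) * (1 << log_mtts_per_seg) * page_size ))
echo "max registerable memory: $(( max_reg_mem / 1024 / 1024 / 1024 )) GiB"

# The corresponding module option line, e.g. for /etc/modprobe.d/mlx4_core.conf:
echo "options mlx4_core log_num_mtt=${log_num_mtt} log_mtts_per_seg=${log_mtts_per_seg}"
```

With these assumed values the result is 128 GiB. If I remember the FAQ correctly, it suggests being able to register about twice the physical memory, so these numbers would suit a node with roughly 64 GiB of RAM.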

Re: [OMPI users] Restart after code hangs

2016-06-16 Thread Gus Correa
r/cluster), but in your case it can be adjusted to how often the program fails. All atmosphere/ocean/climate/weather_forecast models work this way (that's what we mostly run here). I guess most CFD, computational Chemistry, etc, programs also do. I hope this helps, Gus Correa On 06/16/2016 0

Re: [OMPI users] how to build with memchecker using valgrind, preferable linux distro install of valgrind?

2016-07-14 Thread Gus Correa
Maybe just --with-valgrind or --with-valgrind=/usr would work? On 07/14/2016 11:32 AM, David A. Schneider wrote: I thought it would be a good idea to build a debugging version of openmpi 1.10.3. Following the instructions in the FAQ: https://www.open-mpi.org/faq/?category=debugging#memchecker_ho

Re: [OMPI users] MPI_ABORT was invoked on rank 0 in communicator compute with errorcode 59

2016-11-15 Thread Gus Correa
e more user friendly. You could also compile it with the flag -traceback (or -fbacktrace, the syntax depends on the compiler, check the compiler man page). This at least will tell you the location in the program where the segmentation fault happened (in the STDERR file of your job). I hope this h

Re: [OMPI users] Help

2017-04-27 Thread Gus Correa
: command not found” I am following the instruction from here: https://na-inet.jp/na/pccluster/centos_x86_64-en.html Any help is much appreciated. J Corina You need to install openmpi.x86_64 also, not only openmpi-devel.x86_64. That is the minimum. I hope this helps, Gus Correa

Re: [OMPI users] Q: Basic invoking of InfiniBand with OpenMPI

2017-07-13 Thread Gus Correa
Have you tried: -mca btl vader,openib,self or -mca btl sm,openib,self by chance? That adds a btl for intra-node communication (vader or sm). On 07/13/2017 05:43 PM, Boris M. Vulovic wrote: I would like to know how to invoke InfiniBand hardware on CentOS 6x cluster with OpenMPI (static li
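The BTL selection suggested above can also be set through the environment, since Open MPI reads MCA parameters from variables named OMPI_MCA_<param>. A minimal sketch, where ./my_app and the process count are placeholders, and the mpirun line is only echoed rather than run:

```shell
# Select the BTLs through the environment instead of the command line.
# vader (or sm on older releases) covers intra-node traffic, openib covers InfiniBand,
# and self covers a process sending to itself.
export OMPI_MCA_btl="vader,openib,self"

# Equivalent command-line form; ./my_app and -np 4 are placeholders.
echo mpirun -mca btl "${OMPI_MCA_btl}" -np 4 ./my_app
```

Which of vader or sm is available depends on the Open MPI version; ompi_info lists the components actually built into a given installation.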

Re: [OMPI users] Q: Basic invoking of InfiniBand with OpenMPI

2017-07-17 Thread Gus Correa
aq/?category=all#tcp-selection BTW, some of your questions (and others that you may hit later) are covered in the OpenMPI FAQ: https://www.open-mpi.org/faq/?category=all I hope this helps, Gus Correa On 07/17/2017 12:43 PM, Boris M. Vulovic wrote: Gus, Gilles, Russell, John: Thanks very much f
