Re: [OMPI users] How do I run OpenMPI safely on a Nehalem standalone machine?

2010-05-04 Thread Gus Correa
Hi Doug Thank you for your input. I fully agree with you. I do not expect to get much from hyperthreading in terms of performance. However, at this point I am just interested in having Open MPI working right with *both* HT on and HT off. Anyway, back to your comment about the usefulness of HT. T

Re: [OMPI users] How do I run OpenMPI safely on a Nehalem standalone machine?

2010-05-04 Thread Doug Reeder
Hello, I have a Mac with two quad-core Nehalem chips (8 cores). The sysctl command shows 16 CPUs (apparently with hyperthreading). I have a finite element code that runs in parallel using openmpi. On the single machine, a run using openmpi -np 8 takes about 2/3 of the time of a run with -np

Re: [OMPI users] Run time error of openmpi 1.4.1

2010-05-04 Thread Ralph Castain
Strange - can you check your prefix/lib area to see if there are any *paffinity* libs in it? For some reason, it looks like nothing built. On May 4, 2010, at 5:37 PM, David Logan wrote: > Hi All, > > I'm having problems with openmpi 1.4.1 and am receiving the following error > message when I

Re: [OMPI users] How do I run OpenMPI safely on a Nehalem standalone machine?

2010-05-04 Thread Jeff Squyres
I'd actually be a little surprised if HT was the problem. I run with HT enabled on my nehalem boxen all the time. It's pretty surprising that Open MPI is causing a hard lockup of your system; user-level processes shouldn't be able to do that. Notes: 1. With HT enabled, as you noted, Linux wi

Re: [OMPI users] How do I run OpenMPI safely on a Nehalem standalone machine?

2010-05-04 Thread Gus Correa
Hi Ralph Thank you so much for your help. You are right, paffinity is turned off (default): ** /opt/sw/openmpi/1.4.2/gnu-4.4.3-4/bin/ompi_info --param opal all | grep paffinity MCA opal: parameter "opal_paffinity_alone" (current value: "0", data source: default val

Re: [OMPI users] How do I run OpenMPI safely on a Nehalem standalone machine?

2010-05-04 Thread Gus Correa
Hi Douglas Yes, very helpful indeed! The machine here is a two-way quad-core, and /proc/cpuinfo shows 16 processors, twice as many as the physical cores, just like you see on yours. So, HT is turned on for sure. The security guard opened the office door for me, and I could reboot that machine

Re: [OMPI users] How do I run OpenMPI safely on a Nehalem standalone machine?

2010-05-04 Thread Douglas Guptill
On Tue, May 04, 2010 at 05:34:40PM -0600, Ralph Castain wrote: > > On May 4, 2010, at 4:51 PM, Gus Correa wrote: > > > Hi Ralph > > > > Ralph Castain wrote: > >> One possibility is that the sm btl might not like that you have > >> hyperthreading enabled. > > > > I remember that hyperthreading

[OMPI users] Run time error of openmpi 1.4.1

2010-05-04 Thread David Logan
Hi All, I'm having problems with openmpi 1.4.1 and am receiving the following error message when I try to run a test job. [root@hydra ~]# mpirun -n 2 --prefix `dirname $MPILIBDIR` -v -show-progress -machinefile ./nodes.to.use -pernode ./dml_test -

Re: [OMPI users] How do I run OpenMPI safely on a Nehalem standalone machine?

2010-05-04 Thread Ralph Castain
On May 4, 2010, at 4:51 PM, Gus Correa wrote: > Hi Ralph > > Ralph Castain wrote: >> One possibility is that the sm btl might not like that you have >> hyperthreading enabled. > > I remember that hyperthreading was discussed months ago, > in the previous incarnation of this problem/thread/disc

Re: [OMPI users] How do I run OpenMPI safely on a Nehalem standalone machine?

2010-05-04 Thread Gus Correa
Hi Ralph Ralph Castain wrote: One possibility is that the sm btl might not like that you have hyperthreading enabled. I remember that hyperthreading was discussed months ago, in the previous incarnation of this problem/thread/discussion on "Nehalem vs. Open MPI". (It sounds like one of those

Re: [OMPI users] How do I run OpenMPI safely on a Nehalem standalone machine?

2010-05-04 Thread Gus Correa
Hi Jeff Sorry, same problem with v1.4.2. Without any mca parameters set (i.e. withOUT -mca btl ^sm), hello_c.c runs OK for np = 4 and 8. (However slower than with the "sm" turned off in 1.4.1, as suggested by Ralph an hour ago.) Nevertheless, when I try np=16 it segfaults, with the syslog messa

Re: [OMPI users] How do I run OpenMPI safely on a Nehalem standalone machine?

2010-05-04 Thread Ralph Castain
One possibility is that the sm btl might not like that you have hyperthreading enabled. Another thing to check: do you have any paffinity settings turned on (e.g., mpi_paffinity_alone)? Our paffinity system doesn't handle hyperthreading at this time. I'm just suspicious of the HT since you hav

Re: [OMPI users] How do I run OpenMPI safely on a Nehalem standalone machine?

2010-05-04 Thread Gus Correa
Hi Jeff Sure, I will certainly try v1.4.2. I am downloading it right now. As of this morning, when I first downloaded, the web site still had 1.4.1. Maybe I should have refreshed the web page on my browser. I will tell you how it goes. Gus Jeff Squyres wrote: Gus -- Can you try v1.4.2 which w

Re: [OMPI users] How do I run OpenMPI safely on a Nehalem standalone machine?

2010-05-04 Thread Jeff Squyres
Gus -- Can you try v1.4.2 which was just released today? On May 4, 2010, at 4:18 PM, Gus Correa wrote: > Hi Ralph > > Thank you very much. > The "-mca btl ^sm" workaround seems to have solved the problem, > at least for the little hello_c.c test. > I just ran it fine up to 128 processes. > > I

Re: [OMPI users] Fortran derived types

2010-05-04 Thread Vedran Coralic
Yes, all the component arrays of the derived type vector are of the same size, though I am not sure that that actually makes the task any easier? I suspected, just as you said, that copying the data into a contiguous block of memory might be the best solution. I was hoping though that I could make

Re: [OMPI users] Fortran derived types

2010-05-04 Thread Cole, Derek E
Others may be able to chime in more, because I am no fortran expert, but you probably will have to copy it into a contiguous block in memory. Working with derived types is hard, especially if they are not uniform. MPI can probably technically handle it, but the programming effort is harder. Are

Re: [OMPI users] How do I run OpenMPI safely on a Nehalem standalone machine?

2010-05-04 Thread Gus Correa
Hi Ralph Thank you very much. The "-mca btl ^sm" workaround seems to have solved the problem, at least for the little hello_c.c test. I just ran it fine up to 128 processes. I confess I am puzzled by this workaround. * Why should we turn off "sm" in a standalone machine, where everything is supp
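The "-mca btl ^sm" workaround mentioned in this message can be sketched as a small Python helper that builds the mpirun argument list. This is only an illustration: the function name mpirun_command and the executable name ./hello_c (assumed to be built from the hello_c.c test named in the thread) are assumptions, not something the posters wrote.

```python
import shlex


def mpirun_command(program, nprocs, disable_sm=True):
    """Build an mpirun argument list (hypothetical helper, not part of Open MPI)."""
    argv = ["mpirun"]
    if disable_sm:
        # "-mca btl ^sm" excludes the shared-memory (sm) BTL, as suggested in the thread.
        argv += ["-mca", "btl", "^sm"]
    argv += ["-np", str(nprocs), program]
    return argv


# Show the command as it could be typed into a shell; "./hello_c" is an assumed name.
print(shlex.join(mpirun_command("./hello_c", 16)))
```

Passing the list to subprocess.run would avoid any shell interpretation of the "^" character; shlex.join is used here only to display a safely quoted command line.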

[OMPI users] request_get_status: Recheck request status [PATCH]

2010-05-04 Thread Shaun Jackman
Hi Jeff, request_get_status polls request->req_complete before calling opal_progress. Ideally, it would check req_complete, call opal_progress, and check req_complete one final time. This patch identically mirrors the logic of ompi_request_default_test in ompi/request/req_test.c. We've discussed
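The ordering described in this message (check the completion flag, make one progress call, then check the flag one final time) can be sketched in Python. This is not Open MPI code: the Request class, the get_status function, and the progress callback are hypothetical stand-ins for request->req_complete and opal_progress.

```python
class Request:
    """Hypothetical stand-in for an MPI request with a completion flag."""

    def __init__(self):
        self.complete = False


def get_status(request, progress):
    """Return True if the request is complete, driving progress at most once.

    Order of operations: check, progress, check again.
    """
    if request.complete:
        return True          # already done, no need to drive progress
    progress()               # stand-in for opal_progress()
    return request.complete  # final recheck after the progress call
```

With the final recheck, a request that completes during the progress call is reported as complete by the same call rather than by the next one.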

Re: [OMPI users] How do I run OpenMPI safely on a Nehalem standalone machine?

2010-05-04 Thread Ralph Castain
I would certainly try it with -mca btl ^sm and see if that solves the problem. On May 4, 2010, at 2:38 PM, Eugene Loh wrote: > Gus Correa wrote: > >> Dear Open MPI experts >> >> I need your help to get Open MPI right on a standalone >> machine with Nehalem processors. >> >> How to tweak the mca par

Re: [OMPI users] How do I run OpenMPI safely on a Nehalem standalone machine?

2010-05-04 Thread Eugene Loh
Gus Correa wrote: Dear Open MPI experts I need your help to get Open MPI right on a standalone machine with Nehalem processors. How to tweak the mca parameters to avoid problems with Nehalem (and perhaps AMD processors also), where MPI programs hang, was discussed here before. However, I lost

[OMPI users] How do I run OpenMPI safely on a Nehalem standalone machine?

2010-05-04 Thread Gus Correa
Dear Open MPI experts I need your help to get Open MPI right on a standalone machine with Nehalem processors. How to tweak the mca parameters to avoid problems with Nehalem (and perhaps AMD processors also), where MPI programs hang, was discussed here before. However, I lost track of the detail

[OMPI users] Fortran derived types

2010-05-04 Thread Vedran Coralic
Hello, In my Fortran 90 code I use several custom-defined derived types. Amongst them is a vector of arrays, i.e. v(:)%f(:,:,:). I am wondering what the proper way of sending this data structure from one processor to another is. Is the best way to just restructure the data by copying it into a vec

[OMPI users] Open MPI 1.4.2 released

2010-05-04 Thread Ralph Castain
The Open MPI Team, representing a consortium of research, academic, and industry partners, is pleased to announce the release of Open MPI version 1.4.2. This release is mainly a bug fix release over the v1.4.1 release. We strongly recommend that all users upgrade to version 1.4.2 if possible.

Re: [OMPI users] Calling MPI from a CGI script

2010-05-04 Thread Jeff Squyres
On Apr 29, 2010, at 2:25 PM, Srujan Enaganti wrote: > I am trying to run an MPI program as a CGI Python script which is running > over an Apache web server running locally on my computer. > > I have a test.py file which has the code snippet > > cmd = 'opt/local/bin/mpiexec -np 10 testmpi' > outp