Re: [OMPI users] mpi with icc, icpc and ifort :: segfault (Jeff Squyres)

2007-07-11 Thread Jeff Squyres
On Jul 11, 2007, at 10:52 AM, Ricardo Reis wrote: Whoa -- if you are failing here, something is definitely wrong: this is failing when accessing stack memory! Are you able to compile/run other trivial and non-trivial C++ applications using your Intel compiler installation? Please ignore my la

Re: [OMPI users] MPI_Reduce problem

2007-07-11 Thread anyi li
Hi Jelena, int* ttt = (int*)malloc(2 * sizeof(int)); ttt[0] = myworldrank + 1; ttt[1] = myworldrank * 2; if(myworldrank == 0) MPI_Reduce(MPI_IN_PLACE, ttt, 2, MPI_INT, MPI_SUM, 0, MPI_COMM_WORLD); //sum all logdetphi from different nodes else MPI_Reduce(ttt, NULL, 2, MPI_INT, MPI_SUM,

Re: [OMPI users] Recursive use of "orterun" (Ralph H Castain)

2007-07-11 Thread Ralph Castain
Hooray! Glad we could track it down. The problem here is that you might actually want to provide a set of variables to direct that second orterun's behavior. Fortunately, we actually provided you with a way to do it! You can set any MCA param on the command line by just doing "-mca param value".
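Ralph's "-mca param value" form can be sketched like this; the particular parameter (`btl`) and the application name are illustrative, not taken from the thread:

```shell
# Pass an MCA parameter directly on the nested orterun's command line
# instead of letting it be inherited from the parent's environment.
# (hypothetical app name; "btl tcp,self" is just one example parameter)
orterun -np 2 -mca btl tcp,self ./inner_app
```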

Re: [OMPI users] Recursive use of "orterun" (Ralph H Castain)

2007-07-11 Thread Lev Gelb
Well done, that was exactly the problem - Python's os.environ passes the complete collection of shell variables. I tried a different os method (os.execve), where I could specify the environment (I took out all the OMPI_* variables), and the second orterun call worked! Now I just need a clea
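Lev's fix (spawning the nested orterun with the OMPI_* variables stripped out) can be sketched as below; the function names are hypothetical, and subprocess is used in place of os.execve for illustration:

```python
import os
import subprocess


def strip_ompi_env(environ):
    """Return a copy of an environment mapping without any OMPI_* variables."""
    return {k: v for k, v in environ.items() if not k.startswith("OMPI_")}


def spawn_clean(cmd):
    """Launch a nested orterun in a scrubbed environment, so the child
    does not believe it is already running inside an Open MPI job."""
    return subprocess.call(cmd, env=strip_ompi_env(os.environ))


# usage (hypothetical): spawn_clean(["orterun", "-np", "2", "python", "myapp.py"])
```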

Re: [OMPI users] MPI_Reduce problem

2007-07-11 Thread Jelena Pjesivac-Grbovic
Hi Anyi, you are using reduce incorrectly: you cannot use the same buffer as input and output. If you want to do the operation in place, you must specify "MPI_IN_PLACE" as the send buffer at the root process. Thus, your code should look something like: int* ttt = (int*)malloc(2 * sizeof(int
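A complete version of the in-place reduction Jelena describes might look like the following; this is a sketch assembled from the fragments in the thread (variable names follow the poster's), not the poster's actual program:

```c
/* In-place sum-reduction to rank 0: the root passes MPI_IN_PLACE as the
   send buffer, non-root ranks pass only a send buffer. Compile with mpicc. */
#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    int myworldrank;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &myworldrank);

    int *ttt = (int *)malloc(2 * sizeof(int));
    ttt[0] = myworldrank + 1;
    ttt[1] = myworldrank * 2;

    if (myworldrank == 0)
        /* Root reduces in place: ttt serves as both input and output. */
        MPI_Reduce(MPI_IN_PLACE, ttt, 2, MPI_INT, MPI_SUM, 0, MPI_COMM_WORLD);
    else
        /* Non-root ranks only contribute; the receive buffer is unused. */
        MPI_Reduce(ttt, NULL, 2, MPI_INT, MPI_SUM, 0, MPI_COMM_WORLD);

    if (myworldrank == 0)
        printf("sums: %d %d\n", ttt[0], ttt[1]);

    free(ttt);
    MPI_Finalize();
    return 0;
}
```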

[OMPI users] MPI_Reduce problem

2007-07-11 Thread anyili
Hi, I have a code which has an identical vector on each node; I want to compute the vector sum and return the result to root. Like this: int* ttt = (int*)malloc(2 * sizeof(int)); ttt[0] = myworldrank + 1; ttt[1] = myworldrank * 2; MPI_Allreduce(ttt, ttt, 2, MPI_INT, MPI_SUM, MPI_COMM_

Re: [OMPI users] Recursive use of "orterun" (Ralph H Castain)

2007-07-11 Thread Ralph Castain
Hmmm...interesting. As a cross-check on something - if you call os.system, does your environment by any chance get copied across? Reason I ask: we set a number of environment variables when orterun spawns a process. If you call orterun from within that process - and the new orterun sees the enviro var

Re: [OMPI users] Problems running openmpi under os x

2007-07-11 Thread Brian Barrett
That's unexpected. If you run the command 'ompi_info --all', it should list (towards the top) things like the Bindir and Libdir. Can you see if those have sane values? If they do, can you try running a simple hello, world type MPI application (there's one in the OMPI tarball). It almost
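Brian's sanity check can be run as follows; the grep is just a convenience to pull the two fields he mentions out of the (long) full listing:

```shell
# Show the install paths Open MPI reports about itself; Bindir and Libdir
# should point at a sane, existing installation prefix.
ompi_info --all | grep -E 'Bindir|Libdir'
```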

Re: [OMPI users] Recursive use of "orterun" (Ralph H Castain)

2007-07-11 Thread Lev Gelb
Thanks for the suggestions. The separate 'orted' scheme (below) did not work, unfortunately; same behavior as before. I have conducted a few other simple tests, and found: 1. The problem only occurs if the first process is "in" MPI; if it doesn't call MPI_Init or calls MPI_Finalize before it

Re: [OMPI users] Recursive use of "orterun" (Ralph H Castain)

2007-07-11 Thread Ralph H Castain
Hmmm...well, what that indicates is that your application program is losing the connection to orterun, but that orterun is still alive and kicking (it is alive enough to send the [0,0,1] daemon a message ordering it to exit). So the question is: why is your application program dropping the connecti

Re: [OMPI users] Recursive use of "orterun" (Ralph H Castain)

2007-07-11 Thread Lev Gelb
OK, I've added the debug flags - when I add them to the os.system instance of orterun, there is no additional output, but when I add them to the orterun instance controlling the python program, I get the following: orterun -np 1 --debug-daemons -mca odls_base_verbose 1 python ./test.py Daemon [

Re: [OMPI users] Problems running openmpi under os x

2007-07-11 Thread Tim Cornwell
Open MPI: 1.2.3
Open MPI SVN revision: r15136
Open RTE: 1.2.3
Open RTE SVN revision: r15136
OPAL: 1.2.3
OPAL SVN revision: r15136
Prefix: /usr/local
Configured architecture: i386-apple-darwin8.10.1
Hi Brian, 1.2

Re: [OMPI users] Recursive use of "orterun"

2007-07-11 Thread Ralph H Castain
I'm unaware of any issues that would cause it to fail just because it is being run via that interface. The error message is telling us that the procs got launched, but then orterun went away unexpectedly. Are you seeing your procs complete? We do sometimes see that message due to a race condition

Re: [OMPI users] openmpi fails on mx endpoint

2007-07-11 Thread Tim Prins
Or you can simply tell the mx mtl not to run by adding "-mca mtl ^mx" to the command line. George: There is an open bug about this problem: https://svn.open-mpi.org/trac/ompi/ticket/1080 Tim George Bosilca wrote: There seems to be a problem with MX, because of a conflict between our MTL and t
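Tim's workaround as a full command line; the process count and application name are illustrative, only the "-mca mtl ^mx" part comes from the message:

```shell
# Exclude the MX MTL (the ^ prefix negates the component list), so Open MPI
# falls back to the BTL path and avoids the MTL/BTL endpoint conflict.
mpirun -np 4 -mca mtl ^mx ./my_mpi_app
```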

[OMPI users] Recursive use of "orterun"

2007-07-11 Thread Lev Gelb
Hi - I'm trying to port an application to use OpenMPI, and running into a problem. The program (written in Python, parallelized using either of "pypar" or "pyMPI") itself invokes "mpirun" in order to manage external, parallel processes, via something like: orterun -np 2 python myapp.py whe

Re: [OMPI users] openmpi fails on mx endpoint

2007-07-11 Thread George Bosilca
There seems to be a problem with MX, because of a conflict between our MTL and the BTL. So, I suspect that if you want it to run [right now] you should spawn fewer processes per node than the number of MX endpoints supported (one fewer). I'll take a look this afternoon. Thanks, george. On Jul 11, 2007, at 12:3

Re: [OMPI users] openmpi fails on mx endpoint

2007-07-11 Thread Warner Yuen
The hostfile was changed around as we tried to pull out nodes that we thought might have been bad, but none were oversubscribed, if that's what you mean. Warner Yuen Scientific Computing Consultant Apple Computer On Jul 11, 2007, at 9:00 AM, users-requ...@open-mpi.org wrote: Message: 3

Re: [OMPI users] openmpi fails on mx endpoint busy

2007-07-11 Thread George Bosilca
What's in the hostmx10g file? How many hosts? george. On Jul 11, 2007, at 1:34 AM, Warner Yuen wrote: I've also had someone run into the endpoint busy problem. I never figured it out, I just increased the default endpoints on MX-10G from 8 to 16 endpoints to make the problem go away. He

Re: [OMPI users] mpi with icc, icpc and ifort :: segfault (Jeff Squyres)

2007-07-11 Thread Ricardo Reis
On Tue, 10 Jul 2007, Jeff Squyres wrote: Whoa -- if you are failing here, something is definitely wrong: this is failing when accessing stack memory! Are you able to compile/run other trivial and non-trivial C++ applications using your Intel compiler installation? Please ignore my last reply.

Re: [OMPI users] Problems running openmpi under os x

2007-07-11 Thread Brian Barrett
Which version of Open MPI are you using? Thanks, Brian On Jul 11, 2007, at 3:32 AM, Tim Cornwell wrote: I have a problem running openmpi under OS 10.4.10. My program runs fine under debian x86_64 on an opteron but under OS X on a number of Mac Book and Mac Book Pros, I get the following

Re: [OMPI users] mpi with icc, icpc and ifort :: segfault (Jeff Squyres)

2007-07-11 Thread Ricardo Reis
On Tue, 10 Jul 2007, Jeff Squyres wrote: Whoa -- if you are failing here, something is definitely wrong: this is failing when accessing stack memory! Are you able to compile/run other trivial and non-trivial C++ applications using your Intel compiler installation? I don't have trivial or non-

[OMPI users] Problems running openmpi under os x

2007-07-11 Thread Tim Cornwell
I have a problem running openmpi under OS 10.4.10. My program runs fine under debian x86_64 on an opteron, but under OS X on a number of Mac Book and Mac Book Pros I get the following immediately on startup. This smells like a common problem but I couldn't find anything relevant anywhere. Ca

[OMPI users] openmpi fails on mx endpoint busy

2007-07-11 Thread Warner Yuen
I've also had someone run into the endpoint busy problem. I never figured it out, I just increased the default endpoints on MX-10G from 8 to 16 endpoints to make the problem go away. Here's the actual command and error before setting the endpoints to 16. The version is MX-1.2.1 with OMPI 1.2