Re: [OMPI users] MPI_COMPLEX16

2012-05-23 Thread David Singleton
On 05/23/2012 07:30 PM, Patrick Le Dot wrote: David Singleton <...@anu.edu.au> writes: I should have checked earlier - same for MPI_COMPLEX and MPI_COMPLEX8. David On 04/27/2012 08:43 AM, David Singleton wrote: Apologies if this has already been covered somewhere. One of our users

Re: [OMPI users] MPI_COMPLEX16

2012-04-26 Thread David Singleton
I should have checked earlier - same for MPI_COMPLEX and MPI_COMPLEX8. David On 04/27/2012 08:43 AM, David Singleton wrote: Apologies if this has already been covered somewhere. One of our users has noticed that MPI_COMPLEX16 is flagged as an invalid type in 1.5.4 but not in 1.4.3 while

[OMPI users] MPI_COMPLEX16

2012-04-26 Thread David Singleton
Apologies if this has already been covered somewhere. One of our users has noticed that MPI_COMPLEX16 is flagged as an invalid type in 1.5.4 but not in 1.4.3 while MPI_DOUBLE_COMPLEX is accepted for both. This is with either gfortran or intel-fc. Superficially, the configure looks the same for
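A minimal C probe of the two datatype handles (a sketch, not from the thread; it assumes mpi.h exposes MPI_DOUBLE_COMPLEX and defines MPI_COMPLEX16 as a macro, as Open MPI's mpi.h does):

/* check_complex16.c - probe the optional Fortran complex types from C.
 * Sketch under assumptions: MPI_DOUBLE_COMPLEX is available, and
 * MPI_COMPLEX16 is provided as a macro, so the #ifdef guard keeps the
 * probe compilable on builds that lack it. */
#include <mpi.h>
#include <stdio.h>

static void probe(const char *name, MPI_Datatype type)
{
    int size, rc = MPI_Type_size(type, &size);
    if (rc == MPI_SUCCESS)
        printf("%s: size = %d bytes\n", name, size);
    else
        printf("%s: flagged invalid (error %d)\n", name, rc);
}

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    /* Return error codes instead of aborting, so the probe can report them. */
    MPI_Comm_set_errhandler(MPI_COMM_WORLD, MPI_ERRORS_RETURN);

    probe("MPI_DOUBLE_COMPLEX", MPI_DOUBLE_COMPLEX);
#ifdef MPI_COMPLEX16
    probe("MPI_COMPLEX16", MPI_COMPLEX16);
#endif

    MPI_Finalize();
    return 0;
}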

Re: [OMPI users] EXTERNAL: Re: How to set up state-less node /tmp for OpenMPI usage

2011-11-06 Thread David Singleton
On 11/05/2011 09:11 AM, Blosch, Edwin L wrote: .. I know where you're coming from, and I probably didn't title the post correctly because I wasn't sure what to ask. But I definitely saw it, and still see it, as an OpenMPI issue. Having /tmp mounted over NFS on a stateless cluster is not a

Re: [OMPI users] openmpi/pbsdsh/Torque problem

2011-04-03 Thread David Singleton
On 04/04/2011 12:56 AM, Ralph Castain wrote: What I still don't understand is why you are trying to do it this way. Why not just run: time mpirun -v -x LD_LIBRARY_PATH -x PATH -np 2 -machinefile .machineN /home/lma712/src/Virgin_10.1/lapw1Q_mpi lapw1Q_1.def where machineN contains the names

Re: [OMPI users] openmpi/pbsdsh/Torque problem

2011-04-03 Thread David Singleton
You can prove this to yourself rather easily. Just ssh to a remote node and execute any command that lingers for awhile - say something simple like "sleep". Then kill the ssh and do a "ps" on the remote node. I guarantee that the command will have died. H ... vayu1:~ > ssh v37 sleep 60

Re: [OMPI users] Sending large broadcasts

2011-01-03 Thread David Singleton
Hi Brock, That message should only be 2MB. Are you sure it's not a mismatch of message lengths in MPI_Bcast calls? David On 01/04/2011 03:47 AM, Brock Palen wrote: I have a user who reports that sending a broadcast of 540*1080 of reals (just over 2GB) fails with this: *** An error occurred
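For scale, 540*1080 single-precision reals is 540*1080*4 bytes, roughly 2.2 MB, which matches the 2 MB figure above rather than 2 GB. A minimal C sketch of the count-matching requirement hinted at in the reply (buffer and count names are illustrative, not from the user's code):

/* bcast_count.c - every rank must pass the same count/datatype to MPI_Bcast.
 * Illustrative sketch; the count mirrors the 540*1080 reals in the thread
 * (~2.2 MB with 4-byte floats). */
#include <mpi.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    const int count = 540 * 1080;               /* must be identical on all ranks */
    float *buf = calloc(count, sizeof(float));

    MPI_Init(&argc, &argv);
    /* If any rank uses a different count (or datatype), the broadcast can
     * fail or hang - the mismatch suggested in the reply above. */
    MPI_Bcast(buf, count, MPI_FLOAT, 0, MPI_COMM_WORLD);
    MPI_Finalize();

    free(buf);
    return 0;
}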

Re: [OMPI users] Running OpenMPI on SGI Altix with 4096 cores: very poor performance

2010-12-22 Thread David Singleton
Is the same level of process and memory affinity or binding being used? On 12/21/2010 07:45 AM, Gilbert Grosdidier wrote: Yes, there is definitely only 1 process per core with both MPI implementations. Thanks, G. On 20/12/2010 at 20:39, George Bosilca wrote: Are your processes placed the

Re: [OMPI users] Open MPI vs IBM MPI performance help

2010-12-02 Thread David Singleton
http://www.open-mpi.org/faq/?category=running#oversubscribing On 12/03/2010 06:25 AM, Price, Brian M (N-KCI) wrote: Additional testing seems to show that the problem is related to barriers and how often they poll to determine whether or not it's time to leave. Is there some MCA parameter or

Re: [OMPI users] Memory affinity

2010-09-27 Thread David Singleton
On 09/28/2010 06:52 AM, Tim Prince wrote: On 9/27/2010 12:21 PM, Gabriele Fatigati wrote: Hi Tim, I have read that link, but I haven't understood whether enabling processor affinity also enables memory affinity, because it is written that: "Note that memory affinity support is enabled only when pro

Re: [OMPI users] spin-wait backoff

2010-09-03 Thread David Singleton
On 09/03/2010 10:05 PM, Jeff Squyres wrote: On Sep 3, 2010, at 12:16 AM, Ralph Castain wrote: Backing off the polling rate requires more application-specific logic like that offered below, so it is a little difficult for us to implement at the MPI library level. Not saying we eventually won't

[OMPI users] spin-wait backoff

2010-09-02 Thread David Singleton
I'm sure this has been discussed before but having watched hundreds of thousands of cpuhrs being wasted by difficult-to-detect hung jobs, I'd be keen to know why there isn't some sort of "spin-wait backoff" option. For example, a way to specify spin-wait for x seconds/cycles/iterations then backo
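The application-specific logic mentioned in the reply above might look like this C sketch (illustrative only; SPIN_LIMIT and the 1 ms sleep are arbitrary values, not an Open MPI feature): busy-poll with MPI_Test for a while, then sleep between polls so a hung job stops burning a core.

/* backoff_wait.c - sketch of "spin for a while, then back off" waiting.
 * Illustrative only: SPIN_LIMIT and the 1 ms sleep are arbitrary choices. */
#include <mpi.h>
#include <time.h>

#define SPIN_LIMIT 100000   /* busy-poll this many times before backing off */

static void wait_with_backoff(MPI_Request *req, MPI_Status *status)
{
    int flag = 0;
    long polls = 0;
    struct timespec ts = { 0, 1000000 };   /* 1 ms */

    while (!flag) {
        MPI_Test(req, &flag, status);
        if (!flag && ++polls > SPIN_LIMIT)
            nanosleep(&ts, NULL);          /* give the CPU back while waiting */
    }
}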

Re: [OMPI users] Open MPI 1.4.2 released

2010-05-27 Thread David Singleton
On 05/28/2010 08:20 AM, Jeff Squyres wrote: On May 16, 2010, at 5:21 AM, Aleksej Saushev wrote: http://cvsweb.netbsd.org/bsdweb.cgi/pkgsrc/parallel/openmpi/patches/ Sorry for the high latency reply... aa: We haven't added RPATH support yet. We've talked about it but never done it. There a

Re: [OMPI users] Hide Abort output

2010-03-31 Thread David Singleton
Yes, Dick has isolated the issue - novice users often believe Open MPI (not their application) had a problem. Anything along the lines he suggests can only help. David On 04/01/2010 01:12 AM, Richard Treumann wrote: I do not know what the OpenMPI message looks like or why people want to hide

Re: [OMPI users] Hide Abort output

2010-03-31 Thread David Singleton
I have to say this is a very common issue for our users. They repeatedly report the long Open MPI MPI_Abort() message in help queries and fail to look for the application error message about the root cause. A short MPI_Abort() message that said "look elsewhere for the real error message" would

Re: [OMPI users] Parallel file write in fortran (+mpi)

2010-02-02 Thread David Singleton
Feb 2, 2010 at 5:59 PM, David Singleton wrote: But it's a very bad idea on a "coherent", "POSIX" filesystem like Lustre. Locks have to bounce around between the nodes for every write. This can be VERY slow (even for trivial amounts of "logging" IO) and thrash the

Re: [OMPI users] Parallel file write in fortran (+mpi)

2010-02-02 Thread David Singleton
But it's a very bad idea on a "coherent", "POSIX" filesystem like Lustre. Locks have to bounce around between the nodes for every write. This can be VERY slow (even for trivial amounts of "logging" IO) and thrash the filesystem for other users. So, yes, at our site, we include this sort of "par

Re: [OMPI users] excessive virtual memory consumption of MPI environment when setting a higher "ulimit -s"

2009-12-02 Thread David Singleton
hat the OS and resource consumption effects are of setting 1GB+ stack size on *any* application... Have you tried non-MPI examples, potentially with applications as large as MPI applications but without the complexity of MPI? On Nov 19, 2009, at 3:13 PM, David Singleton wrote: Depending on

Re: [OMPI users] excessive virtual memory consumption of MPI environment when setting a higher "ulimit -s"

2009-11-19 Thread David Singleton
Depending on the setup, threads often get allocated a thread local stack with size equal to the stacksize rlimit. Two threads maybe? David Terry Dontje wrote: A couple things to note. First Sun MPI 8.2.1 is effectively OMPI 1.3.4. I also reproduced the below issue using a C code so I think
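A quick way to see this effect on Linux (a sketch, not code from the thread; pthread_getattr_np is a glibc extension): compare the stack rlimit with the stack size a thread created with default attributes actually receives.

/* thread_stack.c - show that a default pthread stack follows RLIMIT_STACK.
 * Linux/glibc sketch; pthread_getattr_np is a GNU extension. */
#define _GNU_SOURCE
#include <pthread.h>
#include <stdio.h>
#include <sys/resource.h>

static void *report(void *arg)
{
    pthread_attr_t attr;
    size_t stacksize = 0;

    pthread_getattr_np(pthread_self(), &attr);
    pthread_attr_getstacksize(&attr, &stacksize);
    printf("thread stack size: %zu bytes\n", stacksize);
    pthread_attr_destroy(&attr);
    return NULL;
}

int main(void)
{
    struct rlimit rl;
    pthread_t t;

    getrlimit(RLIMIT_STACK, &rl);
    printf("RLIMIT_STACK (soft): %ld bytes\n", (long)rl.rlim_cur);

    /* With "ulimit -s" pushed to 1 GB+, each such thread reserves that much
     * virtual address space, inflating the VM size of the process. */
    pthread_create(&t, NULL, report, NULL);
    pthread_join(t, NULL);
    return 0;
}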

Re: [OMPI users] custom modules per job (PBS/OpenMPI/environment-modules)

2009-11-17 Thread David Singleton
Hi Ralph, Now I'm in a quandary - if I show you that it's actually Open MPI that is propagating the environment then you are likely to "fix it" and then tm users will lose a nice feature. :-) Can I suggest that "least surprise" would require that MPI tasks get exactly the same environment/limits

Re: [OMPI users] custom modules per job (PBS/OpenMPI/environment-modules)

2009-11-17 Thread David Singleton
tEnv. That variable is then seen by and can be interpreted (cautiously) in /etc/profile.d/ scripts. A user could set it in the job file (or even qalter it post submission): #PBS -v VARNAME=foo:bar:baz For VARNAME, I think simply "MODULES" or "EXTRAMODULES" could do. With

Re: [OMPI users] custom modules per job (PBS/OpenMPI/environment-modules)

2009-11-17 Thread David Singleton
Hi Michael, I'm not sure why you don't see Open MPI behaving like other MPIs w.r.t. modules/environment on remote MPI tasks - we do. xe:~ > qsub -q express -lnodes=2:ppn=8,walltime=10:00,vmem=2gb -I qsub: waiting for job 376366.xepbs to start qsub: job 376366.xepbs ready [dbs900@x27 ~]$ module

[OMPI users] bug in MPI_Cart_create?

2009-10-13 Thread David Singleton
Looking back through the archives, a lot of people have hit error messages like > [bl302:26556] *** An error occurred in MPI_Cart_create > [bl302:26556] *** on communicator MPI_COMM_WORLD > [bl302:26556] *** MPI_ERR_ARG: invalid argument of some other kind > [bl302:26556] *** MPI_ERRORS_ARE_FATA
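For reference, a well-formed call looks like the C sketch below (illustrative, not the code from those reports; using MPI_Dims_create keeps the dims[] product equal to the communicator size, which removes one common way to trip an argument error):

/* cart_create.c - sketch of a well-formed MPI_Cart_create call. */
#include <mpi.h>

int main(int argc, char **argv)
{
    int nprocs, dims[2] = {0, 0}, periods[2] = {0, 0};
    MPI_Comm cart;

    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

    MPI_Dims_create(nprocs, 2, dims);                 /* factor nprocs into a 2-D grid */
    MPI_Cart_create(MPI_COMM_WORLD, 2, dims, periods, /* non-periodic, no reorder */
                    0, &cart);

    MPI_Comm_free(&cart);
    MPI_Finalize();
    return 0;
}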

[OMPI users] PBS tm error returns

2009-08-13 Thread David Singleton
Maybe this should go to the devel list but I'll start here. In tracking the way the PBS tm API propagates error information back to clients, I noticed that Open MPI is making an incorrect assumption. (I'm looking at 1.3.2.) The relevant code in orte/mca/plm/tm/plm_tm_module.c is: /* TM poll f

Re: [OMPI users] pgi and gcc runtime compatability

2008-12-07 Thread David Singleton
I seem to remember Fortran logicals being represented differently in PGI than in other Fortran compilers (1 vs -1 maybe - can't remember). Causes grief with things like MPI_Test. David Brock Palen wrote: I did something today that I was happy worked, but I want to know if anyone has had problem with it. A
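A hedged illustration of the pitfall in C terms (the specific encodings are the ones recalled above, not verified): if one compiler stores .TRUE. as 1 and another as -1, any test written as "equal to 1" breaks, while "nonzero" keeps working.

/* fortran_logical.c - why differing LOGICAL encodings cause grief.
 * Hypothetical values: some compilers use 1 for .TRUE., others -1
 * (the convention recalled in the post); only a nonzero test is portable. */
#include <stdio.h>

static void check(int f_logical)
{
    /* Fragile: assumes a particular .TRUE. encoding */
    printf("== 1 test: %s\n", (f_logical == 1) ? "true" : "false");
    /* Robust: treats any nonzero value as .TRUE. */
    printf("!= 0 test: %s\n", (f_logical != 0) ? "true" : "false");
}

int main(void)
{
    check(1);    /* e.g. a compiler that encodes .TRUE. as 1 */
    check(-1);   /* e.g. the -1 encoding recalled in the post */
    return 0;
}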

[OMPI users] job abort on MPI task exit

2008-10-27 Thread David Singleton
Apologies if this has been covered in a previous thread - I went back through a lot of posts without seeing anything similar. In an attempt to protect some users from themselves, I was hoping that OpenMPI could be configured so that an MPI task calling exit before calling MPI_Finalize() would ca

Re: [OMPI users] Proper way to throw an error to all nodes?

2008-06-03 Thread David Singleton
This is exactly what MPI_Abort is for. David Terry Frankcombe wrote: Calling MPI_Finalize in a single process won't ever do what you want. You need to get all the processes to call MPI_Finalize for the end to be graceful. What you need to do is have some sort of special message to tell everyo
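A minimal C sketch of that pattern (the error check and exit code are placeholders): the rank that detects the fatal condition calls MPI_Abort, which terminates every process on the communicator rather than just itself.

/* abort_on_error.c - sketch of aborting all ranks from the one that fails.
 * The error condition and exit code here are placeholders. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    int fatal_error = (argc > 1);   /* stand-in for a real failure check */
    if (fatal_error) {
        fprintf(stderr, "rank %d: unrecoverable error, aborting job\n", rank);
        MPI_Abort(MPI_COMM_WORLD, 1);   /* terminates every rank, not just this one */
    }

    MPI_Finalize();
    return 0;
}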