Apologies for the dumb question... There used to be a way to dive in to see
exactly what bugs and features came into 1.10.4, 1.10.3, and on back to 1.8.8.
Is there a way to do that on GitHub?
Ed
___
users mailing list
users@lists.open-mpi.org
https:/
a
similar error? Does the application call comm_spawn, for example? Or is it a
script that eventually attempts to launch another job?
> On Jul 28, 2016, at 6:24 PM, Blosch, Edwin L wrote:
>
> Cray CS400, RedHat 6.5, PBS Pro (but OpenMPI is built --without-tm),
> OpenMPI 1.8
[OMPI users] Question on run-time error "ORTE was unable
to reliably start"
What kind of system was this on? ssh, slurm, ...?
> On Jul 28, 2016, at 1:55 PM, Blosch, Edwin L wrote:
>
> I am running cases that are starting just fine and running for a few hours,
> then they die with a message that seems like a startup type of failure.
I am running cases that are starting just fine and running for a few hours,
then they die with a message that seems like a startup type of failure.
Message shown below. The message appears in standard output from rank 0
process. I'm assuming there is a failing card or port or something.
I am confused about backwards-compatibility.
FAQ #111 says:
Open MPI reserves the right to break ABI compatibility at new feature release
series. MPI applications compiled/linked against Open MPI 1.6.x will not
be ABI compatible with Open MPI 1.7.x.
But the versioning documentation says:
built in one place
and installed in another, even after I set OPAL_PREFIX to reflect the installed
location.
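For what it's worth, relocating an existing install generally needs the runtime search paths to follow OPAL_PREFIX; a minimal sketch, assuming a hypothetical relocated prefix:

```shell
# Hypothetical relocated install; OPAL_PREFIX tells Open MPI's frameworks
# where the tree now lives, but PATH and LD_LIBRARY_PATH must follow too.
export OPAL_PREFIX=/opt/openmpi-moved
export PATH="$OPAL_PREFIX/bin:$PATH"
export LD_LIBRARY_PATH="$OPAL_PREFIX/lib:$LD_LIBRARY_PATH"
```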
From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Blosch, Edwin L
Sent: Friday, May 29, 2015 11:06 AM
To: Open MPI Users (us...@open-mpi.org)
Subject: EXTERNAL: [OMPI users] How can I discover
Sometimes I want to use one of the option flags, for example today it is
mtl_mxm_verbose. How do I discover the valid possible values of various MCA
parameters?
I've tried ompi_info --all, but it does not show the possible values, only the
current value.
I've tried ompi_info --param all b
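For later readers: in the 1.7-and-newer ompi_info, parameters are filtered by a verbosity level, so a sketch along these lines (flag spellings per that era's man page; adjust for your installed version) shows more than the defaults:

```shell
# Dump all MCA parameters at the highest verbosity level (--level exists
# in Open MPI 1.7 and later; older releases only have --param).
ompi_info --param all all --level 9
# Restrict to a single framework/component, e.g. the mxm MTL:
ompi_info --param mtl mxm --level 9
```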
Nov 11, 2014, at 6:11 AM, Blosch, Edwin L
<edwin.l.blo...@lmco.com> wrote:
OK, that’s what I was suspecting. It’s a bug, right? I asked for 4 processes
and I supplied a host file with 4 lines in it, and mpirun didn’t launch the
processes where I told it to launch them.
Actual
file to override the default behavior
On Nov 7, 2014, at 8:52 AM, Blosch, Edwin L
<edwin.l.blo...@lmco.com> wrote:
Here's my command:
/bin/mpirun --machinefile
hosts.dat -np 4
Here's my hosts.dat file:
% cat hosts.dat
node01
node02
node03
node04
All 4 ranks are launched on node01. I don't believe I've ever seen this
before. I had to do a sanity check, so I tried MVAPICH2-2.1a and got what I
expected:
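As a possible workaround for the placement behavior above, one can cap the slot count per host in the hostfile; a sketch (`slots=` is standard Open MPI hostfile syntax; node names are taken from the example above):

```shell
# Advertise exactly one slot per node so "-np 4" cannot pack all four
# ranks onto node01.
cat > hosts.dat <<'EOF'
node01 slots=1
node02 slots=1
node03 slots=1
node04 slots=1
EOF
# Then (requires a working Open MPI and reachable nodes):
#   mpirun --machinefile hosts.dat -np 4 ./my_app
```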
If you could post the output when you run with
mpirun --mca coll_base_verbose 10 "other mpirun args you've been using"
that would be great.
Also, if you know the sizes (number of elements) involved in the reduce and
allreduce operations it
would be helpful to know this as well.
Thanks,
H
I had an application suddenly stop making progress. By killing the last
process out of 208 processes, then looking at the stack trace, I found 3 of 208
processes were in an MPI_REDUCE call. The other 205 had progressed in their
execution to another routine, where they were waiting in an unrela
In making the leap from 1.6 to 1.8, how can I check whether or not
process/memory affinity is supported?
I've built OpenMPI on a system where the numactl-devel package was not
installed, and another where it was, but I can't see anything in the output of
ompi_info that suggests any difference b
Sent: Tuesday, April 01, 2014 11:20 AM
To: Open MPI Users
Subject: Re: [OMPI users] Problem building OpenMPI 1.8 on RHEL6
On Apr 1, 2014, at 10:26 AM, "Blosch, Edwin L" wrote:
> I am getting some errors building 1.8 on RHEL6. I tried autoreconf as
> suggested, but it failed f
check the remote shell.
On Apr 7, 2014, at 1:53 PM, Blosch, Edwin L wrote:
> Thanks Noam, that makes sense.
>
> Yes, I did mean to do ". hello" (with space in between). That was an attempt
> to replicate whatever OpenMPI is doing.
>
> In the first post I mentioned that
From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Noam Bernstein
Sent: Monday, April 07, 2014 3:41 PM
To: Open MPI Users
Subject: Re: [OMPI users] EXTERNAL: Re: Problem with shell when launching jobs
with OpenMPI 1.6.5 rsh
On Apr 7, 2014, at 4:36 PM, Blosch, Edwin L wrote:
> I
22:04, Blosch, Edwin L wrote:
> I am submitting a job for execution under SGE. My default shell is /bin/csh.
Where do you get this - in SGE or on the interactive command line?
> The script that is submitted has #!/bin/bash at the top. The script runs on
> the 1st node allocated to the
shell when launching jobs with
OpenMPI 1.6.5 rsh
Looks to me like the problem is here:
/bin/.: Permission denied.
Appears you don't have permission to exec bash??
On Apr 7, 2014, at 1:04 PM, Blosch, Edwin L
<edwin.l.blo...@lmco.com> wrote:
I am submitting a job for execution under SGE.
I am submitting a job for execution under SGE. My default shell is /bin/csh.
The script that is submitted has #!/bin/bash at the top. The script runs on
the 1st node allocated to the job. The script runs a Python wrapper that
ultimately issues the following mpirun command:
/apps/local/test/
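A sketch of a submission script that pins the job shell regardless of the user's login shell (`#$ -S` is standard SGE syntax; the mpirun path and executable name here are placeholders, not the ones from the thread):

```shell
# Write a minimal SGE job script that forces bash as the job shell.
cat > run_job.sh <<'EOF'
#!/bin/bash
#$ -S /bin/bash
#$ -cwd
/path/to/openmpi/bin/mpirun --machinefile hosts.dat -np 16 ./my_app
EOF
# Submit with: qsub run_job.sh
```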
I am getting some errors building 1.8 on RHEL6. I tried autoreconf as
suggested, but it failed for the same reason. Is there a minimum version of m4
required that is newer than that provided by RHEL6?
Thanks
aclocal.m4:16: warning: this file was generated for autoconf 2.69.
You have another v
Why does ompi_info -c say "MPI I/O Support: yes" even though I configured using
-disable-io-romio? If ompi_info is going to tell me MPI I/O is supported, then
shouldn't I expect my test program (attached) to work correctly? (it doesn't).
I didn't disable "built-in" mpi-io, only io-romio.
--
TCP BTL), not so
good for others (e.g., openib is flat-out not thread safe).
On Dec 18, 2013, at 3:57 PM, Blosch, Edwin L
<edwin.l.blo...@lmco.com> wrote:
I was wondering if the FAQ entry below is considered current opinion or perhaps
a little stale. Is multi-threading still c
I was wondering if the FAQ entry below is considered current opinion or perhaps
a little stale. Is multi-threading still considered to be 'lightly tested'?
Are there known open bugs?
Thank you,
Ed
7. Is Open MPI thread safe?
Support for MPI_THREAD_MULTIPLE (i.e., multiple threads executing
your mpirun command. If this allows your application to run to
completion then we know exactly where to start looking.
George.
On Jun 27, 2013, at 19:59, "Blosch, Edwin L"
<edwin.l.blo...@lmco.com> wrote:
The debug version also hung, roughly the same amount of progress in the computations.
[mailto:users-boun...@open-mpi.org] On Behalf
Of Blosch, Edwin L
Sent: Thursday, June 27, 2013 12:48 PM
To: Open MPI Users
Subject: EXTERNAL: Re: [OMPI users] Application hangs on mpi_waitall
Attached is the message list for rank 0 for the communication step that is
failing. There are about 160 isends and irecvs.
The debug version also hung, roughly the same amount of progress in the
computations (although of course it took much longer to make that progress in
comparison to the optimized version).
On the bright side, the idea of putting an mpi_barrier after the irecvs and
before the isends appears to ha
Attached is the message list for rank 0 for the communication step that is
failing. There are about 160 isends and irecvs. The ‘message size’ is
actually a number of cells. On some steps only one 8-byte word per cell is
communicated, at another step we exchange 7 words, and another step we ex
I'm running OpenMPI 1.6.4 and seeing a problem where mpi_waitall never returns.
The case runs fine with MVAPICH. The logic associated with the communications
has been extensively debugged in the past; we don't think it has errors. Each
process posts non-blocking receives, non-blocking sends,
'd have to
>> leave it to Mellanox to advise.
>>
>>
>> On Jun 11, 2013, at 6:55 AM, "Blosch, Edwin L"
>> <edwin.l.blo...@lmco.com>
>> wrote:
>>
>>> I tried adding "-mca btl openib,sm,self" but it did not make any
>
;t using IB for some reason when extended to the
other nodes. What does your cmd line look like? Have you tried adding "-mca btl
openib,sm,self" just to ensure it doesn't use TCP for some reason?
On Jun 9, 2013, at 4:31 PM, "Blosch, Edwin L"
<edwin.l.blo...@lmco.c
Just to be sure - when you run 320 "cores", you are running across 20 nodes?
Just want to ensure we are using "core" the same way - some people confuse
cores with hyperthreads.
On Jun 9, 2013, at 3:50 PM, "Blosch, Edwin L"
<edwin.l.blo...@lmco.com> wrote:
okay thru 160, and then things fall apart after
that point. How many cores are on a node?
On Jun 9, 2013, at 1:59 PM, "Blosch, Edwin L"
<edwin.l.blo...@lmco.com> wrote:
I'm having some trouble getting good scaling with OpenMPI 1.6.4 and I don't
know where to start looking.
I'm having some trouble getting good scaling with OpenMPI 1.6.4 and I don't
know where to start looking. This is an Infiniband FDR network with Sandy
Bridge nodes. I am using affinity (--bind-to-core) but no other options. As
the number of cores goes up, the message sizes are typically going do
> of processes is a power of two. You'll see that n8
> is faster than n7, so this is likely the situation.
>
>
> On Jun 6, 2013, at 4:10 PM, "Blosch, Edwin L" wrote:
>
>> I am running single-node Sandy Bridge cases with OpenMPI and looking at
>> scaling.
I am running single-node Sandy Bridge cases with OpenMPI and looking at scaling.
I'm using -bind-to-core without any other options (default is -bycore I
believe).
These numbers indicate number of cores first, then the second digit is the run
number (except for n=1, all runs repeated 3 times).
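When single-node scaling numbers look off, it can be worth confirming that binding is doing what you expect; a sketch using flags from the 1.6-era mpirun (the application name is a placeholder):

```shell
# Print each rank's core binding as it is applied; identical masks across
# ranks would point at a binding problem rather than the application.
mpirun -np 8 --bind-to-core --report-bindings ./my_app
```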
From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On Behalf
Of Blosch, Edwin L
Sent: Wednesday, June 05, 2013 11:14 AM
To: Open MPI Users (us...@open-mpi.org)
Subject: EXTERNAL: [OMPI users] How to diagnose bus error with 1.6.4
I am running into a bus error that does not happen with MVAPICH.
I am running into a bus error that does not happen with MVAPICH, and I am
guessing it has something to do with shared-memory communication. Has anyone
had a similar experience or have any insights on what this could be?
Thanks
[k1n08:12688] mca: base: components_open: Looking for shmem compone
Sent: Wednesday, May 29, 2013 3:31 PM
To: Open MPI Users
Subject: EXTERNAL: Re: [OMPI users] Problem building OpenMPI 1.6.4 with PGI 13.4
Edwin --
Can you ask PGI support about this? I swear that the PGI compiler suite has
supported offsetof before.
On May 29, 2013, at 5:26 PM, "Blosch, Edwin L" wrote:
Jeff Squyres (jsquyres) wrote:
> Edwin --
>
> Can you ask PGI support about this? I swear that the PGI compiler suite has
> supported offsetof before.
>
>
> On May 29, 2013, at 5:26 PM, "Blosch, Edwin L"
> wrote:
>
> > I'm having trouble building OpenMPI
I'm having trouble building OpenMPI 1.6.4 with PGI 13.4. Suggestions?
checking alignment of double... 8
checking alignment of long double... 8
checking alignment of float _Complex... 4
checking alignment of double _Complex... 8
checking alignment of long double _Complex... 8
checking alignment of
The FAQ talks about building support for memory affinity by adding
-with-libnuma=
However, I did not do that, and yet when I check ompi_info, it looks like there
is support from the hwloc module.
Can I assume the FAQ is a little stale and that -with-libnuma is not really
necessary anymore?
From: users-boun...@open-mpi.org [users-boun...@open-mpi.org] on behalf of Tim
Prince [n...@aol.com]
Sent: Wednesday, May 22, 2013 10:24 AM
To: us...@open-mpi.org
Subject: EXTERNAL: Re: [OMPI users] basic questions about compiling OpenMPI
On 5/22/2013 11:34 AM, Paul Kapinos wrote:
Apologies for not exploring the FAQ first.
If I want to use Intel or PGI compilers but link against the OpenMPI that ships
with RedHat Enterprise Linux 6 (compiled with g++ I presume), are there any
issues to watch out for, during linking?
Thanks,
Ed
users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On Behalf
Of Reuti
Sent: Tuesday, December 18, 2012 4:14 AM
To: Open MPI Users
Subject: Re: [OMPI users] EXTERNAL: Re: Problems with shared libraries while
launching jobs
On 17.12.2012 at 16:42, Blosch, Edwin L wrote:
t: EXTERNAL: Re: [OMPI users] Problems with shared libraries while
launching jobs
Add -mca plm_base_verbose 5 --leave-session-attached to the cmd line - that
will show the ssh command being used to start each orted.
On Dec 14, 2012, at 12:17 PM, "Blosch, Edwin L"
<edwin.l.blo...@lmco
I am having a weird problem launching cases with OpenMPI 1.4.3. It is most
likely a problem with a particular node of our cluster, as the jobs will run
fine on some submissions, but not other submissions. It seems to depend on the
node list. I just am having trouble diagnosing which node, and
ion. If so, you could do:
mpirun -npersocket 2 -bind-to-socket ...
That would put two processes in each socket, bind them to that socket, and rank
them in series. So ranks 0-1 would be bound to the first socket, ranks 2-3 to
the second.
Ralph
On Thu, Nov 8, 2012 at 6:52 AM, Blosch, Edwin L wrote:
Thanks, I definitely appreciate the new, hotness of hwloc. I just couldn't
tell from the documentation or the web page how or if it was being used by
OpenMPI.
I still work with OpenMPI 1.4.x and now that I've looked into the builds, I
think I understand that PLPA is used in 1.4 and hwloc is br
Yes it is a Westmere system.
Socket L#0 (P#0 CPUModel="Intel(R) Xeon(R) CPU E7- 8870 @ 2.40GHz"
CPUType=x86_64)
L3Cache L#0 (size=30720KB linesize=64 ways=24)
L2Cache L#0 (size=256KB linesize=64 ways=8)
L1dCache L#0 (size=32KB linesize=64 ways=8)
L1iCache L#0
>>> In your desired ordering you have rank 0 on (socket,core) (0,0) and
>>> rank 1 on (0,2). Is there an architectural reason for that? Meaning
>>> are cores 0 and 1 hardware threads in the same core, or is there a
>>> cache level (say L2 or L3) connecting cores 0 and 1 separate from
>>> cores
I see hwloc is a subproject hosted under OpenMPI but, in reading the
documentation, I was unable to figure out if hwloc is a module within OpenMPI,
or if some of the code base is borrowed into OpenMPI, or something else. Is
hwloc used by OpenMPI internally? Is it a layer above libnuma? Or is
I am trying to map MPI processes to sockets in a somewhat compacted pattern and
I am wondering the best way to do it.
Say there are 2 sockets (0 and 1) and each processor has 4 cores (0,1,2,3) and
I have 4 MPI processes, each of which will use 2 OpenMP processes.
I've re-ordered my parallel wor
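One way to express that compacted layout in the tooling of that era is a rankfile; a sketch assuming the 2-socket, 4-cores-per-socket example (syntax `rank N=host slot=socket:core-range` per the mpirun man page of that series; the host name is hypothetical):

```shell
# Four ranks, two per socket, each given a two-core range so its two
# OpenMP threads have a core apiece.
cat > rankfile.txt <<'EOF'
rank 0=node01 slot=0:0-1
rank 1=node01 slot=0:2-3
rank 2=node01 slot=1:0-1
rank 3=node01 slot=1:2-3
EOF
# Then: mpirun -np 4 -rf rankfile.txt ./my_app
```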
I am using this parameter "shmem_mmap_relocate_backing_file" and noticed that
the relocation variable is identified as
"shmem_mmap_opal_shmem_mmap_backing_file_base_dir" in its documentation, but
then the next parameter that appears from ompi_info is spelled differently,
namely "shmem_mmap_back
I am getting a problem where something called "PSM" is failing to start and
that in turn is preventing my job from running. Command and output are below.
I would like to understand what's going on. Apparently this version of OpenMPI
decided to build itself with support for PSM, but if it's no
users] EXTERNAL: Re: How to set up state-less node /tmp for
OpenMPI usage
On 11/05/2011 09:11 AM, Blosch, Edwin L wrote:
..
>
> I know where you're coming from, and I probably didn't title the post
> correctly because I wasn't sure what to ask. But I definitely saw it,
Thanks, Ralph,
> Having a local /tmp is typically required by Linux for proper operation as
> the OS itself needs to ensure its usage is protected, as was > previously
> stated and is reiterated in numerous books on managing Linux systems.
There is a /tmp, but it's not local. I don't know if
conclusion of each batch job, an epilogue
> process runs that removes all files belonging to the owner of the
> current batch job from /tmp (and also looks for and kills orphan
> processes belonging to the user). This epilogue had to written
> by our systems staff.
>
> I believe th
[mailto:users-boun...@open-mpi.org] On Behalf
Of Ralph Castain
Sent: Thursday, November 03, 2011 5:22 PM
To: Open MPI Users
Subject: Re: [OMPI users] EXTERNAL: Re: How to set up state-less node /tmp for
OpenMPI usage
On Nov 3, 2011, at 2:55 PM, Blosch, Edwin L wrote:
> I might be missing someth
as missing "btl".)
On 11/3/2011 11:19 AM, Blosch, Edwin L wrote:
> I don't tell OpenMPI what BTLs to use. The default uses sm and puts a session
> file on /tmp, which is NFS-mounted and thus not a good choice.
>
> Are you suggesting something like --mca ^sm?
>
>
I don't tell OpenMPI what BTLs to use. The default uses sm and puts a session
file on /tmp, which is NFS-mounted and thus not a good choice.
Are you suggesting something like --mca ^sm?
-Original Message-
From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On Behalf
Of
> If you create temporary files using mktemp is it being created in
> /dev/shm or /tmp?
>
>
> On Thu, Nov 3, 2011 at 11:50 AM, Bogdan Costescu wrote:
>> On Thu, Nov 3, 2011 at 15:54, Blosch, Edwin L
>> wrote:
>>> -/dev/shm is 12 GB and has 755 permis
de /tmp for
OpenMPI usage
On Nov 1, 2011, at 7:31 PM, Blosch, Edwin L wrote:
> I'm getting this message below which is observing correctly that /tmp is
> NFS-mounted. But there is no other directory which has user or group write
> permissions. So I think I'm kind of st
and /dev/shm is (always) local,
/dev/shm seems to be the right place for shared memory transactions.
If you create temporary files using mktemp is it being created in
/dev/shm or /tmp?
On Thu, Nov 3, 2011 at 11:50 AM, Bogdan Costescu wrote:
> On Thu, Nov 3, 2011 at 15:54, Blosch, Edwin L wr
Can anyone guess what the problem is here? I was under the impression that
OpenMPI (1.4.4) would look for /tmp and would create its shared-memory backing
file there, i.e. if you don't set orte_tmpdir_base to anything.
Well, there IS a /tmp and yet it appears that OpenMPI has chosen to use
/dev/shm.
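For anyone in the same situation: the session directory base can be pointed at a node-local filesystem either on the command line or through the `OMPI_MCA_` environment prefix; a sketch (the target directory is an assumption and must be local and writable on every node):

```shell
# Equivalent ways to move Open MPI's session directory off NFS-mounted /tmp:
#   mpirun --mca orte_tmpdir_base /dev/shm ...
# or via the environment, which mpirun and its daemons pick up:
export OMPI_MCA_orte_tmpdir_base=/dev/shm
```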
I'm getting this message below which is observing correctly that /tmp is
NFS-mounted. But there is no other directory which has user or group write
permissions. So I think I'm kind of stuck, and it sounds like a serious issue.
Before I ask the administrators to change their image, i.e. mount
All,
I'm using OpenMPI 1.4.3 and have been running a particular case on 120, 240,
480 and 960 processes. My time-per-work metric reports 60, 30, 15, 15. If I
do the same run with MVAPICH 1.2, I get 60, 30, 15, 8. There is something
running very slowly with OpenMPI 1.4.3 as the process count grows.
keep using
them.
Thanks again,
Ed
-Original Message-
From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On Behalf
Of Blosch, Edwin L
Sent: Wednesday, September 28, 2011 4:02 PM
To: Open MPI Users
Subject: Re: [OMPI users] EXTERNAL: Re: Unresolved reference 'mbind' and
Do you have libnuma installed?
If so, do you have the .h and .so files? Do you have the .a file?
Can you send the last few lines of output from a failed "make V=1" in that
tree? (it'll show us the exact commands used to compile/link, etc.)
On Sep 28, 2011, at 11:55 AM, Blosch,
I am getting some undefined references in building OpenMPI 1.5.4 and I would
like to know how to work around it.
The errors look like this:
/scratch1/bloscel/builds/release/openmpi-intel/lib/libmpi.a(topology-linux.o):
In function `hwloc_linux_alloc_membind':
topology-linux.c:(.text+0x1da): undefined reference to `mbind'
Sent: Monday, September 26, 2011 6:16 PM
To: Open MPI Users
Subject: Re: [OMPI users] EXTERNAL: Re: Trouble compiling 1.4.3 with PGI 10.9
compilers
On Sep 26, 2011, at 6:53 PM, Blosch, Edwin L wrote:
> Actually I can download OpenMPI 1.5.4, 1.4.4rc3 or 1.4.3 - and ALL of them
> build just fine.
>
we fixed some libtool issues in the 1.4.4 tarball; could you try
the 1.4.4rc and see if that fixes the issue? If not, we might have missed some
patches to bring over to the v1.4 branch.
http://www.open-mpi.org/software/ompi/v1.4/
On Sep 20, 2011, at 1:16 PM, Blosch, Edwin L wrote:
mpi_irecv, mpi_isend, mpi_waitall;
perhaps there is something unhealthy in the semantics there.
-Original Message-
From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On Behalf
Of Blosch, Edwin L
Sent: Wednesday, September 21, 2011 10:44 AM
To: Open MPI Users
Subject: EXTERNAL: [OMPI
-mpi.org [mailto:users-boun...@open-mpi.org] On Behalf
Of Tim Prince
Sent: Wednesday, September 21, 2011 10:53 AM
To: us...@open-mpi.org
Subject: EXTERNAL: Re: [OMPI users] Question about compilng with fPIC
On 9/21/2011 11:44 AM, Blosch, Edwin L wrote:
> Follow-up to a mislabeled thread: "How co
.@open-mpi.org] On Behalf
Of Blosch, Edwin L
Sent: Tuesday, September 20, 2011 11:46 AM
To: Open MPI Users
Subject: Re: [OMPI users] EXTERNAL: Re: How could OpenMPI (or MVAPICH) affect
floating-point results?
Thank you for this explanation. I will assume that my problem here is some
check.
-Original Message-
From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On Behalf
Of Blosch, Edwin L
Sent: Tuesday, September 20, 2011 12:17 PM
To: Open MPI Users
Subject: EXTERNAL: [OMPI users] Trouble compiling 1.4.3 with PGI 10.9 compilers
I'm having trouble building 1.4.3 using PGI 10.9. I searched the list archives
briefly but I didn't stumble across anything that looked like the same problem,
so I thought I'd ask if an expert might recognize the nature of the problem
here.
The configure command:
./configure --prefix=/release
(or whatever) or are you confirming that the back-end
compiler is seeing the same flags? The MPI compiler wrapper (mpicc, et
al.) can add flags. E.g., as I remember it, "mpicc" with no flags means
no optimization with OMPI but with optimization
Subject: Re: [OMPI users] EXTERNAL: Re: How could OpenMPI (or MVAPICH) affect
floating-point results?
On 9/20/2011 10:50 AM, Blosch, Edwin L wrote:
> It appears to be a side effect of linkage that is able to change a
> compute-only routine's answers.
>
> I have assumed that max/sqrt/
Tim Prince wrote:
> On 9/20/2011 7:25 AM, Reuti wrote:
>> Hi,
>>
>> Am 20.09.2011 um 00:41 schrieb Blosch, Edwin L:
>>
>>> I am observing differences in floating-point results from an application
>>> program that appear to be related to whether I link with OpenMPI 1.4
I am observing differences in floating-point results from an application
program that appear to be related to whether I link with OpenMPI 1.4.3 or
MVAPICH 1.2.0. Both packages were built with the same installation of Intel
11.1, as well as the application program; identical flags passed to the
Sent: Thursday, September 15, 2011 4:37 AM
To: Open MPI Users
Subject: Re: [OMPI users] EXTERNAL: Re: Can you set the gid of the processes
created by mpirun?
On 15.09.2011 at 01:15, Blosch, Edwin L wrote:
> I would appreciate trying to fix the multi-word argument to
> orte_launch_agent,
Let me know if you want me to pursue this.
Ralph
On Sep 14, 2011, at 3:31 PM, Blosch, Edwin L wrote:
Thank you - I did pursue this kind of workaround, and it worked, but you'll be
happy to know that nothing had to be owned by root.
ASIDE
Just to remind: The job script is a shell scr
On Sep 14, 2011 at 12:56 PM, Reuti
<re...@staff.uni-marburg.de> wrote:
On 14.09.2011 at 19:02, Blosch, Edwin L wrote:
> Thanks for trying.
>
> Do you feel that this is an impossible request without the assistance of some
> process running as root, for example, as Reuti mentioned
To: Open MPI Users
> Subject: Re: [OMPI users] EXTERNAL: Re: Can you set the gid of the processes
> created by mpirun?
>
>
> On Sep 14, 2011, at 9:39 AM, Blosch, Edwin L wrote:
>
>> Thanks, Ralph,
>>
>> I get the failure messages, unfortunately:
>>
[mailto:users-boun...@open-mpi.org] On Behalf
Of Ralph Castain
Sent: Wednesday, September 14, 2011 11:33 AM
To: Open MPI Users
Subject: Re: [OMPI users] EXTERNAL: Re: Can you set the gid of the processes
created by mpirun?
On Sep 14, 2011, at 9:39 AM, Blosch, Edwin L wrote:
> Thanks, Ralph,
>
Thanks, Ralph,
I get the failure messages, unfortunately:
setgid FAILED
setgid FAILED
setgid FAILED
I actually had attempted to call setgid from within the application previously,
which looks similar to what you've done, but it failed. That was when I
initiated the post to the mailing list. My
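Since an unprivileged setgid() can only switch to a group already in the process's supplementary set, a shell-level alternative is to start mpirun under the desired group with sg(1); a sketch with a hypothetical group name (the user must already be a member of it):

```shell
# Run the whole mpirun invocation with grp650 as the effective group, so
# files created by the launched processes inherit that group ownership.
sg grp650 -c "mpirun --machinefile hosts.dat -np 16 ./my_app"
```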
oun...@open-mpi.org] On Behalf
Of Reuti
Sent: Tuesday, September 13, 2011 5:36 PM
To: Open MPI Users
Subject: Re: [OMPI users] EXTERNAL: Re: Problem running under SGE
On 14.09.2011 at 00:25, Blosch, Edwin L wrote:
> Your comment guided me in the right direction, Reuti. And overlapped with
>
Sent: Tuesday, September 13, 2011 4:27 PM
To: Open MPI Users
Subject: EXTERNAL: Re: [OMPI users] Problem running under SGE
On 13.09.2011 at 23:18, Blosch, Edwin L wrote:
> I'm able to run this command below from an interactive shell window:
>
> /bin/mpirun --machinefile mpiho
-Original Message-
From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On Behalf
Of Reuti
Sent: Tuesday, September 13, 2011 4:27 PM
To: Open MPI Users
Subject: EXTERNAL: Re: [OMPI users] Problem running under SGE
On 13.09.2011 at 23:18, Blosch, Edwin L wrote:
> I'm able to
I'm able to run this command below from an interactive shell window:
/bin/mpirun --machinefile mpihosts.dat -np 16 -mca plm_rsh_agent
/usr/bin/rsh -x MPI_ENVIRONMENT=1 ./test_setup
but it does not work if I put it into a shell script and 'qsub' that script to
SGE. I get the message shown at th
Not sure what the issue might be, but I would check for a typo - we don't check
that mca params are spelled correctly, nor do we check for params that don't
exist (e.g., because you spelled it wrong).
On Sep 12, 2011, at 3:03 PM, Blosch, Edwin L wrote:
I have a hello world program that runs without pro
To: Open MPI Users
Subject: Re: [OMPI users] EXTERNAL: Re: qp memory allocation problem
On Mon, 12 Sep 2011, Blosch, Edwin L wrote:
> Nathan, I found these parameters under /sys/module/mlx4_core/parameters.
> How do you incorporate a changed value? What to rest
I have a hello world program that runs without prompting for password with
plm_rsh_agent but not with orte_rsh_agent, I mean it runs but only after
prompting for a password:
/bin/mpirun --machinefile mpihosts.dat -np 16 -mca plm_rsh_agent
/usr/bin/rsh ./test_setup
Hello from process
Samuel K. Gutierrez
Los Alamos National Laboratory
On Sep 12, 2011, at 9:23 AM, Blosch, Edwin L wrote:
I am getting this error message below and I don't know what it means or how to
fix it. It only happens when I run on a large number of processes, e.g. 960.
Things work fine on 480, and I don
queue pairs by
> default. Do they buy us anything? For what it is worth, we have stopped
> using them on all of our large systems here at LANL.
>
> Thanks,
>
> Samuel K. Gutierrez
> Los Alamos National Laboratory
>
> On Sep 12, 2011, at 9:23 AM, Blosch, Edwin L wrote:
>
Do they buy us anything? For what it is worth, we have stopped using them on
all of our large systems here at LANL.
Thanks,
Samuel K. Gutierrez
Los Alamos National Laboratory
On Sep 12, 2011, at 9:23 AM, Blosch, Edwin L wrote:
I am getting this error message below and I don't know what
I am getting this error message below and I don't know what it means or how to
fix it. It only happens when I run on a large number of processes, e.g. 960.
Things work fine on 480, and I don't think the application has a bug. Any help
is appreciated...
[c1n01][[30697,1],3][connect/btl_openi
Ralph Castain [mailto:r...@open-mpi.org]
>> Sent: Wednesday, September 07, 2011 8:53 AM
>> To: Open MPI Users
>> Subject: Re: [OMPI users] Can you set the gid of the processes created by
> mpirun?
>>
>> On Sep 7, 2011, at 7:38 AM, Blosch, Edwin L wrote:
>>
>>
Can you set the gid of the processes created by
mpirun?
On Sep 7, 2011, at 7:38 AM, Blosch, Edwin L wrote:
The mpirun command is invoked when the user's group is 'set group' to group
650. When the rank 0 process creates files, they have group ownership 650.
But the user's login group is group 1040.
The mpirun command is invoked when the user's group is 'set group' to group
650. When the rank 0 process creates files, they have group ownership 650.
But the user's login group is group 1040. The child processes that get started
on other nodes run with group 1040, and the files they create have group 1040.