Dear Users,
I'm measuring barrier synchronization performance on the v1.5.1 build of
OpenMPI. I am currently trying to measure synchronization performance on a
single node, with 5 processes. I'm getting fairly poor results, as follows:
Testing procedure - initialize the timer at the start of the
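For reference, a minimal sketch of how such a barrier-timing loop is commonly
structured; the warm-up and iteration counts below are illustrative, not the
poster's actual code:

  program barrier_timing
    implicit none
    include 'mpif.h'
    integer, parameter :: nwarm = 100, niter = 10000
    integer :: ierr, rank, i
    double precision :: t0, t1

    call MPI_INIT(ierr)
    call MPI_COMM_RANK(MPI_COMM_WORLD, rank, ierr)

    ! Warm-up barriers so one-time setup costs do not skew the measurement
    do i = 1, nwarm
       call MPI_BARRIER(MPI_COMM_WORLD, ierr)
    end do

    ! Time a large number of barriers and report the average per call
    t0 = MPI_WTIME()
    do i = 1, niter
       call MPI_BARRIER(MPI_COMM_WORLD, ierr)
    end do
    t1 = MPI_WTIME()

    if (rank == 0) then
       print '(A,F10.3,A)', 'average barrier: ', &
            (t1 - t0) / niter * 1.0d6, ' microseconds'
    end if

    call MPI_FINALIZE(ierr)
  end program barrier_timing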
Hi Ahsan
A small stack size sometimes causes segmentation faults,
especially in large programs like WRF.
However, it is not the only possible cause, of course.
Are you sure you set the stack size to unlimited on *all* nodes
where WRF ran?
It may be tricky.
Ask your system administrator to do it on a per
Hello Gus, Jody
The system has enough memory. I set the stack size to unlimited before running
WRF with the command *ulimit -s unlimited*, but the problem still occurred.
Thanks
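One quick way to confirm the limit actually in effect on every node where the
ranks land (a sketch, not from the thread; it relies on the Fortran 2008
execute_command_line intrinsic, which runs the command through /bin/sh):

  program check_stack_limit
    implicit none
    include 'mpif.h'
    integer :: ierr, rank

    call MPI_INIT(ierr)
    call MPI_COMM_RANK(MPI_COMM_WORLD, rank, ierr)

    ! Each rank asks the shell on its own node for the current stack limit;
    ! 'ulimit -s' is a shell builtin, evaluated where the rank runs.
    call execute_command_line('echo "$(hostname) stack limit: $(ulimit -s)"')

    call MPI_BARRIER(MPI_COMM_WORLD, ierr)
    call MPI_FINALIZE(ierr)
  end program check_stack_limit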
Hi Ahsan, Jody
>
> Just a guess that this may be a stack size problem.
> Did you try to run WRF with unlimited stack size?
> Also, does
Jeff:
> FWIW: I have rarely seen this to be the issue.
I've been bitten by similar situations before, but it may not have been OpenMPI.
In any case, it was a while back.
> In short, programs are erroneous if they do not guarantee that all their
> outstanding requests have completed before calling fina
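The rule being quoted is that every outstanding nonblocking request must be
completed before MPI_FINALIZE; a minimal sketch of the safe pattern (the
subroutine name and arguments are illustrative only):

  subroutine finish_cleanly(requests, nreq)
    implicit none
    include 'mpif.h'
    integer :: nreq, ierr
    integer :: requests(nreq)
    integer :: statuses(MPI_STATUS_SIZE, nreq)

    ! Complete every outstanding nonblocking operation first ...
    call MPI_WAITALL(nreq, requests, statuses, ierr)
    ! ... and only then shut MPI down.
    call MPI_FINALIZE(ierr)
  end subroutine finish_cleanly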
On Feb 23, 2011, at 3:54 PM, Shamis, Pavel wrote:
> I remember that I updated the trunk to select the RDMACM connection manager
> by default for RoCE ports - https://svn.open-mpi.org/trac/ompi/changeset/22311
>
> I'm not sure if the change made its way to any production version. I don't
> work on t
Prentice Bisbal wrote:
Jeff Squyres wrote:
Can you put together a small example that
shows the problem...
Jeff,
Thanks for requesting that. As I was looking at the original code to
write a small test program, I found the source of the error. Doesn't it
always work that way?
No
Here is what OFA says:
http://www.openfabrics.org/archives/spring2010sonoma/Wednesday/Liran%20Liss%20RoCE%20in%20OFED/rocee_update_liss.ppt
Jeff Squyres wrote:
> On Feb 23, 2011, at 3:36 PM, Prentice Bisbal wrote:
>
>> It's using MPI_STATUS_SIZE to dimension istatus before mpif.h is even
>> read! Correcting the order of the include and declaration statements
>> fixed the problem. D'oh!
>
> A pox on old fortran for letting you use sym
On Feb 23, 2011, at 3:36 PM, Prentice Bisbal wrote:
> It's using MPI_STATUS_SIZE to dimension istatus before mpif.h is even
> read! Correcting the order of the include and declaration statements
> fixed the problem. D'oh!
A pox on old fortran for letting you use symbols before they are declared..
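For anyone hitting the same #6406 error, a small sketch of the before/after
ordering described above (the subroutine and its arguments are made up for
illustration):

  ! Failing pattern: istatus is dimensioned with MPI_STATUS_SIZE before
  ! mpif.h has been read, so the parameter is not yet defined:
  !
  !     integer ierr, istatus(MPI_STATUS_SIZE)
  !     include 'mpif.h'
  !
  ! Corrected order: include mpif.h first, then declare with its parameters.
  subroutine recv_example(buf, n, src)
    implicit none
    include 'mpif.h'
    integer :: n, src, ierr
    integer :: istatus(MPI_STATUS_SIZE)
    double precision :: buf(n)

    call MPI_RECV(buf, n, MPI_DOUBLE_PRECISION, src, MPI_ANY_TAG, &
                  MPI_COMM_WORLD, istatus, ierr)
  end subroutine recv_example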
Jeff Squyres wrote:
> On Feb 23, 2011, at 2:20 PM, Prentice Bisbal wrote:
>
>> I suspected that and checked for it earlier. I just double-checked, and
>> that is not the problem. Out of the two source files, 'include mpif.h'
>> appears once, and 'use mpi' does not appear at all. I'm beginning to
>
On Feb 23, 2011, at 2:20 PM, Prentice Bisbal wrote:
> I suspected that and checked for it earlier. I just double-checked, and
> that is not the problem. Out of the two source files, 'include mpif.h'
> appears once, and 'use mpi' does not appear at all. I'm beginning to
> suspect it is the compiler
I assume you built Open MPI itself with ifort, correct?
On Wed, Feb 23, 2011 at 11:20 AM, Prentice Bisbal wrote:
>
>
> Jeff Squyres wrote:
> > I thought the error was this:
> >
> > $ mpif90 -o simplex simplexmain579m.for simplexsubs579
> > /usr/local/openmpi-1.2.8/intel-11/x86_64/include/mpif-config.
Jeff Squyres wrote:
> I thought the error was this:
>
> $ mpif90 -o simplex simplexmain579m.for simplexsubs579
> /usr/local/openmpi-1.2.8/intel-11/x86_64/include/mpif-config.h(88):
> error #6406: Conflicting attributes or multiple declaration of name.
> [MPI_STATUS_SIZE]
> parameter (MPI_ST
Tim Prince wrote:
> On 2/23/2011 8:27 AM, Prentice Bisbal wrote:
>> Jeff Squyres wrote:
>>> On Feb 23, 2011, at 9:48 AM, Tim Prince wrote:
>>>
> I agree with your logic, but the problem is where the code containing
> the error is coming from - it's coming from a header file that's a
>>>
We use srun internally to start the remote daemons. We construct a
nodelist from the user-specified inputs, and pass that to srun so it
knows where to start the daemons.
On Wednesday, February 23, 2011, Henderson, Brent
wrote:
> SLURM seems to be doing this in the case of a regular srun: [brent@
I thought the error was this:
$ mpif90 -o simplex simplexmain579m.for simplexsubs579
/usr/local/openmpi-1.2.8/intel-11/x86_64/include/mpif-config.h(88):
error #6406: Conflicting attributes or multiple declaration of name.
[MPI_STATUS_SIZE]
parameter (MPI_STATUS_SIZE=5)
-^
simp
SLURM seems to be doing this in the case of a regular srun:
[brent@node1 mpi]$ srun -N 2 -n 4 env | egrep
SLURM_NODEID\|SLURM_PROCID\|SLURM_LOCALID | sort
SLURM_LOCALID=0
SLURM_LOCALID=0
SLURM_LOCALID=1
SLURM_LOCALID=1
SLURM_NODEID=0
SLURM_NODEID=0
SLURM_NODEID=1
SLURM_NODEID=1
SLURM_PROCID=0
SLU
On 2/23/2011 8:27 AM, Prentice Bisbal wrote:
Jeff Squyres wrote:
On Feb 23, 2011, at 9:48 AM, Tim Prince wrote:
I agree with your logic, but the problem is where the code containing
the error is coming from - it's coming from a header file that's a
part of Open MPI, which makes me think this
Jeff Squyres wrote:
> On Feb 23, 2011, at 9:48 AM, Tim Prince wrote:
>
>>> I agree with your logic, but the problem is where the code containing
>>> the error is coming from - it's coming from a header file that's a
>>> part of Open MPI, which makes me think this is a compiler error, since
>>> I'
Resource managers generally frown on the idea of any program passing
RM-managed envars from one node to another, and this is certainly true of
SLURM. The reason is that the RM reserves those values for its own use when
managing remote nodes. For example, if you got an allocation and then used
mpiru
Hi Everyone, I have an OpenMPI/SLURM-specific question.
I'm using MPI as a launcher for another application I'm working on and it is
dependent on the SLURM environment variables making their way into the a.out's
environment. This works as I need if I use HP-MPI/PMPI, but when I use
OpenMPI, it
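A quick way to see which SLURM variables actually reach each a.out under a
given launcher is to have every rank print them (a sketch using the standard
get_environment_variable intrinsic; the variable names match the srun output
quoted elsewhere in the thread, and empty values mean the launcher did not
propagate them):

  program check_slurm_env
    implicit none
    include 'mpif.h'
    integer :: ierr, rank
    character(len=32) :: nodeid, procid, localid

    call MPI_INIT(ierr)
    call MPI_COMM_RANK(MPI_COMM_WORLD, rank, ierr)

    call get_environment_variable('SLURM_NODEID',  nodeid)
    call get_environment_variable('SLURM_PROCID',  procid)
    call get_environment_variable('SLURM_LOCALID', localid)

    print '(A,I4,A)', 'rank ', rank, ': NODEID=' // trim(nodeid) // &
         ' PROCID=' // trim(procid) // ' LOCALID=' // trim(localid)

    call MPI_FINALIZE(ierr)
  end program check_slurm_env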
On Feb 23, 2011, at 9:48 AM, Tim Prince wrote:
>> I agree with your logic, but the problem is where the code containing
>> the error is coming from - it's coming from a header file that's a
>> part of Open MPI, which makes me think this is a compiler error, since
>> I'm sure there are plenty of p
On 2/23/2011 6:41 AM, Prentice Bisbal wrote:
Tim Prince wrote:
On 2/22/2011 1:41 PM, Prentice Bisbal wrote:
One of the researchers I support is writing some Fortran code that uses
Open MPI. The code is being compiled with the Intel Fortran compiler.
This one line of code:
integer ierr,istatu
Jeff Squyres wrote:
> On Feb 22, 2011, at 4:41 PM, Prentice Bisbal wrote:
>
>> One of the researchers I support is writing some Fortran code that uses
>> Open MPI. The code is being compiled with the Intel Fortran compiler.
>> This one line of code:
>>
>> integer ierr,istatus(MPI_STATUS_SIZE)
>>
>
Tim Prince wrote:
> On 2/22/2011 1:41 PM, Prentice Bisbal wrote:
>> One of the researchers I support is writing some Fortran code that uses
>> Open MPI. The code is being compiled with the Intel Fortran compiler.
>> This one line of code:
>>
>> integer ierr,istatus(MPI_STATUS_SIZE)
>>
>> leads to