I forgot to mention that I tried the hello.c version instead of Java, and it
too failed in a similar manner, but:
1. On a single node with --mca btl ^tcp it went up to 24 procs before
failing
2. On 8 nodes with --mca btl ^tcp it could go only up to 16 procs
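For reference, a minimal hello.c of the kind mentioned above could look like the
sketch below; the actual program used in these runs isn't shown in the thread, so
this is only illustrative:

  #include <stdio.h>
  #include <mpi.h>

  int main(int argc, char **argv)
  {
      int rank, size, len;
      char name[MPI_MAX_PROCESSOR_NAME];

      MPI_Init(&argc, &argv);                 /* the failures reported above happen at startup */
      MPI_Comm_rank(MPI_COMM_WORLD, &rank);
      MPI_Comm_size(MPI_COMM_WORLD, &size);
      MPI_Get_processor_name(name, &len);
      printf("Hello from rank %d of %d on %s\n", rank, size, name);
      MPI_Finalize();
      return 0;
  }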
On Tue, Mar 11, 2014 at 5:06 PM, Saliya
Good catch - thanks
Ralph
On Mar 11, 2014, at 5:29 PM, tmish...@jcity.maeda.co.jp wrote:
>
>
> Ralph, sorry. I missed a problem in the hwloc_base_util.c file.
> The "static int build_map" still depends on the opal_hwloc_topology.
> (Please see attached patch file)
>
> (See attached file: patch
Ralph, sorry. I missed a problem in the hwloc_base_util.c file.
The "static int build_map" still depends on the opal_hwloc_topology.
(Please see attached patch file)
(See attached file: patch.hwloc_base_util)
Tetsuya
> Ralph, sorry for late confirmation. It worked for me, thanks.
>
> Tetsuya
>
Hi, Vince
Have you tried with a different BTL? In particular, have you tried with the TCP
BTL? Please try setting "-mca btl sm,self,tcp" and see if you still run into
the issue.
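For illustration, the suggested test could be run along these lines (the binary
name and process count are placeholders, not taken from your setup):

  mpirun -mca btl sm,self,tcp -np 16 ./your_app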
How is your OMPI configured?
Josh
> From: Vince Grimes
> Subject: [OMPI users] LOCAL QP OPERATION ERROR
> Date
Ralph, sorry for late confirmation. It worked for me, thanks.
Tetsuya
> I fear that would be a bad thing to do as it would disrupt mpirun's
operations. However, I did fix the problem by adding the topology as a
param to the pretty-print functions. Please see:
>
> https://svn.open-mpi.org/trac/o
I just tested with "ml" turned off as you suggested, but unfortunately it
didn't solve the issue.
However, I found that by explicitly setting --mca btl ^tcp the code worked
on up to 4 nodes, each running 8 procs. If I don't specify this, it'll
simply fail even on one node with 8 procs.
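For illustration, the working invocation would be along these lines (the hostfile
name and class name are placeholders):

  mpirun --mca btl ^tcp --hostfile hosts -np 32 java Hello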
Thank you,
Looks like we still have a bug in one of our components -- can you try:
mpirun --mca coll ^ml ...
This will deactivate the "ml" collective component. See if that enables you to
run (this particular component has nothing to do with Java).
On Mar 11, 2014, at 1:33 AM, Saliya Ekanayake wrot
Many thanks to Mehdi and Reuti for your help.
On Tue, Mar 11, 2014 at 3:46 PM, Mehdi Rahmani wrote:
> Hi
> use --hostfile or --machinefile in your command
> mpirun *--hostfile* texthost -np 2 /home/client3/espresso-5.0.2/bin/pw.x
> -in AdnAu.rx.in | tee AdnAu.rx.out
>
>
> On Tue, Mar 11, 2014
On Mar 11, 2014, at 11:22 AM, Åke Sandgren wrote:
>>> ../configure CC=pgcc CXX=pgCC FC=pgf90 F90=pgf90
>>> --prefix=/usr/local/Cluster-Users/fs395/openmpi-1.7.4/pgi-14.2_cuda-6.0RC
>>> --enable-mpirun-prefix-by-default --with-hcoll=$HCOLL_DIR
>>> --with-fca=$FCA_DIR --with-mxm=$MXM_DIR --wit
On 03/11/2014 04:12 PM, Jeff Squyres (jsquyres) wrote:
I don't see the config.log and make.log attached - can you send all the info
requested here (including config.log and config.out):
http://www.open-mpi.org/community/help/
Can you also send "make V=1" output as well?
On Feb 25, 2014,
I don't see the config.log and make.log attached - can you send all the info
requested here (including config.log and config.out):
http://www.open-mpi.org/community/help/
Can you also send "make V=1" output as well?
On Feb 25, 2014, at 6:22 PM, Filippo Spiga wrote:
> Dear all,
>
> I cam
Alex, Jeff
thanks for your answers.
I understand better now how OMPI works.
The results I see now make much more sense.
Best,
Nikola
From: users [users-boun...@open-mpi.org] on behalf of Jeff Squyres (jsquyres)
[jsquy...@cisco.com]
Sent: Tuesday, March
Yes, you're seeing more-or-less the expected behavior. It's a complicated
issue.
Short version: you might want to sprinkle MPI_Test calls throughout your compute
stage to get true overlap.
More detail: an MPI implementation typically uses a "rendezvous" protocol for large
messages, meaning that it sends a small f
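A sketch of the "sprinkle MPI_Test" idea is below; the function and the work loop
are made up for illustration, but the pattern is the point: testing the request
during the compute stage gives the MPI library a chance to progress the large message.

  #include <mpi.h>

  static void compute_chunk(int step) { (void)step; /* stand-in for one slice of the real compute stage */ }

  /* Start a large send, then interleave compute with occasional MPI_Test calls. */
  void overlapped_send(double *buf, int count, int dest, int nsteps)
  {
      MPI_Request req;
      int done = 0;

      MPI_Isend(buf, count, MPI_DOUBLE, dest, 0, MPI_COMM_WORLD, &req);

      for (int step = 0; step < nsteps; step++) {
          compute_chunk(step);
          if (!done)
              MPI_Test(&req, &done, MPI_STATUS_IGNORE);  /* lets the library advance the rendezvous */
      }
      if (!done)
          MPI_Wait(&req, MPI_STATUS_IGNORE);              /* make sure the send completes */
  }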
Seems odd - the Java code is passing all tests on my Linux boxes. A quick
glance shows it failing on memcpy on your machine during MPI_Init, which would
make one suspect either an uninitialized variable or something not getting
loaded correctly.
Oscar, Jose? Any thoughts?
Ralph
On Mar 10, 201
Hi
use --hostfile or --machinefile in your command
mpirun *--hostfile* texthost -np 2 /home/client3/espresso-5.0.2/bin/pw.x
-in AdnAu.rx.in | tee AdnAu.rx.out
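For reference, an Open MPI hostfile is just a list of host names, optionally with
slot counts; the names and counts below are only placeholders:

  # texthost
  node01 slots=4
  node02 slots=4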
On Tue, Mar 11, 2014 at 1:35 PM, raha khalili wrote:
> Dear users
>
> I want to run a quantum espresso program (with passwordless ssh). I
Hi,
Am 11.03.2014 um 11:05 schrieb raha khalili:
> I want to run a quantum espresso program (with passwordless ssh). I prepared
> a hostfile named 'texthost' at my input directory. I get this error when I
> run the program:
>
> texthost:
> # This is a hostfile.
> # I have 4 systems parall
Dear users
I want to run a quantum espresso program (with passwordless ssh). I
prepared a hostfile named 'texthost' at my input directory. I get this
error when I run the program:
texthost:
# This is a hostfile.
# I have 4 systems paralleled by mpich2
# The following nodes are the machines I
Dear Nikola,
you can check this presentation:
http://classic.chem.msu.su/gran/gamess/mp2par.pdf
for the solution we have been using with Firefly (formerly PC GAMESS) for
more than last ten years.
Hope this helps.
Kind regards,
Alex Granovsky
-Original Message-
From: Velickovic N
Just tested that this happens even with the simple Hello.java program given
in OMPI distribution.
I've made a tarball containing details of the error adhering to
http://www.open-mpi.org/community/help/. Please let me know if I have
missed any info necessary.
Thank you,
Saliya
On Mon, Mar 10,