You might want to run some performance testing of your TCP stacks and
the switch -- use a non-MPI application such as NetPIPE (or others --
google around) and see what kind of throughput you get. Try it
between individual server peers and then try running it simultaneously
between a bunch o
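For example, assuming NetPIPE's TCP module is built as the usual NPtcp binary
(the hostnames below are placeholders), a single point-to-point run looks
roughly like:

  serverA$ NPtcp
  serverB$ NPtcp -h serverA

and the aggregate test is just several such pairs started at the same time, to
see whether the switch holds up under load.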
Hi - I have compiled vasp 4.6.34 using the Intel fortran compiler 11.1
with openmpi 1.3.3 on a cluster of 104 nodes running Rocks 5.2, each with two
quad-core Opterons, connected by Gbit Ethernet. Running in parallel on
one node (8 cores) runs very well, faster than any other cluster I have
run it
On Aug 17, 2009, at 2:43 PM, Jeff Squyres wrote:
George / Myricom --
Does the MX MTL abort if it gets a "disconnected" error back from
libmyriexpress?
Short answer: yes.
Long answer:
The messages below indicate that these processes were all trying to
send to cl120. It did not ack their
George / Myricom --
Does the MX MTL abort if it gets a "disconnected" error back from
libmyriexpress?
On Aug 11, 2009, at 7:07 AM, Oskar Enoksson wrote:
I searched the FAQ and google but couldn't come up with a solution to
this problem.
My problem is that when one MPI execution host dies
FWIW, if you use the right mpicc/mpif77/mpif90, you shouldn't need to
specify any of the -L or -l options. Those will automatically be
specified by the wrapper compiler.
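For example, with Open MPI's wrappers, compiling is simply (the application
name is a placeholder):

  $ mpicc my_app.c -o my_app

and if you are curious what the wrapper adds behind the scenes, you can ask it:

  $ mpicc --showme:compile
  $ mpicc --showme:link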
If you cannot use mpicc/mpif77/mpif90, then see this FAQ entry:
http://www.open-mpi.org/faq/?category=mpi-apps#cant-u
Are you initializing your MPI_Info object? Remember that -- at a
minimum -- you need to call MPI_INFO_CREATE on an MPI_Info object (or
pass MPI_INFO_NULL).
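A minimal sketch in C of the two valid patterns (the "wdir" key and its value
are purely illustrative):

  MPI_Info info;
  MPI_Info_create(&info);              /* must precede any MPI_Info_set */
  MPI_Info_set(info, "wdir", "/tmp");  /* example key/value only */
  /* ... pass 'info' to MPI_Comm_spawn / MPI_Comm_spawn_multiple ... */
  MPI_Info_free(&info);

  /* or, if you have no hints to pass, use the predefined handle: */
  /* MPI_Comm_spawn(..., MPI_INFO_NULL, ...); */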
On Aug 17, 2009, at 11:28 AM, Federico Golfrè Andreasi wrote:
Hi!
I have a little code that uses the MPI_Comm_spawn_multiple,
I've us
Hi David
You are quite correct. IIRC, we didn't bother checking the local_err because
we found it to be unreliable - all Torque checks is whether the program exec's.
It doesn't report back an error if it segfaults instantly, for example, or
aborts because it fails to find a required library. So we ad
Lee Amy wrote:
I build a Kerrighed Clusters
Like Lenny, I'm not familiar with such clusters, but...
with 4 nodes so they look like a big SMP
machine. Every node has one processor with a single core.
1) Can MPI programs run on this kind of machine? If
yes, could anyone show me som
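For what it's worth, a minimal MPI program (similar in spirit to the HelloMPI
binary used elsewhere in this thread) that should build with mpicc and run
under mpirun on any system Open MPI supports:

  #include <stdio.h>
  #include <mpi.h>

  int main(int argc, char **argv)
  {
      int rank, size;
      MPI_Init(&argc, &argv);
      MPI_Comm_rank(MPI_COMM_WORLD, &rank);
      MPI_Comm_size(MPI_COMM_WORLD, &size);
      printf("Hello from rank %d of %d\n", rank, size);
      MPI_Finalize();
      return 0;
  }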
We tried to make the most common info_keys the same, but there can be
differences. What info keys are you trying to pass?
2009/8/17 Federico Golfrè Andreasi
> Hi!
>
> I have a little code that uses the MPI_Comm_spawn_multiple,
> I've used it without any problems with the MPICH2 and MVAPICH2
> i
Hi!
I have a little code that uses the MPI_Comm_spawn_multiple,
I've used it without any problems with the MPICH2 and MVAPICH2
implementation of MPI-2,
but with the Open MPI v1.3.3 it throws this error:
*** An error occurred in MPI_Comm_spawn_multiple
*** on communicator MPI_COMM_WORLD
*** MPI_ER
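For reference, a hedged sketch of what such a call typically looks like in C;
the command names, process counts, and the use of MPI_INFO_NULL below are
placeholders, not Federico's actual code:

  #include <mpi.h>

  int main(int argc, char **argv)
  {
      MPI_Comm intercomm;
      char    *cmds[2]  = { "./worker_a", "./worker_b" };   /* hypothetical binaries */
      int      procs[2] = { 2, 2 };
      MPI_Info infos[2] = { MPI_INFO_NULL, MPI_INFO_NULL };  /* or created via MPI_Info_create */
      int      errs[4];                                      /* one code per spawned process */

      MPI_Init(&argc, &argv);
      MPI_Comm_spawn_multiple(2, cmds, MPI_ARGVS_NULL, procs, infos,
                              0, MPI_COMM_WORLD, &intercomm, errs);
      MPI_Finalize();
      return 0;
  }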
jody wrote:
But can you explain what the meaning of the max-slots entry is?
I checked the FAQs
http://www.open-mpi.org/faq/?category=running#simple-spmd-run
http://www.open-mpi.org/faq/?category=running#mpirun-scheduling
but i couldn't find any explanation. (furthermore, in the FAQ it says
"ma
Hi Lenny
After removing the max-slots entries,
i could do
mpirun -np 4 -hostfile th_02 -rf rf_02 ./HelloMPI
without any errors.
But can you explain what the meaning of the max-slots entry is?
I checked the FAQs
http://www.open-mpi.org/faq/?category=running#simple-spmd-run
http://www.open-mp
Can you try not specifying "max-slots" in the hostfile?
If you are the only user of the nodes, there will be no oversubscribing of
the processors.
This one definitely looks like a bug,
but as Ralph said there is an ongoing discussion and work on this
component.
Lenny.
On Mon, Aug 17, 2009 at 2:37
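As an illustration of the distinction under discussion (hostnames invented,
semantics as I understand Open MPI's hostfile format): "slots" is how many
processes mpirun will schedule on a host by default, while "max-slots" is the
hard limit beyond which that host may not be oversubscribed.

  $ cat hostfile
  node01 slots=8 max-slots=8
  node02 slots=8 max-slots=8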
Added to the FAQ -- thanks!
On Aug 12, 2009, at 11:55 AM, Gabriele Fatigati wrote:
Dear OpenMPI developers,
referring to the following problem:
http://openmpi.igor.onlinedirect.bg/faq/?category=troubleshooting#parallel-debugger-attach
Cristiano Calonaci and I have compiled openmpi 1.3.3 with in
Is there an explanation for this?
I believe the word is "bug". :-)
The rank_file mapper has been substantially revised lately - we are
discussing now how much of that revision to bring to 1.3.4 versus the
next major release.
Ralph
On Aug 17, 2009, at 4:45 AM, jody wrote:
Hi Lenny
I th
Hi Lenny
> I think it has something to do with your environment, /etc/hosts, IT setup,
> hostname function return value, etc.
> I am not sure if it has something to do with Open MPI at all.
OK. I just thought this was Open MPI related because i was able to use the
aliases of the hosts (i.e. pla
I think it has something to do with your environment, /etc/hosts, IT setup,
hostname function return value, etc.
I am not sure if it has something to do with Open MPI at all.
Lenny.
On Mon, Aug 17, 2009 at 12:59 PM, jody wrote:
> Hi Lenny
>
> Thanks - using the full names makes it work!
> Is the
Hi Lenny
Thanks - using the full names makes it work!
Is there a reason why the rankfile option treats
host names differently than the hostfile option?
Thanks
Jody
On Mon, Aug 17, 2009 at 11:20 AM, Lenny
Verkhovsky wrote:
> Hi
> This message means
> that you are trying to use host "plankton"
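For illustration, the fix amounts to using the same fully qualified names in
the rankfile that appear in the allocation, e.g. (slot numbers assumed):

  rank 0=plankton.uzh.ch slot=0
  rank 1=plankton.uzh.ch slot=1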
Hi
This message means that you are trying to use host "plankton", which was not
allocated via hostfile or hostlist.
But according to the files and the command line, everything seems fine.
Can you try using the "plankton.uzh.ch" hostname instead of "plankton"?
Thanks
Lenny.
On Mon, Aug 17, 2009 at 10:36 AM,
Hi
When i use a rankfile, i get an error message which i don't understand:
[jody@plankton tests]$ mpirun -np 3 -rf rankfile -hostfile testhosts ./HelloMPI
--
Rankfile claimed host plankton that was not allocated or
oversubscr
Hi
http://www.open-mpi.org/faq/?category=tuning#using-paffinity
I am not familiar with this cluster, but in the FAQ (see the link above) you
can find an example of the rankfile.
another simple example is the following:
$cat rankfile
rank 0=host1 slot=0
rank 1=host2 slot=0
rank 2=host3 slot=0
rank 3=h