Okay, fixed in r23499. Thanks again...
On Jul 26, 2010, at 9:47 PM, Ralph Castain wrote:
> Doh - yes it should! I'll fix it right now.
>
> Thanks!
>
> On Jul 26, 2010, at 9:28 PM, Philippe wrote:
>
>> Ralph,
>>
>> I was able to test the generic module and it seems to be working.
>>
>> one q
Thanks a lot!
Now, for the env variable "OMPI_MCA_orte_nodes", what do I put exactly? Our
nodes have a short/long name (it's RHEL 5.x, so the hostname command
returns the long name) and at least 2 IP addresses.
p.
On Tue, Jul 27, 2010 at 12:06 AM, Ralph Castain wrote:
> Okay, fixed in r23499. Thanks again...
Use what hostname returns - don't worry about IP addresses as we'll discover
them.
On Jul 26, 2010, at 10:45 PM, Philippe wrote:
> Thanks a lot!
>
> Now, for the env variable "OMPI_MCA_orte_nodes", what do I put exactly? Our
> nodes have a short/long name (it's RHEL 5.x, so the hostname command
> returns the long name) and at least 2 IP addresses.
To clarify from your previous email: you had your code working with
OMPI 1.4.1 but an older version of OFED? Then you upgraded to OFED 1.4
and things stopped working? It sounds like your current system is set up
with OMPI 1.4.2 and OFED 1.5. Anyway, I am a little confused as to
when things
On Jul 26, 2010, at 11:06 PM, Hugo Gagnon wrote:
> 8 integer, parameter :: dp = kind(1.d0)
> 9 real(kind=dp) :: inside(5), outside(5)
I'm not a fortran expert -- is kind(1.d0) really double precision? According
to http://gcc.gnu.org/onlinedocs/gcc-3.4.6/g77/Kind-Notation.htm
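For what it's worth, a quick way to check this on the machine in question is to
compare kind(1.d0) against selected_real_kind(15, 307), the standard request for
an IEEE double-precision kind. This is only a minimal sketch, not taken from
Hugo's code:

  program kindcheck
    implicit none
    ! dp as declared in the snippet above; dp15 is the kind that guarantees
    ! at least 15 decimal digits and a decimal range of 307, i.e. what
    ! "double precision" is normally expected to provide.
    integer, parameter :: dp   = kind(1.d0)
    integer, parameter :: dp15 = selected_real_kind(15, 307)
    print *, 'kind(1.d0)                 =', dp
    print *, 'selected_real_kind(15,307) =', dp15
    print *, 'precision(1.0_dp)          =', precision(1.0_dp)
    print *, 'range(1.0_dp)              =', range(1.0_dp)
  end program kindcheck

With gfortran and ifort both kind values normally come out as 8; if they differ
on the iMac, that would point to a kind mismatch rather than to MPI_Allreduce itself.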
On Tue, 2010-07-27 at 08:11 -0400, Jeff Squyres wrote:
> On Jul 26, 2010, at 11:06 PM, Hugo Gagnon wrote:
>
> > 8 integer, parameter :: dp = kind(1.d0)
> > 9 real(kind=dp) :: inside(5), outside(5)
>
> I'm not a fortran expert -- is kind(1.d0) really double precision? According to http://gcc.gnu.org/onlinedocs/gcc-3.4.6/g77/Kind-Notation.htm
On Tue, Jul 27, 2010 at 08:11:39AM -0400, Jeff Squyres wrote:
> On Jul 26, 2010, at 11:06 PM, Hugo Gagnon wrote:
>
> > 8 integer, parameter :: dp = kind(1.d0)
> > 9 real(kind=dp) :: inside(5), outside(5)
>
> I'm not a fortran expert -- is kind(1.d0) really double precision? According to http://gcc.gnu.org/onlinedocs/gcc-3.4.6/g77/Kind-Notation.htm
Both 1.4.1 and 1.4.2 exhibit the same behaviors w/ OFED 1.5. It wasn't
OFED 1.4 after all (after some more digging around through our update
logs).
All of the ibv_*_pingpong tests appear to work correctly. I'll try
running a few more tests (np=2 over two nodes, some of the OSU
benchmarks, etc.)
After running on two processors across two nodes, the problem occurs
much earlier during execution:
(gdb) bt
#0 opal_sys_timer_get_cycles ()
at ../opal/include/opal/sys/amd64/timer.h:46
#1 opal_timer_base_get_cycles ()
at ../opal/mca/timer/linux/timer_linux.h:31
#2 opal_progress () at runtime/o
Can you try a simple point-to-point program?
--td
Brian Smith wrote:
After running on two processors across two nodes, the problem occurs
much earlier during execution:
(gdb) bt
#0 opal_sys_timer_get_cycles ()
at ../opal/include/opal/sys/amd64/timer.h:46
#1 opal_timer_base_get_cycles ()
at .
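For reference, a simple point-to-point test along the lines Terry suggests might
look like the sketch below (illustrative only, not Brian's application; it assumes
exactly two ranks launched across the two nodes):

  program pingpong_test
    use mpi
    implicit none
    integer :: ierr, rank, nproc, i
    integer :: status(MPI_STATUS_SIZE)
    double precision :: buf(1024)

    call MPI_Init(ierr)
    call MPI_Comm_rank(MPI_COMM_WORLD, rank, ierr)
    call MPI_Comm_size(MPI_COMM_WORLD, nproc, ierr)

    buf = dble(rank)
    do i = 1, 1000
       if (rank == 0) then
          call MPI_Send(buf, size(buf), MPI_DOUBLE_PRECISION, 1, i, MPI_COMM_WORLD, ierr)
          call MPI_Recv(buf, size(buf), MPI_DOUBLE_PRECISION, 1, i, MPI_COMM_WORLD, status, ierr)
       else if (rank == 1) then
          call MPI_Recv(buf, size(buf), MPI_DOUBLE_PRECISION, 0, i, MPI_COMM_WORLD, status, ierr)
          call MPI_Send(buf, size(buf), MPI_DOUBLE_PRECISION, 0, i, MPI_COMM_WORLD, ierr)
       end if
    end do
    if (rank == 0) print *, 'ping-pong completed'
    call MPI_Finalize(ierr)
  end program pingpong_test

Running it with something like mpirun -np 2 --hostfile <hosts> ./pingpong_test over
the same two nodes should show fairly quickly whether plain send/receive over the
interconnect hangs in the same place.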
I am out of the office until 08/02/2010.
I will respond to your message when I return.
Note: This is an automated response to your message "users Digest, Vol
1642, Issue 1" sent on 7/27/10 9:32:11 AM.
This is the only notification you will receive while this person is away.
I appreciate your replies, but my question has to do with the function
MPI_Allreduce of Open MPI built on Mac OS X 10.6 with ifort (the Intel
Fortran compiler).
--
Hugo Gagnon
On Tue, 27 Jul 2010 13:23 +0100, "Anton Shterenlikht"
wrote:
> On Tue, Jul 27, 2010 at 08:11:39AM -0400, Jeff Squyres wrote:
Hi, Terry,
I just ran through the entire gamut of OSU/OMB tests -- osu_bibw,
osu_latency, osu_multi_lat, osu_bw, osu_alltoall, osu_mbw_mr, osu_bcast -- on
various nodes on one of our clusters (at least two nodes per job) with
version 1.4.2 and OFED 1.5 (executables and MPI compiled with gcc 4.4.2)
and haven't
So now I have a new question.
When I run my server and a lot of clients on the same machine,
everything looks fine.
But when I try to run the clients on several machines, the most
frequent scenario is:
* server is started on machine A
* X (= 1, 4, 10, ..) clients are started on machine B and they co
Try mpi_real8 for the type in allreduce
On 7/26/10, Hugo Gagnon wrote:
> Hello,
>
> When I compile and run this code snippet:
>
> 1 program test
> 2
> 3 use mpi
> 4
> 5 implicit none
> 6
> 7 integer :: ierr, nproc, myrank
> 8 integer, parameter :: dp = kind(1.d0)
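Spelled out, the suggestion is to pass MPI_REAL8 instead of MPI_DOUBLE_PRECISION
in the reduction. A minimal sketch built around the variables in the quoted
snippet (the MPI_SUM operation and the initial values are assumptions based on
later messages in the thread, since the quote is cut off):

  program allreduce_real8
    use mpi
    implicit none
    integer, parameter :: dp = kind(1.d0)
    integer :: ierr, nproc, myrank
    real(kind=dp) :: inside(5), outside(5)

    call MPI_Init(ierr)
    call MPI_Comm_size(MPI_COMM_WORLD, nproc, ierr)
    call MPI_Comm_rank(MPI_COMM_WORLD, myrank, ierr)

    inside = (/ 1.d0, 2.d0, 3.d0, 4.d0, 5.d0 /)
    ! MPI_REAL8 instead of MPI_DOUBLE_PRECISION; MPI_SUM assumed
    call MPI_Allreduce(inside, outside, 5, MPI_REAL8, MPI_SUM, &
                       MPI_COMM_WORLD, ierr)
    if (myrank == 0) print *, outside

    call MPI_Finalize(ierr)
  end program allreduce_real8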
This falls outside my purview - I would suggest you post this question with
a different subject line, specifically mentioning the failure of intercomm_merge,
so that it attracts the attention of those with knowledge of that area.
On Jul 27, 2010, at 9:30 AM, Grzegorz Maj wrote:
> So now I have a new question.
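For whoever picks this up under a new subject line, here is a rough sketch of the
server-side accept-and-merge pattern that seems to be in use. It is an assumption
reconstructed from the thread, not Grzegorz's actual code; it assumes the server is
started as a single MPI process, and the client side would use MPI_Comm_connect plus
the same MPI_Intercomm_merge with high set to .true.:

  program server_sketch
    use mpi
    implicit none
    integer, parameter :: nclients = 4   ! illustrative; the thread mentions 1, 4, 10, ... clients
    integer :: ierr, i, intercomm, newcomm, current_intracomm
    character(len=MPI_MAX_PORT_NAME) :: port_name

    call MPI_Init(ierr)
    call MPI_Open_port(MPI_INFO_NULL, port_name, ierr)
    print *, 'server listening on port: ', trim(port_name)

    current_intracomm = MPI_COMM_WORLD
    do i = 1, nclients
       ! MPI_Comm_accept is collective over current_intracomm, so every
       ! client merged in earlier iterations must also call it.
       call MPI_Comm_accept(port_name, MPI_INFO_NULL, 0, &
                            current_intracomm, intercomm, ierr)
       ! fold the new client into the growing intracommunicator;
       ! .false. orders the existing group "low" in the result
       call MPI_Intercomm_merge(intercomm, .false., newcomm, ierr)
       call MPI_Comm_free(intercomm, ierr)
       current_intracomm = newcomm
    end do

    call MPI_Close_port(port_name, ierr)
    call MPI_Finalize(ierr)
  end program server_sketch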
No, we really shouldn't. Having just fought with a program using usleep(1)
which was behaving even worse, working around this particular inability of the
Linux kernel development team to do something sane will only lead to more pain.
There are no good options, so the best option is to not try
Based on your output shown here, there is absolutely nothing wrong
(yet). Both processes are in the same function and do what they are
supposed to do.
However, I am fairly sure that the client process bt that you show is
already part of current_intracomm. Could you try to create a bt of the
process
Also, the application I'm having trouble with appears to work fine with
MVAPICH2 1.4.1, if that is any help.
-Brian
On Tue, 2010-07-27 at 10:48 -0400, Terry Dontje wrote:
> Can you try a simple point-to-point program?
>
> --td
>
> Brian Smith wrote:
> > After running on two processors across two nodes, the problem occurs
Hi Hugo, David, Jeff, Terry, Anton, list
I suppose maybe we're guessing that somehow on Hugo's iMac
MPI_DOUBLE_PRECISION may not have as many bytes as dp = kind(1.d0),
hence the segmentation fault on MPI_Allreduce.
Question:
Is there a simple way to check the number of bytes associated with each type?
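One simple way to check that, as a sketch (it assumes the test is compiled with the
same mpif90 wrapper that built the application), is to ask both the compiler and the
MPI library directly:

  program sizecheck
    use mpi
    implicit none
    integer, parameter :: dp = kind(1.d0)
    real(kind=dp) :: x
    integer :: ierr, tsize, vsize

    call MPI_Init(ierr)
    ! bytes this Open MPI build associates with MPI_DOUBLE_PRECISION
    call MPI_Type_size(MPI_DOUBLE_PRECISION, tsize, ierr)
    ! bytes the compiler uses for a real(kind=dp) variable
    call MPI_Sizeof(x, vsize, ierr)
    print *, 'MPI_DOUBLE_PRECISION:', tsize, 'bytes   kind(1.d0):', vsize, 'bytes'
    call MPI_Finalize(ierr)
  end program sizecheck

If those two numbers disagree (say 8 versus 16), that mismatch would go a long way
toward explaining a segfault in MPI_Allreduce.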
With this earlier failure, do you know how many messages may have been
transferred between the two processes? Is there a way to narrow this
down to a small piece of code? Do you have totalview or ddt at your
disposal?
--td
Brian Smith wrote:
Also, the application I'm having trouble with appears to work fine with MVAPICH2 1.4.1, if that is any help.
Hi,
Even when executing an Open MPI hello world, I get this error, which is then
ignored.
fcluster@fuego:~$ mpirun --hostfile myhostfile -np 5 testMPI/hola
[agua:02357] mca: base: component_find: unable to open
/opt/openmpi-1.4.2/lib/openmpi/mca_btl_ofud: perhaps a missing symbol, or
compiled for a
Hi Cristobal
Try using the --prefix option of mpiexec.
"man mpiexec" is your friend!
Alternatively, append the OpenMPI directories to your
PATH *and* LD_LIBRARY_PATH in your .bashrc/.cshrc file.
See this FAQ:
http://www.open-mpi.org/faq/?category=running#adding-ompi-to-path
I hope it helps,
Gus
Thanks Gus,
but I already had the paths:
fcluster@agua:~$ echo $PATH
/opt/openmpi-1.4.2/bin:/opt/cfc/sge/bin/lx24-amd64:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games
fcluster@agua:~$ echo $LD_LIBRARY_PATH
/opt/openmpi-1.4.2/lib:
fcluster@agua:~$
Even weirder, errors come s
I compiled with the absolute path just in case:
fcluster@agua:~$ /opt/openmpi-1.4.2/bin/mpicc testMPI/hello.c -o
testMPI/hola
fcluster@agua:~$ mpirun --hostfile myhostfile -np 5 testMPI/hola
[agua:03547] mca: base: component_find: unable to open
/opt/openmpi-1.4.2/lib/openmpi/mca_btl_ofud: perhaps a missing
Hi Cristobal
Does it run on the head node alone?
(Fuego? Agua? Acatenango?)
Try to put only the head node on the hostfile and execute with mpiexec.
This may help sort out what is going on.
Hopefully it will run on the head node.
Also, do you have InfiniBand connecting the nodes?
The error message
I did and it runs now, but the result is wrong: outside is still 1.d0,
2.d0, 3.d0, 4.d0, 5.d0
How can I make sure to compile OpenMPI so that datatypes such as
mpi_double_precision behave as they "should"?
Are there flags during the OpenMPI building process or something?
Thanks,
--
Hugo Gagnon
On Tue, Jul 27, 2010 at 7:29 PM, Gus Correa wrote:
> Hi Cristobal
>
> Does it run only on the head node alone?
> (Fuego? Agua? Acatenango?)
> Try to put only the head node on the hostfile and execute with mpiexec.
>
--> I will try only with the head node, and post results back
> This may help sort out what is going on.
Hi,
I have a performance issue on a parallel machine composed of nodes of 16
procs each. The application is launched on multiples of 16 procs for given
numbers of nodes.
I was told by people using MX MPI with this machine to attach a script to
mpiexec, which runs 'numactl', in order to make
On Tue, 2010-07-27 at 16:19 -0400, Gus Correa wrote:
> Hi Hugo, David, Jeff, Terry, Anton, list
>
> I suppose maybe we're guessing that somehow on Hugo's iMac
> MPI_DOUBLE_PRECISION may not have as many bytes as dp = kind(1.d0),
> hence the segmentation fault on MPI_Allreduce.
>
> Question:
>
>