On Mon, 13 Jun 2016 19:04:59 -0400
Mehmet Belgin wrote:
> Greetings!
>
> We have not upgraded our OFED stack for a very long time, and are
> still running on an ancient version (1.5.4.1, yeah we know). We are
> now considering a big jump from this version to a tested and stable
> recent version an…
Hi,
Unless your apps link Open MPI statically, you might not need to
rebuild them. You will likely have to (and should) rebuild Open MPI
itself, though.
Cheers,
Gilles
Peter Kjellström wrote:
>On Mon, 13 Jun 2016 19:04:59 -0400
>Mehmet Belgin wrote:
>
>> Greetings!
>>
>> We have not upgraded our OFED stack for a very long time, and …
Hello,
I have the following three single-socket nodes:
node1: 4 GB RAM, 2 cores: rank 0, rank 1
node2: 4 GB RAM, 4 cores: rank 2, rank 3, rank 4, rank 5
node3: 8 GB RAM, 4 cores: rank 6, rank 7, rank 8, rank 9
I have a model that takes an input and produces an output, and I want
to run this model for N possible combinations …
Note that if your program is synchronous, it will run at the speed of
the slowest task (e.g. tasks on node2, with 1 GB per task, will wait
for the other tasks, which have 2 GB per task).
You can use MPI_Comm_split_type to create per-node communicators, and
then work out how much memory is available per task.
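Something like this shows the idea (just a sketch I have not tested,
and it assumes a Linux node where sysconf(_SC_PHYS_PAGES) reports the
physical memory):

#include <mpi.h>
#include <stdio.h>
#include <unistd.h>

int main(int argc, char **argv)
{
    MPI_Comm node_comm;
    int local_rank, local_size;

    MPI_Init(&argc, &argv);

    /* one communicator per node: all the ranks that share memory */
    MPI_Comm_split_type(MPI_COMM_WORLD, MPI_COMM_TYPE_SHARED, 0,
                        MPI_INFO_NULL, &node_comm);
    MPI_Comm_rank(node_comm, &local_rank);
    MPI_Comm_size(node_comm, &local_size);

    /* total physical memory on this node, split across local ranks */
    double node_mem = (double)sysconf(_SC_PHYS_PAGES) *
                      sysconf(_SC_PAGE_SIZE);
    if (local_rank == 0)
        printf("%d local tasks, ~%.1f GB per task\n", local_size,
               node_mem / local_size / (1024.0 * 1024.0 * 1024.0));

    MPI_Comm_free(&node_comm);
    MPI_Finalize();
    return 0;
}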
Hi,
At work, I have some MPI codes that use custom datatypes to call
MPI_File_read with MPI_BOTTOM ... It mostly works, except when the
underlying filesystem is NFS, where it crashes with SIGSEGV.
The attached sample (code + data) works just fine with 1.10.1 on my
NetBSD/amd64 workstation …
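For context, the pattern is roughly the following (a simplified
sketch, not the actual attached sample; the file name and buffer
layout here are made up). The datatype carries absolute addresses, so
the read buffer is MPI_BOTTOM:

#include <mpi.h>

int main(int argc, char **argv)
{
    int          header[4];
    double       payload[100];
    int          lens[2]  = { 4, 100 };
    MPI_Aint     disps[2];
    MPI_Datatype types[2] = { MPI_INT, MPI_DOUBLE };
    MPI_Datatype ftype;
    MPI_File     fh;

    MPI_Init(&argc, &argv);

    /* absolute addresses: MPI_BOTTOM becomes the buffer base */
    MPI_Get_address(header,  &disps[0]);
    MPI_Get_address(payload, &disps[1]);
    MPI_Type_create_struct(2, lens, disps, types, &ftype);
    MPI_Type_commit(&ftype);

    MPI_File_open(MPI_COMM_WORLD, "data.bin", MPI_MODE_RDONLY,
                  MPI_INFO_NULL, &fh);
    MPI_File_read(fh, MPI_BOTTOM, 1, ftype, MPI_STATUS_IGNORE);
    MPI_File_close(&fh);

    MPI_Type_free(&ftype);
    MPI_Finalize();
    return 0;
}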
I dug into this a bit (with some help from others) and found that the spawn
code appears to be working correctly - it is the test in orte/test that is
wrong. The test has been correctly updated in the 2.x and master repos, but we
failed to backport it to the 1.10 series. I have done so this morning…
On 2016-06-14, 3:42 AM, "users on behalf of Peter Kjellström"
wrote:
>On Mon, 13 Jun 2016 19:04:59 -0400
>Mehmet Belgin wrote:
>
>> Greetings!
>>
>> We have not upgraded our OFED stack for a very long time, and are
>> still running on an ancient version (1.5.4.1, yeah we know). We are
>> now cons…
Hello Grigory,
I am not sure what Red Hat does exactly, but when you install the OS
there is always an InfiniBand Support module in the installation
process. We never check/install that module when we do OS
installations because it is usually several versions of OFED behind
(almost obsolete).
Hi Ralph et al.,
Great, thank you for the help. I downloaded the MPI loop spawn test
directly from what I think is the master repo on GitHub:
https://github.com/open-mpi/ompi/blob/master/orte/test/mpi/loop_spawn.c
I am still using the MPI code from 1.10.2, however.
Is that test updated with the …
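For anyone following along, the heart of that test is roughly the
following (my paraphrase, not the actual loop_spawn.c; the child
executable name and loop count are placeholders):

#include <mpi.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    /* spawn a one-process child job over and over, disconnecting
       from each child so it can exit independently */
    for (int i = 0; i < 100; i++) {
        MPI_Comm child;
        MPI_Comm_spawn("./loop_child", MPI_ARGV_NULL, 1, MPI_INFO_NULL,
                       0, MPI_COMM_WORLD, &child, MPI_ERRCODES_IGNORE);
        MPI_Comm_disconnect(&child);
    }

    MPI_Finalize();
    return 0;
}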
Hmm…I’m unable to replicate the problem on my machines. What fabric are you
using? Does the problem go away if you add “-mca btl tcp,sm,self” to the mpirun
cmd line?
> On Jun 14, 2016, at 11:15 AM, Jason Maldonis wrote:
>
> Hi Ralph et al.,
>
> Great, thank you for the help. I downloaded the m…
That message is coming from udcm (the UD connection manager) in the openib btl. It indicates some sort of
failure in the connection mechanism. It can happen if the listening thread no
longer exists or is taking too long to process messages.
-Nathan
On Jun 14, 2016, at 12:20 PM, Ralph Castain wrote:
Hmm…I’m unable to r…
Hi,
I am attempting to use the sm and vader BTLs between a client and
server process, but it seems impossible to use fast transports (i.e.,
not TCP) between two independent groups started with two separate
mpirun invocations. Am I correct, or is there a way to communicate
using shared memory between …
Nope - we don’t currently support cross-job shared memory operations. Nathan
has talked about doing so for vader, but not at this time.
> On Jun 14, 2016, at 12:38 PM, Louis Williams
> wrote:
>
> Hi,
>
> I am attempting to use the sm and vader BTLs between a client and server
> process, but…
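The supported way to connect two separately launched jobs is the
port-based handshake - MPI_Open_port / MPI_Comm_accept on one side,
MPI_Comm_connect on the other - it just won’t run over shared memory.
A rough, untested sketch (passing the port string by hand is only one
way to exchange it):

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    if (argc < 2) {
        /* server: mpirun -np 1 ./a.out, then copy the printed port */
        char port[MPI_MAX_PORT_NAME];
        MPI_Comm client;
        MPI_Open_port(MPI_INFO_NULL, port);
        printf("port: %s\n", port);
        MPI_Comm_accept(port, MPI_INFO_NULL, 0, MPI_COMM_WORLD, &client);
        /* ... talk over the inter-communicator 'client' ... */
        MPI_Comm_disconnect(&client);
        MPI_Close_port(port);
    } else {
        /* client: mpirun -np 1 ./a.out "<port string>" */
        MPI_Comm server;
        MPI_Comm_connect(argv[1], MPI_INFO_NULL, 0, MPI_COMM_WORLD,
                         &server);
        MPI_Comm_disconnect(&server);
    }

    MPI_Finalize();
    return 0;
}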
I'm pretty sure it is based on a version:
[root@perceval2 prepJobs]# modinfo mlx4_core
filename:       /lib/modules/3.10.0-229.20.1.el7.x86_64/kernel/drivers/net/ethernet/mellanox/mlx4/mlx4_core.ko
version:        2.2-1
license:        Dual BSD/GPL
description:    Mellanox ConnectX HCA low-level driver
Ralph, the problem *does* go away if I add "-mca btl tcp,sm,self" to
the mpiexec cmd line. (By the way, I am using mpiexec rather than
mpirun; do you recommend one over the other?) Will you tell me what
this means for me? For example, should I always append these arguments
to mpiexec for my non-tes…
You don’t want to always use those options, as your performance will
take a hit - trading InfiniBand for TCP isn’t a good deal. Sadly, this
is something we need someone like Nathan to address, as it is a bug in
the code base, and in an area I’m not familiar with.
For now, just use TCP so you can move forward …
Thanks, Ralph, for all the help. I will do that until it gets fixed.
Nathan, I am very, very interested in getting this working because we
are developing some cool new code for materials science research. This
is the last piece of the puzzle for us, I believe. I can use TCP for
now, though, of course. While…
Hi Llolsten,
We are trying to keep up with the latest Open MPI as best we can, and
we keep a few old versions around (the oldest is 1.6), on RHEL 6.5.
The OFED upgrade will complement planned OS upgrades to the latest
stable RHEL 6.x (probably 6.7; not sure if we will go with 6.8).
Did you have to rec…