+1
looks good.
On Wed, Apr 16, 2014 at 4:35 PM, Åke Sandgren wrote:
> On 04/16/2014 02:25 PM, Åke Sandgren wrote:
>
>> Hi!
>>
>> Found this problem when building r31409 with Pathscale 5.0
>>
>> pshmem_barrier.c:81:6: error: redeclaration of 'pshmem_barrier_all' must
>> have the 'overloadable' at
Hi,
I committed your patch to the trunk.
thanks
M
On Wed, Apr 16, 2014 at 6:49 PM, Mike Dubman wrote:
> +1
> looks good.
>
>
> On Wed, Apr 16, 2014 at 4:35 PM, Åke Sandgren
> wrote:
>
>> On 04/16/2014 02:25 PM, Åke Sandgren wrote:
>>
>>> Hi!
>>>
Hi Timur,
Which "configure" line did you use? ikrit may be compiled out if
"--with-mxm=/opt/mellanox/mxm" was not provided.
Can you please attach your config.log?
Thanks
On Wed, Apr 23, 2014 at 3:10 PM, Тимур Исмагилов wrote:
> Hi!
> I am trying to build openmpi 1.8 with Open SHMEM and Mellanox M
I think it comes from PMI API used by OMPI/SLURM.
SLURM's libpmi is trying to control stdout/stdin which is already
controlled by OMPI.
On Tue, May 27, 2014 at 8:31 PM, Ralph Castain wrote:
> I'm unaware of any OMPI error message like that - might be caused by
> something in libevent as that co
seems oshmem_info uses uninitialized value.
we will check it, thanks for report.
On Thu, Jun 5, 2014 at 6:56 PM, Timur Ismagilov wrote:
> Hello!
>
> I am using Open MPI v1.8.1.
>
> $oshmem_info -a --parsable | grep spml_ikrit_np
>
> mca:spml:ikrit:param:spml_ikrit_np:value:1620524368 (always n
could you please provide command line ?
On Fri, Jun 6, 2014 at 10:56 AM, Timur Ismagilov wrote:
> Hello!
>
> I am using Open MPI v1.8.1 in
> example program hello_oshmem.cpp.
>
> When I put spml_ikrit_np = 1000 (more than 4) and run task on 4 (2,1)
> nodes, I get an:
> in out file:
> No availa
fixed here: https://svn.open-mpi.org/trac/ompi/changeset/31962
Thanks for report.
On Thu, Jun 5, 2014 at 7:45 PM, Mike Dubman
wrote:
> seems oshmem_info uses uninitialized value.
> we will check it, thanks for report.
>
>
> On Thu, Jun 5, 2014 at 6:56 PM, Timur Ismagilov
>
could you please attach output of "ibv_devinfo -v" and "ofed_info -s"
Thx
On Sat, Jun 7, 2014 at 12:53 AM, Tim Miller wrote:
> Hi Josh,
>
> I asked one of our more advanced users to add the "-mca btl_openib_if_include
> mlx4_0:1" argument to his job script. Unfortunately, the same error
> occur
btw, the output comes from ompi's libevent and not from slurm itself (sorry
about confusion and thanks to Yossi for catching this)
opal/mca/event/libevent2021/libevent/epoll.c:
event_warn("Epoll %s(%d) on fd %d failed. Old events were %d; read change
was %d (%s); write change was %d (%s)",
opal/
Hi
what ofed/mofed are you using? what HCA, distro and command line?
M
On Wed, Jun 25, 2014 at 1:40 AM, Maxime Boissonneault <
maxime.boissonnea...@calculquebec.ca> wrote:
> What are your threading options for OpenMPI (when it was built) ?
>
> I have seen OpenIB BTL completely lock when some le
please add following flags to mpirun "--mca plm_base_verbose 10
--debug-daemons" and attach output.
Thx
On Wed, Jul 16, 2014 at 11:12 AM, Timur Ismagilov
wrote:
> Hello!
> I have Open MPI v1.9a1r32142 and slurm 2.5.6.
>
> I can not use mpirun after salloc:
>
> $salloc -N2 --exclusive -p test -J
Hi,
The openib btl is not compatible with "thread multiple" paradigm.
You need to use mxm (lib on top of verbs) for ompi and threads.
mxm is part of MOFED or you can download HPCX package (tarball of ompi +
mxm) from http://mellanox.com/products/hpcx
M
On Thu, Jul 24, 2014 at 1:06 PM, madhurima
You can use hybrid mode.
following code works for me with ompi 1.8.2
#include
#include
#include "shmem.h"
#include "mpi.h"
int main(int argc, char *argv[])
{
MPI_Init(&argc, &argv);
start_pes(0);
{
int version = 0;
int subversion = 0;
int num_proc = 0;
Hi,
what ofed version do you use?
(ofed_info -s)
On Sun, Aug 17, 2014 at 7:16 PM, Rio Yokota wrote:
> I have recently upgraded from Ubuntu 12.04 to 14.04 and OpenMPI gives the
> following warning upon execution, which did not appear before the upgrade.
>
> WARNING: It appears that your OpenFabr
most likely you have an old ofed installed, which does not have this parameter.
try:
#modinfo mlx4_core
and see if it is there.
I would suggest installing the latest OFED or Mellanox OFED.
On Mon, Aug 18, 2014 at 9:53 PM, Rio Yokota wrote:
> I get "ofed_info: command not found". Note that I don't install t
so, it seems you have old ofed w/o this parameter.
Can you install latest Mellanox ofed? or check which community ofed has it?
On Tue, Aug 19, 2014 at 9:34 AM, Rio Yokota wrote:
> Here is what "modinfo mlx4_core" gives
>
> filename:
>
> /lib/modules/3.13.0-34-generic/kernel/drivers/net/ether
btw, we get same error in v1.8 branch as well.
On Wed, Aug 20, 2014 at 8:06 PM, Ralph Castain wrote:
> It was not yet fixed - but should be now.
>
> On Aug 20, 2014, at 6:39 AM, Timur Ismagilov wrote:
>
> Hello!
>
> As i can see, the bug is fixed, but in Open MPI v1.9a1r32516 i still have
> t
Hi Filippo,
I think you can use SLURM_LOCALID var (at least with slurm v14.03.4-2)
$srun -N2 --ntasks-per-node 3 env |grep SLURM_LOCALID
SLURM_LOCALID=1
SLURM_LOCALID=2
SLURM_LOCALID=0
SLURM_LOCALID=0
SLURM_LOCALID=1
SLURM_LOCALID=2
$
Kind Regards,
M
On Thu, Aug 21, 2014 at 9:27 PM, Ralph Cas
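A minimal sketch of how a per-task wrapper script might read SLURM_LOCALID, as suggested in the reply above. The helper name local_rank and the fallback value 0 (used when the script is not started by srun) are assumptions made for illustration only.

```shell
#!/bin/sh
# Hypothetical helper: print this task's node-local index.
# srun sets SLURM_LOCALID for each task; outside of srun it is unset,
# so this sketch falls back to 0 (an assumption, not Slurm behaviour).
local_rank() {
    printf '%s\n' "${SLURM_LOCALID:-0}"
}

printf 'local rank on %s: %s\n' "$(uname -n)" "$(local_rank)"
```

Such a script would typically be launched through srun (for example with the "-N2 --ntasks-per-node 3" options shown above), so that each task sees its own value.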
2 AM, Timur Ismagilov > wrote:
>
> Do I have any opportunity to run mpi jobs?
>
>
> Wed, 20 Aug 2014 10:48:38 -0700 from Ralph Castain >:
>
> yes, i know - it is cmr'd
>
> On Aug 20, 2014, at 10:26 AM, Mike Dubman
> wrote:
>
> btw, we get same error
btw, you may want to use latest mxm v3.1 which is part of hpcx package
http://www.mellanox.com/products/hpcx
On Thu, Aug 28, 2014 at 4:10 AM, Brock Palen wrote:
> Brice, et al.
>
> Thanks a lot for this info. We are setting up new builds of OMPI 1.8.2
> with knem and mxm 3.0,
>
> If we have qu
Hi,
yep - you can compile OFED/MOFED in the $HOME/ofed dir and point OMPI
configure to it with "--with-verbs=/path/to/ofed/install".
You can download and
compile
"libibverbs","libibumad","libibmad","librdmacm","opensm","infiniband-diags"
packages only with custom prefix.
M
On Fri, Oct 10, 2014
Hi,
the default memheap size is 256MB, you can override it with oshrun -x
SHMEM_SYMMETRIC_HEAP_SIZE=512M ...
On Mon, Nov 17, 2014 at 3:38 PM, Timur Ismagilov wrote:
> Hello!
> Why does shmalloc return NULL when I try to allocate 512MB?
> When I try to allocate 256MB, all is fine.
> I use Open MPI
Hi Siegmar,
Could you please check the /etc/mtab file for real FS type for the
following mount points:
get_mounts: dirs[16]:/misc fs:autofs nfs:No
get_mounts: dirs[17]:/net fs:autofs nfs:No
get_mounts: dirs[18]:/home fs:autofs nfs:No
could you please check if mntent.h and paths.h were detected by
Hi,
also - you can use the mxm library (which supports RC, UD, DC and mixes) and comes
as part of Mellanox OFED.
The version for community OFED is also available from
http://mellanox.com/products/hpcx
On Fri, Jan 9, 2015 at 4:03 PM, Sasso, John (GE Power & Water, Non-GE) <
john1.sa...@ge.com> wrote:
>
Hi,
openib btl does not support this thread model.
You can use OMPI w/ mxm (-mca mtl mxm) and multiple thread mode in the 1.8.x
series or (-mca pml yalla) in the master branch.
M
On Mon, Mar 30, 2015 at 9:09 AM, Subhra Mazumdar
wrote:
> Hi,
>
> Can MPI_THREAD_MULTIPLE and openib btl work together
Hey Eloi,
What HCA card do you have ? Can you post code/instructions howto reproduce
it?
10x
Mike
On Mon, Aug 9, 2010 at 5:22 PM, Eloi Gaudry wrote:
> Hi,
>
> Could someone have a look on these two different error messages ? I'd like
> to know the reason(s) why they were displayed and their act
Hi,
What interconnect and command line do you use? For InfiniBand openib
component there is a known issue with large transfers (2GB)
https://svn.open-mpi.org/trac/ompi/ticket/2623
try disabling memory pinning:
http://www.open-mpi.org/faq/?category=openfabrics#large-message-leave-pinned
regards
It happens for us on RHEL 6.0
On Tue, Jan 17, 2012 at 3:46 AM, Ralph Castain wrote:
> Well, I'm afraid I can't replicate your report. It runs fine for me.
>
> Sent from my iPad
>
> On Jan 16, 2012, at 4:25 PM, Ralph Castain wrote:
>
> > Hprobably a bug. I haven't tested that branch yet.
it was compiled with the same ompi.
We see it occasionally on different clusters with different ompi folders.
(all v1.5)
On Thu, Jan 19, 2012 at 5:44 PM, Ralph Castain wrote:
> I didn't commit anything to the v1.5 branch yesterday - just the trunk.
>
> As I told Mike off-list, I think it may hav
so far did not happen yet - will report if it does.
On Tue, Jan 24, 2012 at 5:10 PM, Jeff Squyres wrote:
> Ralph's fix has now been committed to the v1.5 trunk (yesterday).
>
> Did that fix it?
>
>
> On Jan 22, 2012, at 3:40 PM, Mike Dubman wrote:
>
> > it
you need latest OMPI 1.6.x and latest MXM (
ftp://bgate.mellanox.com/hpc/mxm/v1.1/mxm_1.1.1067.tar)
On Wed, May 9, 2012 at 6:02 AM, Derek Gerstmann
wrote:
> What versions of OpenMPI and the Mellanox MXM libraries have been tested
> and verified to work?
>
> We are currently trying to build Open
17820.58
> 524288 4604.16 8781.74
> 1048576 4635.51 4420.77
> 2097152 3575.17 1704.78
> 4194304 2828.19 674.29
>
> Thanks!
>
> -[dg]
>
> Derek Gerstmann, PhD Student
>
Hi,
Could you please download latest mxm from
http://www.mellanox.com/products/mxm/ and retry?
The mxm version which comes with OFED 1.5.3 was tested with OMPI 1.6.0.
Regards
M
On Wed, Aug 22, 2012 at 2:22 PM, Pavel Mezentsev
wrote:
> I've tried to launch the application on nodes with QDR Infini
You need mxm-1.1.3a5e745-1.x86_64-rhel6u3.rpm
On Wed, Nov 28, 2012 at 7:44 PM, Joseph Farran wrote:
> mxm-1.1.3a5e745-1.x86_64-rhel6u3.rpm
>
Hi Joseph,
I guess you install MOFED under /usr, is that right?
Could you please specify "--with-openib=/usr" parameter during ompi
"configure" stage?
10x
M
On Fri, Nov 30, 2012 at 1:11 AM, Joseph Farran wrote:
> Hi YK:
>
> Yes, I have those installed but they are newer versions:
>
> # rpm -qa |
Hi,
The mxm which is part of MOFED 1.5.3 supports OMPI 1.6.0.
The mxm upgrade is needed to work with OMPI 1.6.3+
Please remove mxm from your cluster nodes (rpm -e mxm)
Install latest from http://mellanox.com/products/mxm/
Compile ompi 1.6.3, add following to its configure line: ./configure
--wi
a=/opt/mellanox/fca\
> --with-mxm-libdir=/opt/mellanox/mxm/lib \
> --with-mxm=/opt/mellanox/mxm\
> --prefix=/data/openmpi-1-6.3
>
> Please advise,
> Joseph
>
>
>
>
>
>
> On 12/1/2012 11:39 PM, Mike Dubman wrote:
>
> Hi Jos
\
>> --enable-openib-connectx-xrc\
>> --enable-mpi-thread-multiple\
>> --with-threads \
>> --with-hwloc\
>> --enable-heterogeneous \
>> --wi
please redownload from
http://mellanox.com/downloads/hpc/mxm/v1.1/mxm-latest.tar
it contains binaries compiled with mofed 1.5.3-3.1.0
M
On Sun, Dec 2, 2012 at 12:13 PM, Mike Dubman wrote:
>
> It seems that your active mofed is 1.5.3-3.1.0, while installed mxm was
> compiled with 1.
ohh.. you have MOFED 1.5.4.1, thought it was 1.5.3-3.1.0
will provide you a link to mxm package compiled with this MOFED version
(thanks to no ABI in OFED).
On Sun, Dec 2, 2012 at 10:04 PM, Joseph Farran wrote:
> 1.5.4.1
Please download http://mellanox.com/downloads/hpc/mxm/v1.1/mxm-latest.tar,
it contains mxm.rpm for mofed 1.5.4.1
On Mon, Dec 3, 2012 at 8:18 AM, Mike Dubman wrote:
> ohh.. you have MOFED 1.5.4.1, thought it was 1.5.3-3.1.0
> will provide you a link to mxm package compiled with this
Hi Francesco,
Can you please provide complete output from ibv_devinfo -v command?
Also, it seems that you have Centos 5.8 with mxm/centos5.7 installed, will
check if there is a distro version incompatibilities which may cause it and
update you.
Alina/Josh - please follow.
Regards
M
On Thu, Jan 1
Also, what MOFED/OFED version do you have?
MXM is compiled per OFED/MOFED version, is there match between active ofed
and mxm.rpm selected?
On Thu, Jan 17, 2013 at 4:09 PM, Francesco Simula <
francesco.sim...@roma1.infn.it> wrote:
> I tried building from OMPI 1.6.3 tarball with the following ./co
--mca btl_openib_ib_path_record_service_level 1 flag controls openib btl,
you need to remove --mca mtl mxm from command line.
Have you compiled OpenMPI with rhel6.4 inbox ofed driver? AFAIK, the MOFED
2.x does not have XRC and you mentioned "--enable-openib-connectx-xrc" flag
in configure.
O
Also, what ofed version (ofed_info -s) and mxm version (rpm -qi mxm) do you
use?
On Wed, Jun 12, 2013 at 3:30 AM, Ralph Castain wrote:
> Great! Would you mind showing the revised table? I'm curious as to the
> relative performance.
>
>
> On Jun 11, 2013, at 4:53 PM, eblo...@1scom.net wrote:
>
>
Hi,
I would suggest using MXM (part of mofed, can be downloaded as standalone rpm
from http://mellanox.com/products/mxm for ofed)
It uses UD (constant memory footprint) and should provide good performance.
The next MXM v2.0 will support RC and DC (reliable UD) as well.
Once mxm is installed from rp
do you use IB as a transport? max message size in IB/RDMA is limited to
2G, but OMPI 1.7 splits large buffers during RDMA into 2G chunks.
On Wed, Jul 17, 2013 at 11:51 AM, mohammad assadsolimani <
m.assadsolim...@jesus.ch> wrote:
>
> Dear all,
>
> I do my PhD in physics and use a program, whi
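To illustrate the arithmetic behind the 2G chunking mentioned in the reply above, here is a small sketch that counts how many chunks a buffer of a given size would need. It is not Open MPI code; the helper name chunk_count is hypothetical, and reading "2G" as exactly 2 GiB (2147483648 bytes) is an assumption.

```shell
#!/bin/sh
# Hypothetical helper: number of chunks needed for a buffer of SIZE bytes
# when each chunk holds at most 2 GiB ("2G" read as 2 GiB: an assumption).
chunk_count() {
    size=$1
    chunk=$((2 * 1024 * 1024 * 1024))
    # Ceiling division: round up so a partial last chunk is counted.
    printf '%s\n' $(( (size + chunk - 1) / chunk ))
}

chunk_count 5368709120    # a 5 GiB buffer
```

The same ceiling division applies to any fixed chunk size; only the value of chunk changes.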
Hi,
What OFED vendor and version do you use?
Regards
M
On Tue, Jul 30, 2013 at 8:42 PM, Paul Kapinos wrote:
> Dear Open MPI experts,
>
> A user at our cluster has a problem running a kind of big job:
> (- the job using 3024 processes (12 per node, 252 nodes) runs fine)
> - the job using 4032 p
maybe to add some nice/funny slogan on the front under the logo, and cool
picture on the back.
some of community members are still in early twenties (and counting) .
:)
shall we open a contest for good slogan to put? and mid-size pict to put on
the back side?
- living the parallel world
- iO
ice.
>
> :-)
>
> Damien
>
>
> On 23/10/2013 4:26 PM, Shamis, Pavel wrote:
>
>> +1 for Chuck Norris
>>
>> Pavel (Pasha) Shamis
>> ---
>> Computer Science Research Group
>> Computer Science and Math Division
>> Oak Ridge National Labora
While I enjoy the
> enthusiasm, I actually suspect we would get into trouble using Chuck
> Norris' name without first obtaining his permission.
>
> On Oct 25, 2013, at 2:28 AM, Mike Dubman wrote:
>
> ok, so - here is a final proposal:
>
> front:
> small OMPI logo,
Hi,
Can it be that libibmad/libibumad installed on your system belongs to
previous mofed installation?
Thanks
M.
On Jan 31, 2014 2:02 AM, "Brock Palen" wrote:
> I grabbed the latest FCA release from Mellanox's website. We have been
> building against FCA 2.5 for a while, but it never worked righ
Hi,
after this patch we get this in jenkins:
*07:03:15* [vegas12.mtr.labs.mlnx:01646] [[26922,0],0] ORTE_ERROR_LOG: Not implemented in file rmaps_mindist_module.c at line 391
*07:03:15* [vegas12.mtr.labs.mlnx:01646] [[26922,0],0] ORTE_ERROR_LOG: Not implemented in file base/rmaps_base_map_job.c at
Thanks for prompt help.
Could you please resend the patch as an attachment that can be applied with
the "patch" command? My mail client mangles long lines.
On Fri, Feb 14, 2014 at 7:40 AM, wrote:
>
>
> Thanks. I'm not familiar with mindist mapper. But obviously
> checking for ORTE_MAPPING_BYDIST is mi
Hello Ralph,
It seems that Option2 is preferred, because it is more intuitive for
end-user to create rankfile for mpi job, which is described by -app cmd
line.
All hosts definitions used inside -app , will be treated like a single
global hostlist combined from all hosts appearing inside "-app fil
Hello guys,
When executing following command with mtt and ompi 1.3.3:
mpirun --host
witch15,witch15,witch15,witch15,witch16,witch16,witch16,witch16,witch17,witch17,witch17,witch17,witch18,witch18,witch18,witch18,witch19,witch19,witch19,witch19
-np 20 --mca btl_openib_use_srq 1 --mca btl self
band rdma? Also from programming perspective,
> do I need to use anything else other than MPI_Send/MPI_Recv?
>
> Thanks,
> Subhra.
>
>
> On Sun, Mar 29, 2015 at 11:14 PM, Mike Dubman
> wrote:
>
>> Hi,
>> openib btl does not support this thread model.
>> You
ult?
>
> Thanks,
> Subhra.
>
>
> On Tue, Mar 31, 2015 at 9:46 AM, Mike Dubman
> wrote:
>
>> Hi,
>> mxm uses IB rdma/roce technologies. One can select UD/RC/DC transports
>> to be used in mxm.
>>
>> By selecting mxm, all MPI p2p routines will be
> --
> mpirun noticed that process rank 0 with PID 8398 on node JARVICE exited on
> signal 11 (Segmentation fault).
> ---
egmentation fault).
> --
> [JARVICE:00562] 1 more process has sent help message help-mca-base.txt /
> find-available:not-valid
> [JARVICE:00562] Set MCA parameter "orte_base_help_aggregate" to 0 to see
>
Hi,
With MXM, you can specify list of devices to use for communication:
-x MXM_IB_PORTS="mlx5_1:1,mlx4_1:1"
also select specific or all transports:
-x MXM_TLS=shm,self,ud
To change port rate one can use *ibportstate*
*http://www.hpcadvisorycouncil.com/events/2011/switzerland_workshop/pdf/Pres
==
>
> --
>
> mpirun noticed that process rank 1 with PID 450 on node JARVICE exited on
> signal 11 (Segmentation fault).
>
> --
>
> ibv_query_device() returned 38: Function not implemented
> --
> Initialization of MXM library failed.
>
> Error: Input/output error
>
>
hra.
>
> On Tue, Apr 21, 2015 at 10:43 PM, Mike Dubman
> wrote:
>
>> cool, progress!
>>
>> >>1429676565.124664] sys.c:719 MXM WARN Conflicting CPU
>> frequencies detected, using: 2601.00
>>
>> means that cpu governor on your machine is
btw, ompi master now calls ibv_fork_init() before initializing btl/mtl/oob
frameworks and all fork fears should be addressed.
On Fri, Apr 24, 2015 at 4:37 AM, Jeff Squyres (jsquyres) wrote:
> Disable the memory manager / don't use leave pinned. Then you can
> fork/exec without fear (because on
mtl ^mxm -n 1 /root/backend
> localhost : -x LD_PRELOAD=/root/libci.so -n 1 /root/app2
>
> Seems like it doesn't matter if I use mxm, not use mxm or use it with
> reliable connection (RC). How can I be sure I am indeed using mxm over
> infiniband?
>
> Thanks,
> Subhra.
iband but in different ways?
>
> Thanks,
> Subhra.
>
>
>
> On Thu, Apr 23, 2015 at 11:57 PM, Mike Dubman
> wrote:
>
>> HPCX package uses pml "yalla" by default (part of ompi master branch, not
>> in v1.8).
>> So, "-mca mtl mxm" has no effec
// in the child
> *buffer = 3;
> // ...
> }
> ----
>
>
>
> > On Apr 24, 2015, at 2:54 AM, Mike Dubman
> wrote:
> >
> > btw, ompi master now calls ibv_fork_init() before initializing
> btl/mtl/oob frameworks and all fork fears should be addressed.
> >
verbs (which was admittedly a long
> time ago), the sample I pasted would segv...
>
>
> > On Apr 24, 2015, at 9:40 AM, Mike Dubman
> wrote:
> >
> > ibv_fork_init() will set special flag for madvise()
> (IBV_DONTFORK/DOFORK) to inherit (and not cow) registered/locked
eck that I am
> indeed using #2 ?
>
> Subhra
>
> On Fri, Apr 24, 2015 at 12:55 AM, Mike Dubman
> wrote:
>
>> yes
>>
>> #1 - ob1 as pml, openib as btl (default: rc)
>> #2 - yalla as pml, mxm as IB library (default: ud, use "-x
>> MXM_TLS
Hi,
How was mxm installed? By copying?
The rpm based installation places mxm into /opt/mellanox/mxm and not into
/usr/lib64/libmxm.so.
Do you use HPCx (pack of OMPI and MXM and FCA)?
You can download HPCX, extract it anywhere and compile OMPI pointing to mxm
location under HPCX.
Also, HPCx cont
Hi Timur,
seems that yalla component was not found in your OMPI tree.
Can it be that your mpirun is not from hpcx? Can you please check that
LD_LIBRARY_PATH, PATH, LD_PRELOAD and OPAL_PREFIX point to the
right mpirun?
Also, could you please check that yalla is present in the ompi_info -l 9
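The environment check requested above could be scripted roughly as follows. This is only a sketch: the helper name show_env is hypothetical, and the script merely prints the four variables named in the message and the mpirun found first in PATH; it does not decide whether they are correct.

```shell
#!/bin/sh
# Hypothetical helper: print the variables that decide which Open MPI
# installation is picked up, marking unset or empty ones explicitly.
show_env() {
    for var in PATH LD_LIBRARY_PATH LD_PRELOAD OPAL_PREFIX; do
        eval "val=\${$var-}"
        printf '%s=%s\n' "$var" "${val:-<unset>}"
    done
}

show_env
# Show which mpirun would actually be executed, if any.
command -v mpirun || echo "mpirun not found in PATH"
```

If the printed mpirun path or OPAL_PREFIX does not lie inside the intended installation tree, that would be consistent with the component not being found.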
e_mtu: 4096 (5)
> sm_lid: 0
> port_lid: 0
> port_lmc: 0x00
>
> Best regards,
> Timur.
>
>
> Monday, 25 May 2015, 19:39 +03:00 from Mike Dubman
btw, what is the rationale for running in a chroot env? is it a docker-like env?
does "ibv_devinfo -v" work for you from the chroot env?
On Tue, May 26, 2015 at 7:08 AM, Rahul Yadav wrote:
> Yes Ralph, MXM cards are on the node. Command runs fine if I run it out of
> the chroot environment.
>
> Thanks
> R
ed from the linking commands and make
> completed fine.
>
> So, it looks like there are two solutions: move the install location of
> mxm to not be in system-space or modify configure. Which one would be the
> better one for me to pursue?
>
> Thanks,
> David
>
>
>
David,
Could you please send me your config.log file?
Looking into config/ompi_check_mxm.m4 macro I don't understand how it could
happen.
Thanks a lot.
On Tue, May 26, 2015 at 6:41 PM, Mike Dubman
wrote:
> Hello David,
> Thanks for info and patch - will fix ompi configure logic wit
/blob/master/config/ompi_check_mxm.m4#L41
>
> doesn't check to see if $ompi_check_mxm_libdir is empty.
>
>
> > On May 26, 2015, at 11:50 AM, Mike Dubman
> wrote:
> >
> > David,
> > Could you please send me your config.log file?
> >
> > Looking i
ly). Thus, ompi_check_mxm_libdir never gets assigned which
> results in just "-L" getting used on line 41. The same behavior could be
> found by using '--with-mxm=yes'.
>
> Thanks,
> David
>
>
> On 05/26/2015 11:28 AM, Mike Dubman wrote:
>
libdir will be empty.
>
> Right?
>
>
> > On May 26, 2015, at 1:28 PM, Mike Dubman
> wrote:
> >
> > Thanks Jeff!
> >
> > but in this line:
> >
> >
> https://github.com/open-mpi/ompi/blob/master/config/ompi_check_mxm.m4#L36
> >
> >
e is empty, and
> you just end up appending "-L" instead of "-L/something". So why not just
> check to ensure that the variable is not empty?
>
>
>
> > On May 26, 2015, at 3:27 PM, Mike Dubman
> wrote:
> >
> > in that case, O
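The guard proposed in the quoted message (only append "-L" when the directory variable is non-empty) can be sketched in plain shell as below. This is not the actual m4 macro from config/ompi_check_mxm.m4; the helper name append_libdir and the example flag values are assumptions, and only the variable name ompi_check_mxm_libdir comes from the thread.

```shell
#!/bin/sh
# Hypothetical helper, not the real configure logic: append "-L<dir>" to a
# flags string only when <dir> is non-empty, so a bare "-L" is never emitted.
append_libdir() {
    flags=$1
    libdir=$2
    if test -n "$libdir"; then
        flags="$flags -L$libdir"
    fi
    printf '%s\n' "$flags"
}

ompi_check_mxm_libdir=""    # the empty case discussed in the thread
append_libdir "-L/usr/lib64" "$ompi_check_mxm_libdir"
append_libdir "-L/usr/lib64" "/opt/mellanox/mxm/lib"
```

With the empty value the flags come back unchanged; with a real directory the corresponding "-L/something" is appended.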
?
> "-x LD_PRELOAD=$HPCX_MXM_DIR/debug/lib/libmxm.so -x MXM_LOG_LEVEL=data"
>
> Also, could you please attach the entire output of
> "$HPCX_MPI_DIR/bin/ompi_info -a"
>
> Thank you,
> Alina.
>
> On Tue, May 26, 2015 at 3:39 PM, Mike Dubman <https
Hi,
the message in question belongs to MXM and it is a warning (silenced in
later releases of MXM).
To select specific device in MXM, please pass:
mpirun -x MXM_IB_PORTS=mlx4_0:2 ...
M
On Wed, Jun 17, 2015 at 9:38 PM, Na Zhang wrote:
> Hi all,
>
> I am trying to launch MPI jobs (with version o
Hello Grigory,
We observed ~10% performance degradation with heap size set to unlimited
for CFD applications.
You can measure your application performance with default and unlimited
"limits" and select the best setting.
Kind Regards.
M
On Mon, Sep 28, 2015 at 7:36 PM, Grigory Shamov wrote:
>
th the value and see
> >what works for your applications. Most applications should be using
> >malloc or similar functions to allocate large memory regions in the heap
> >and not on the stack.
> >
> >-Nathan
> >
> >On Mon, Sep 28, 2015 at 08:01:09PM +0300
what is your command line and setup? (ofed version, distro)
This is what was just measured w/ fdr on haswell with v1.8.8 and mxm and UD
+ mpirun -np 2 -bind-to core -display-map -mca rmaps_base_mapping_policy
dist:span -x MXM_RDMA_PORTS=mlx5_3:1 -mca rmaps_dist_device mlx5_3:1 -x
MXM_TLS=self,sh
we did not get to the bottom of "why".
Tried different mpi packages (mvapich, intel mpi) and the observation holds
true.
It could be many factors affected by the huge heap size (cpu cache misses?
swappiness?).
On Wed, Sep 30, 2015 at 1:12 PM, Dave Love wrote:
> Mike Dubman writes
mxm comes with mxm_dump_config utility which provides and explains all
tunables.
Please check HPCX/README file for details.
On Wed, Sep 30, 2015 at 1:21 PM, Dave Love wrote:
> Mike Dubman writes:
>
> > unfortunately, there is no one size fits all here.
> >
> > mxm provi
ance implications. Please set the heap size to the default value
> (10240)
>
> Should say stack not heap.
>
> -Nathan
>
> On Wed, Sep 30, 2015 at 06:52:46PM +0300, Mike Dubman wrote:
> >mxm comes with mxm_dump_config utility which provides and explains all
> >
well. (there
is a reason that any MPI have hundreds of knobs)
On Thu, Oct 1, 2015 at 1:50 PM, Dave Love wrote:
> Mike Dubman writes:
>
> > we did not get to the bottom for "why".
> > Tried different mpi packages (mvapich,intel mpi) and the observation hold
> >
these flags available in master and v1.10 branches and make sure that ranks
to core allocation is done starting from cpu socket closer to the HCA.
Of course you can have same effect with taskset.
On Mon, Oct 5, 2015 at 8:46 PM, Dave Love wrote:
> Mike Dubman writes:
>
> > what is
Hi David,
what linux distro do you use? (and mofed version)?
Do you have an /etc/ld.so.conf.d/mxm.conf file?
Can you please try prefixing configure with it:
LD_LIBRARY_PATH=/opt/mellanox/mxm/lib ./configure ?
Thanks
On Wed, Oct 21, 2015 at 6:40 PM, David Shrader wrote:
> I should probably point out that libhcoll.so does
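For a longer shell session, the LD_LIBRARY_PATH setting suggested above can also be exported once before running configure. The sketch below uses a hypothetical helper, prepend_ld_path, and takes care not to leave an empty element in the list, since the dynamic loader treats an empty entry as the current directory.

```shell
#!/bin/sh
# Hypothetical helper: put DIR at the front of LD_LIBRARY_PATH without
# creating an empty path element when the variable was unset or empty.
prepend_ld_path() {
    dir=$1
    if [ -n "${LD_LIBRARY_PATH-}" ]; then
        LD_LIBRARY_PATH="$dir:$LD_LIBRARY_PATH"
    else
        LD_LIBRARY_PATH="$dir"
    fi
    export LD_LIBRARY_PATH
}

prepend_ld_path /opt/mellanox/mxm/lib
printf 'LD_LIBRARY_PATH=%s\n' "$LD_LIBRARY_PATH"
```

After this, a plain ./configure in the same shell sees the same library path as the one-line form in the message.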
re configure got it to work, which I didn't
> expect. Thanks for the tip! I didn't realize that loading in a shared
> library of a library that is being linked in on the active compile line
> fell under the runtime portion of linking, and could be affected by using
> LD_LIB
Hi,
hcoll is part of MOFED or comes from HPCx.
what version of hcoll do you have on your system?
Thx
On Wed, Dec 23, 2015 at 4:58 AM, Ben Menadue wrote:
> Hi,
>
> It's probably in plain sight somewhere and I missed it, but is there a
> minimum version of hcoll needed to build 1.10.1?
>
> We hav
>
>
> ../../../../../../../../ompi/mca/coll/hcoll/coll_hcoll_module.c:263:
> warning: implicit declaration of function
> 'hcoll_check_mem_release_cb_needed'
>
>
>
> Cheers,
>
> Ben
>
>
>
> *From:* users [mailto:users-boun...@open-mpi.org] *On Behalf Of *
Hi,
it seems that your ompi was compiled with ofed ver X but running on ofed
ver Y.
X and Y are incompatible.
On Mon, Feb 22, 2016 at 8:18 PM, Mark Potter wrote:
> I am usually able to find the answer to my problems by searching the
> archive but I've run up against one that I can't suss out.
>
using 2 HCAs on the same PCI-Exp bus (as well as 2 ports from the same HCA)
will not improve performance, PCI-Exp is the bottleneck.
On Mon, Oct 20, 2008 at 2:28 AM, Mostyn Lewis wrote:
> Well, here's what I see with the IMB PingPong test using two ConnectX DDR
> cards
> in each of 2 machines.