Please fix the hcoll test (and code) to be correct.
Any configure test that adds /usr/lib and/or /usr/include to any compile
flags is broken.
And if hcoll include files are under $HCOLL_HOME/include/hcoll (and
hcoll/api), then the include directives in the source should be
#include <hcoll/xxx.h>
and
#include <hcoll/api/xxx.h>

I do not know the context, so I should not jump to any conclusion ...
If xxx.h is in $HCOLL_HOME/include/hcoll in hcoll version Y, but in
$HCOLL_HOME/include/hcoll/api in hcoll version Z, then the relative path
to $HCOLL_HOME/include cannot be hard coded.
Anyway, let's assume it is ok to hard code the path.

On 08/11/2015 10:22 AM, Gilles Gouaillardet wrote:
> I do not know the context, so I should not jump to any conclusion ...
> If xxx.h is in $HCOLL_HOME/include/hcoll in hcoll version Y, but in
> $HCOLL_HOME/include/hcoll/api in hcoll version Z, then the relative path
> to $HCOLL_HOME/include cannot be hard coded.

On Aug 11, 2015, at 1:39 AM, Åke Sandgren wrote:
>
> Please fix the hcoll test (and code) to be correct.
>
> Any configure test that adds /usr/lib and/or /usr/include to any compile
> flags is broken.
+1
Gilles filed https://github.com/open-mpi/ompi/pull/796; I just added some
comments to it.

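To make Åke's point concrete, here is a sketch of the shape such a
configure fragment should take (the variable names are illustrative, not
the actual coll/hcoll configure.m4 contents):

    # Only add -I/-L flags when hcoll lives in a non-default prefix;
    # never inject /usr/include or /usr/lib into the compile flags.
    if test -n "$with_hcoll" && test "$with_hcoll" != "/usr"; then
        coll_hcoll_CPPFLAGS="-I$with_hcoll/include"
        coll_hcoll_LDFLAGS="-L$with_hcoll/lib"
    fi
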
Hi!
In my current application, MPI_Send/MPI_Recv hangs when using buffers in
GPU device memory of an NVIDIA GPU. I realized this is due to the fact
that Open MPI uses the synchronous cuMemcpy rather than the asynchronous
cuMemcpyAsync (see stack trace at the bottom). However, in my
application,

"Lane, William" writes:
> I'm running a mixed cluster of Blades (HS21 and HS22 chassis), x3550-M3 and
> X3550-M4 systems, some of which support hyperthreading, while others
> don't (specifically the HS21 blades) all on CentOS 6.3 w/SGE.
Do you mean jobs are split across nodes which have hyperthreading and
nodes which don't?

"Lane, William" writes:
> I read @
>
> https://www.open-mpi.org/faq/?category=sge
>
> that for OpenMPI Parallel Environments there's
> a special consideration for Son of Grid Engine:
>
>'"qsort_args" is necessary with the Son of Grid Engine distribution,
>version 8.1.1 and later, and prob
Ralph Castain writes:
> Hi Bill
>
> You need numactl-devel on the nodes. Not having them means we cannot ensure
> memory is bound local to the procs, which will hurt performance but not
> much else. There is an MCA param to turn off the warnings if you choose not
> to install the libs: hwloc_base

Because only the devel package includes the necessary pieces to set memory
affinity.

On Tue, Aug 11, 2015 at 9:37 AM, Dave Love wrote:
> Ralph Castain writes:
>
> > Hi Bill
> >
> > You need numactl-devel on the nodes. Not having them means we cannot
> > ensure memory is bound local to the pr
I think Dave's point is that numactl-devel (and numactl) is only needed for
*building* Open MPI. Users only need numactl to *run* Open MPI.
Specifically, numactl-devel contains the .h files we need to compile OMPI
against libnumactl:
$ rpm -ql numactl-devel
/usr/include/numa.h
/usr/include/num
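
On a CentOS 6 system the runtime library should come from the base
numactl package, so something like the following is all a user's node
needs (the exact file list may vary by distribution):

    $ rpm -ql numactl | grep libnuma
    /usr/lib64/libnuma.so.1
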
Dear Users,
I have run into a problem with openmpi-1.8.7. It configures and
installs properly but when I tested it using examples it gave me numerous
errors with mpicc as shown in the output below. Have I made an error in the
process?
Amoss-MacBook-Pro:openmpi-1.8.7 amosleff$ cd examples

I talked with Jeremia off list and we figured out what was going on. There is
the ability to use cuMemcpyAsync/cuStreamSynchronize rather than
cuMemcpy, but it was never made the default for the Open MPI 1.8 series. So,
to get that behavior you need the following:
--mca mpi_common_cuda_cum
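
For readers unfamiliar with the two driver-API calls being contrasted,
here is a minimal sketch of the difference (illustrative only, not Open
MPI's internal code; error handling omitted):

    #include <cuda.h>

    static void copy_on_stream(CUdeviceptr dst, CUdeviceptr src,
                               size_t bytes, CUstream stream)
    {
        /* cuMemcpy(dst, src, bytes) is synchronous with respect to the
         * whole context; cuMemcpyAsync only enqueues the copy on one
         * stream ... */
        cuMemcpyAsync(dst, src, bytes, stream);
        /* ... and cuStreamSynchronize then waits on just that stream. */
        cuStreamSynchronize(stream);
    }
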
I have cloned Gilles' topic/hcoll_config branch and, after running
autogen.pl, have found that './configure --with-hcoll' does indeed work
now. I used Gilles' branch as I wasn't sure how best to get the pull
request changes into my own clone of master. It looks like the proper
checks are happening.
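
For what it's worth, pull request changes can also be tested in a plain
clone of master by fetching GitHub's pull/<n>/head ref; a sketch using
the PR number mentioned earlier in the thread:

    $ git fetch https://github.com/open-mpi/ompi.git pull/796/head:hcoll_config
    $ git checkout hcoll_config
    $ ./autogen.pl && ./configure --with-hcoll
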
I can successfully run my OpenMPI 1.8.7 jobs outside of Son-of-Gridengine but
not via qrsh. We're
using CentOS 6.3 and a heterogeneous cluster of hyperthreaded and
non-hyperthreaded blades
and x3550 chassis. OpenMPI 1.8.7 has been built w/the debug switch as well.
Here are my latest errors:
qrsh -