since it’s just a pre-built bundle from Mellanox, it’s not something I can test
easily.
Otherwise, the solution we use is to just set LD_PRELOAD=libmpi.so when launching
Python so that it gets loaded into the global namespace, as would happen with
a “normal” compiled program.
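For reference, roughly the same effect can be had from inside Python by dlopen'ing libmpi with RTLD_GLOBAL before importing the MPI bindings. This is only a minimal sketch, assuming mpi4py is the binding in use and that a "libmpi.so" soname is resolvable by the loader:

# Sketch: load libmpi into the global symbol namespace before any
# MPI-based extension module is imported, instead of setting LD_PRELOAD.
import ctypes

# RTLD_GLOBAL makes libmpi's symbols visible to components dlopen'ed later,
# mimicking what happens in a normally linked executable.
ctypes.CDLL("libmpi.so", mode=ctypes.RTLD_GLOBAL)  # soname/path is an assumption

from mpi4py import MPI  # assumes mpi4py; any MPI-using extension would follow here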
Cheers,
Ben
> O
k 4 on the first socket of the
first node even though there are no free cores left there (because of the PE=4),
instead of moving to the next node. But we’d still need to use the --rank-by
option in this case, anyway.
Cheers,
Ben
Hi,
A couple of our users have reported issues using UCX in OpenMPI 3.1.2. It’s
failing with this message:
[r1071:27563:0:27563] rc_verbs_iface.c:63 FATAL: send completion with error:
local protection error
The actual MPI calls provoking this are different between the two applications
— one
0x014f46e7 onetep() /short/z00/aab900/onetep/src/onetep.F90:277
23 0x0041465e main() ???:0
24 0x0001ed1d __libc_start_main() ???:0
25 0x00414569 _start() ???:0
===
> On 12 Jul 2018, at 1:36 pm, Ben Menadue wrote:
>
> Hi,
>
> Perha
Hi,
Perhaps related — we’re seeing this one with 3.1.1. I’ll see if I can get the
application run against our --enable-debug build.
Cheers,
Ben
[raijin7:1943 :0:1943] Caught signal 11 (Segmentation fault: address not mapped
to object at address 0x45)
/short/z00/bjm900/build/openmpi-mofed4.2
closeCluster, and it’s here that it
hung.
Ralph suggested trying master, but I haven’t had a chance to try this yet. I’ll
try it today and see if it works for me now.
Cheers,
Ben
> On 5 Jun 2018, at 6:28 am, r...@open-mpi.org wrote:
>
> Yes, that does sound like a bug - the #connects must
output from the first, you just didn’t copy enough decimal places :-).
Cheers,
Ben
> On 23 May 2018, at 8:38 am, Konstantinos Konstantinidis
> wrote:
>
> Thanks Jeff.
>
> I ran your code and saw your point. Based on that, it seems that my
> comparison by just pr
take to debug it, but I have builds of the MPI libraries with --enable-debug available if needed.
Cheers,
Ben
Rmpi_test.r
Description: Binary data
for me to follow up with them.
Cheers,
Ben
> On 6 Apr 2018, at 2:48 am, Nathan Hjelm wrote:
>
>
> Honestly, this is a configuration issue with the openib btl. There is no
> reason to keep eager RDMA, nor is there a reason to pipeline RDMA. I
> haven't found
# Size      Bandwidth (MB/s)
2097152     11397.85
4194304     11389.64
This makes me think something odd is going on in the RDMA pipeline.
Cheers,
Ben
> On 5 Apr 2018, at 5:03 pm, Ben Menadue wrote:
>
> Hi,
>
> We’ve just been running some OSU benchmarks with OpenMPI 3.0.0 and
has noticed anything similar and, if this is
unexpected, whether anyone has a suggestion on how to investigate further?
Thanks,
Ben
Here are the numbers:
3.0.0, osu_bw, default settings
> mpirun -map-by ppr:1:node -np 2 -H r6,r7 ./osu_bw
# OSU MPI Bandwidth Test v5.4.0
# Size Bandwidth (M
a, it works fine, but it would still be good to get it working using
the “standard” communication path, without needing the accelerators.
I was wondering if anyone seen this before, and if anyone had any suggestions
for how to proceed?
Thanks,
Ben
Hi,
Sorry to reply to an old thread, but we’re seeing this message with 2.1.0 built
against CUDA 8.0. We're using libcuda.so.375.39. Has anyone had any luck
suppressing these messages?
Thanks,
Ben
> On 27 Mar 2017, at 7:13 pm, Roland Fehrenbacher wrote:
>
nice+yield at 1.5%, and the timing for spin was identical to
when it was on its own.
The only problem I can see is that the timing for nice+yield increased
dramatically — to 187,600 ns per iteration! Is this too long for a yield?
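For what it's worth, a quick way to get a ballpark cost for a bare yield on a given node is to time it in a tight loop. A minimal standalone sketch (Python rather than the original benchmark code, Linux-only since it uses os.sched_yield):

# Sketch: rough per-call cost of sched_yield on the current core.
# Results depend heavily on the kernel and on how many runnable
# threads are competing for the same core.
import os
import time

N = 100_000
start = time.perf_counter_ns()
for _ in range(N):
    os.sched_yield()
elapsed = time.perf_counter_ns() - start
print(f"~{elapsed / N:.0f} ns per sched_yield call")

On an otherwise idle core this usually measures on the order of a microsecond or less, so a figure near 190,000 ns in the oversubscribed case is probably the yielding process genuinely handing its timeslice to the competing spin process rather than the yield call itself being slow.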
Cheers,
Ben
Hi,
> On 26 Mar 2017, at 1:13 am, Jeff Squyres (jsquyres)
> wrote:
> Here's an old post on this list where I cited a paper from the Intel
> Technology Journal.
Thanks for that link! I need to go through it in detail, but this paragraph did
jump out at me:
On a processor with Hyper-Threading
> the way how you oversubscribe.
>
> +1
+2
As always, experiment to find the best for your hardware and jobs.
Cheers,
Ben
– which for stack-allocated variables could easily be a
valid handle from a previous invocation of that function…
Ben
From: users [mailto:users-boun...@lists.open-mpi.org] On Behalf Of Jeff Hammond
Sent: Monday, 22 August 2016 1:21 PM
To: Open MPI Users
Subject: Re: [OMPI users] mpi_f08 Question
* variables in Fortran; even
integers and reals have undefined values until they’re first stored to. This can
be quite annoying, as specifying an initial value when declaring them also
gives them the SAVE attribute…
Cheers,
Ben
From: users [mailto:users-boun...@lists.open-mpi.org] On Behalf Of Matt
Hi Gilles,
Ah, of course - I forgot about that.
Thanks,
Ben
-Original Message-
From: users [mailto:users-boun...@lists.open-mpi.org] On Behalf Of Gilles
Gouaillardet
Sent: Tuesday, 16 August 2016 4:07 PM
To: Open MPI Users
Subject: Re: [OMPI users] Mapping by hwthreads without fully
uding KNL (where SMT is important to get good performance in most
cases).
Thanks,
Ben
.c:13
But since this is in mpirun itself, I'm not sure how to delve deeper - is
there an MCA *_base_verbose parameter (or equivalent) that works on
mpirun?
Cheers,
Ben
Hi Gilles,
Wow, thanks - that was quick. I'm rebuilding now.
Cheers,
Ben
-Original Message-
From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Gilles
Gouaillardet
Sent: Friday, 29 January 2016 1:54 PM
To: Open MPI Users
Subject: Re: [OMPI users] Any changes to rma
ardet
Sent: Friday, 29 January 2016 1:33 PM
To: Open MPI Users
Subject: Re: [OMPI users] Any changes to rmaps in 1.10.2?
I was able to reproduce the issue on one node with a cpuset manually set.
fwiw, i cannot reproduce the issue using taskset instead of cpuset (!)
Cheers,
Gilles
On 1/29/2016 11:
echo 0-31 > cpuset.cpus
13:03 bjm900@r60 ~ > cat /cgroup/cpuset/pbspro/4363542.r-man2/cpuset.cpus
0-31
13:04 bjm900@r60 ~ > /apps/openmpi/1.10.2/bin/mpirun hostname
<...hostnames...>
Cheers,
Ben
-Original Message-
From: users [mailto:users-boun...@open-mpi.org] On Be
physical
core (i.e. ask for hyperthreading on job submission), then it runs fine
under 1.10.2.
Cheers,
Ben
-Original Message-
From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Gilles
Gouaillardet
Sent: Friday, 29 January 2016 11:07 AM
To: Open MPI Users
Subject: Re: [OMPI users] An
get rid
of the policy options as above, I get the original error.
However, if I do it outside of a PBS job (so no cgroup), it works as I would
expect. So have there been any changes in the handling of cpusets?
Cheers,
Ben
From: users [mailto:users-boun...@open-mpi.org] On Behalf Of
d the cpuset definitely has all
cores available:
$ cat /cgroup/cpuset/pbspro/4347646.r-man2/cpuset.cpus
0-15
Is there something here I'm missing?
Cheers,
Ben
it warned about an implicit declaration of this symbol
during the builds:
../../../../../../../../ompi/mca/coll/hcoll/coll_hcoll_module.c:263: warning:
implicit declaration of function 'hcoll_check_mem_release_cb_needed'
Cheers,
Ben
From: users [mailto:users-boun...@ope
has no member named 'coll_igatherv'
It looks like it was PR567 in the 1.10 branch that added the new references.
Cheers,
Ben
case a new
approach might be needed in PETSc. Otherwise, maybe a per-attribute lock is
needed in OpenMPI - but I'm not sure whether the get in the callback is on
the same attribute as is being deleted.
Thanks,
Ben
#0 0x7fd7d5de4264 in __lll_lock_wait () from /lib64/lib
I know why it quit - M3EXIT was called - but thanks for looking.
On Wed, May 21, 2014 at 4:02 PM, Gus Correa wrote:
> Hi Ben
>
> One of the ranks (52) called MPI_Abort.
> This may be a bug in the code, or a problem with the setup
> (e.g. a missing or incorrect input file).
>
Unknown Unknown
CCTM_V5g_Linux2_x 007FD3A0 Unknown Unknown Unknown
CCTM_V5g_Linux2_x 007BA9A2 Unknown Unknown Unknown
CCTM_V5g_Linux2_x 00759288 Unknown Unknown Unknown
...
On Wed, May 21, 2014 at 2:08 PM, Gus Correa wro
21, 2014 at 1:34 PM, Douglas L Reeder wrote:
> Ben,
>
> The netcdf/4.1.3 module may be loading the openmpi/1.4.4 module. Can you do
> module show on the netcdf module file to see if there is a module load
> openmpi command.
>
> Doug Reeder
>
> On May 21, 2014, at 12:23 PM,
Lua modules and is
> actively developed and debugged. There are literally new features every
> month or so. If it does not do what you want, odds are that the developer
> will add it shortly (I've had it happen).
> >>
> >> Maxime
> >>
> >> Le 2014-05-16
:
1) intel/2013.1.039 2) python3/3.2.1 3) pgi/11.7
4) openmpi/1.4.4-intel 5) netcdf/4.1.3
[bl10@login2 ~]$
On Fri, May 16, 2014 at 5:46 PM, Gus Correa wrote:
> On 05/16/2014 06:26 PM, Ben Lash wrote:
>
>> I'm not sure I have the ability to implement a differen
t works like a charm, understands both TCL and Lua modules and is actively
> developed and debugged. There are literally new features every month or
> so. If it does not do what you want, odds are that the developer will add
> it shortly (I've had it happen).
>
> Maxime
>
&
el/,
which is in the lib folder of the software I'm trying to recompile
(/home/bl10/CMAQv5.0.1/lib/x86_64/ifort). Thanks for any ideas. I also
tried changing $pkgdatadir based on what I read here:
http://www.open-mpi.org/faq/?category=mpi-apps#default-wrapper-compiler-flags
Thanks.
--Ben L
e any suggestions on how to perhaps tweak the settings to
help with memory use.
--
Ben Auer, PhD SSAI, Scientific Programmer/Analyst
NASA GSFC, Global Modeling and Assimilation Office
Code 610.1, 8800 Greenbelt Rd, Greenbelt, MD 20771
Phone: 301-286-9176 Fax: 301-614-6246
with openmpi and pgi 10.9?
thanks,
Ben
Good idea. I have tried it twice.
On Sep 4, 2009, at 9:33 AM, Gus Correa wrote:
Hi Ben
My recollection is that similar problems were reported here
when there was some residual of a previous build
(perhaps with gfortran), which was not completely
cleaned up, when the current build was
I have received two private emails saying to check my PATH and
LD_LIBRARY_PATH.
The path is OK and I am using the full pathname to make sure I get the
right exe. I also checked the LD_LIBRARY_PATH and that appears to be OK.
On Sep 4, 2009, at 7:28 AM, Ben Mayer wrote:
I am using PGI 9.0
I am using PGI 9.0-1 to compile OpenMPI 1.3.3. I use the following
command to configure OpenMPI:
./configure CC=pgcc CXX=pgCC FC=pgf90 F90=pgf90 --prefix=/shared/ben/openmpi-1.3.3
The PGI compilers are in the path. The make and make install complete
successfully. The problem that I am
I get to "make install" and then it complains about icc not being
found and libopen-rte.la needing to be relinked.
Any help would be appreciated.
Linux version
cat /proc/version
Linux version 2.6.27.7-9-pae (geeko@buildhost) (gcc version 4.3.2
[gcc-4_3-branch revision 141291] (SUSE Linux) ) #1 SM
. I will try np -1.
Ben
> Date: Sat, 14 Mar 2009 18:07:32 +0900
> From: r...@kuicr.kyoto-u.ac.jp
> To: us...@open-mpi.org
> Subject: Re: [OMPI users] Compiling ompi for use on another machine
>
>
> Hi Ben,
>
>
> ben rodriguez wrote:
> > I have compiled
host: adduct1
Configured by: root
Configured on: Tue Mar 10 17:57:14 PDT 2009
Configure host: adduct1
Built by: Ben
Built on: Tue Mar 10 18:11:01 PDT 2009
Built host: adduct1
C bindings: yes
C
work (that are not part of the cluster) to see the
PXE master and then boot to that.
I have enabled PXE in BIOS.
Thanks,
Ben
On Thu, Aug 7, 2008 at 2:03 PM, Tim Mattox wrote:
> I think a better approach than using NFS-root or LiveCDs is to use Perceus in
> this situation, since it has bee
ClusterKnoppix: OpenMOSIX (no MPI)
http://clusterknoppix.sw.be/
CHAOS: OpenMOSIX (no MPI)
Pai Pix: could not find on internet
Thanks for your help,
Ben
*all* the #defines; that's what's used to compile the
> OMPI code base. mpi.h replicates a small number of these defines that
> are used by OMPI's public interface.
I will think about this guidance and see what kind of patches and
alternative patches I can suggest.
I did not detect autoheader being used in the process of building
mpi.h; is that correct? If so, that would make some simpler workarounds easier.
Ben
rs, I can post a patch
somewhere to address this. It appears that of these, only
sizeof_int affects more than a few source files.
thanks,
Ben Allan
s almost certain to be slower.
Where things get interesting (and encouraging) is if you increase
the total data being processed (hold data quantity per node constant).
ben allan
On Thu, Jun 07, 2007 at 08:24:03PM -0400, Aaron Thompson wrote:
> Hello,
> Does anyone have experience usin
bin doesn't
end up in the regular PATH.
Ben
http://www.llnl.gov/CASC/components/docs/babel-1.1.0.tar.gz
Ben
A build-related question about 1.1.4:
Is parallel make usage (make -j 8) supported (at least if make is GNU make)?
thanks,
Ben
where a misplaced Fortran
compiler option might make that true?
Due to an automated code generator in the process (Babel),
I have to pick one of INTEGER*4 or INTEGER*8 and stick to it.
I'm guessing INTEGER*4 would be a poor choice for MPI opaque
objects when calling some MPI implementations.
Ben
I'm dealing with mixed language requirements, the babel interoperability
tool from LLNL, gcj and whatever other mpis or javas I may have to resort
to. Good to hear others more or less hit the same issues with
mpi-java prototypes that are published.
Ben
On Fri, Jul 15, 2005 at 04:04:27PM
might present a
chance to create a de facto standard in preparation for
extending the standard to include Java. Clearly not
high on everybody's list of favorite things to think about.
Ben
On Fri, Jul 15, 2005 at 11:18:31AM -0400, Jeff Squyres wrote:
> On Jul 15, 2005, at 10:55 AM, Be
lurking around
mpi datatypes...
Anyone?
thanks,
Ben
, mpich2
http://www-unix.mcs.anl.gov/mpi/mpich2/
is open for business.
Ben
On Mon, Jul 04, 2005 at 11:33:45AM +0300, Koray Berk wrote:
> Hello,
> This is Koray Berk, from Istanbul Technical University.
> We have a high performance computing lab, with diverse platforms and
> therefore i
g such examples is a wiki, but in the source
is good too.
Binary RPMs should be the responsibility of the distribution
makers (Red Hat, whoever else), not developers.
Ben
On Thu, Jun 16, 2005 at 09:01:41PM -0400, Jeff Squyres wrote:
> I have some random user questions about RPMs, though:
>
logic
that is the autotool conventions.
thanks,
ben
On Thu, Jun 16, 2005 at 08:44:48PM -0400, Jeff Squyres wrote:
>
> The default build is to make libmpi be a shared library and build all
> the components as dynamic shared objects (think "plugins").
>
> But we currentl
On Thu, Jun 16, 2005 at 06:33:51PM -0400, Jeff Squyres wrote:
> On Jun 16, 2005, at 2:58 PM, Ben Allan wrote:
>
> The only reason to have something like ompiConf.sh is to use the
> frameworks that already exist (like the gnome-conf thingy). I was only
> tossing that out as
n ompi_info command
> (analogous to, but greatly superseding LAM's laminfo command).
I'm looking forward to seeing how well the wrappers interact
with babel+libtool.
Ben
On Wed, Jun 15, 2005 at 08:27:58PM -0400, Jeff Squyres wrote:
> On Jun 15, 2005, at 7:02 PM, Ben Allan wrote:
>
> Ah -- I thought that that would be a different issue (I presume you're
> speaking of the compile/lib flags command, like gnome-config et
> al.)...? Are you say
ew compilers.
I can't count the number of times I've "debugged" some user
trying to compile C++ code with a mismatched mpic[xx,++] wrapper.
Please extract the full path names of the compilers your
wrappers are going to invoke and put them in ompi_info.
thanks,
(an incremen
mething else, could
> you explain what you mean?
Just that precisely (oops, oh, there they are indeed).
If you do decide to move toward
SCons or something and away from the autotools, though, I would encourage you
to consider keeping libtool in the mix.
thanks
ben