I don't know about your users, but experience has, unfortunately, taught
us to assume that users' jobs are very, very badly behaved.
I choose to assume that it's incompetence on the part of programmers and
users, rather than malice, though. :-)
Lloyd Brown
Systems Admin
nMPI on the
non-IB node, without OFED installed, to no longer be able to figure out
that it shouldn't use the openib btl. That is why I'm asking for
more information about how that decision is being made. Maybe that will
clue me in as to what changed.
Thanks,
--
Lloyd Brown
Systems Administrator
Fulton Supercomputing Lab
Brigham Young University
http://marylou.byu.edu
of this implies that the difference is related to something that
happened with librdmacm, not something that changed in OpenMPI. Sorry
for the list noise.
Lloyd Brown
Systems Administrator
Fulton Supercomputing Lab
Brigham Young University
http://marylou.byu.edu
On 03/02/2015 02:42 PM, Lloyd Brown wrote
, but I'm just stumped where to go
from here. I have some core files, but I'm having trouble getting the
symbols from the backtrace in gdb. Maybe I'm doing it wrong.
TIA,
--
Lloyd Brown
Systems Administrator
Fulton Supercomputing Lab
Brigham Young University
http://marylou.byu.edu
byufsl_debugging_segfault_on_resume.tar.gz
Description: application/gzip
right direction.
Lloyd Brown
Systems Administrator
Fulton Supercomputing Lab
Brigham Young University
http://marylou.byu.edu
On 12/29/2011 02:31 PM, Josh Hursey wrote:
> Often this type of problem is due to the 'prelink' option in Linux.
> BLCR has a FAQ item that discusses this issue a
how to do so, etc.). But if you're writing the application,
you're better off handling it internally rather than externally.
Lloyd Brown
Systems Administrator
Fulton Supercomputing Lab
Brigham Young University
http://marylou.byu.edu
On 01/19/2012 08:05 AM, Josh Hursey wrote:
> Currently
've used this technique to play with ulimit sort of things in the
script before. I'm not entirely sure what variables are exposed to you
in the script, such that you could come up with a unique filename to
output to, though.
Lloyd Brown
Systems Administrator
Fulton Supercomputing Lab
Brigham Y
reporting resources utilized, etc.
Lloyd Brown
Systems Administrator
Fulton Supercomputing Lab
Brigham Young University
http://marylou.byu.edu
On 02/29/2012 02:09 PM, Denver Smith wrote:
> Hello,
>
> On my cluster running moab and torque, I cannot ssh without a password
> between comput
1.4.5/lib.
You really need that directory in LD_LIBRARY_PATH (or made available by
some other method) on all nodes, in all shells for the user. One simple
way to do this is via the startup files (e.g., .bashrc and .bash_profile
for bash, .cshrc for csh/tcsh, etc.).
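As a rough sketch of the startup-file approach for bash, something like the following could go in .bashrc. The install prefix here is a made-up placeholder, not the actual path from this thread, and OMPI_PREFIX is just an illustrative variable name, so substitute your own location.

```shell
# Hypothetical install prefix (an assumption); replace with the real location.
OMPI_PREFIX="${OMPI_PREFIX:-/opt/openmpi/example}"

# Prepend the lib directory only if it is not already on the path,
# so repeated sourcing of the startup file does not grow the variable.
case ":${LD_LIBRARY_PATH:-}:" in
  *":${OMPI_PREFIX}/lib:"*) ;;
  *) LD_LIBRARY_PATH="${OMPI_PREFIX}/lib${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}" ;;
esac
export LD_LIBRARY_PATH
```

The guard matters mostly for nested shells, where the same startup file may be read more than once.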
Lloyd Brown
Systems Administrator
Fulton Supercomputing La
o end up with at least 3 versions of v1.6 (gcc
compilers, intel compilers, pgi compilers) and possibly a few of a
previous version, so putting everything in /opt/openmpi/VERSION is a
little problematic.
Thanks,
--
Lloyd Brown
Systems Administrator
Fulton Supercomputing Lab
Brigham Young University
http://marylou.byu.edu
build -bb path/to/openmpi-1.6.spec
In this case, the "" are all exactly the same. Clearly
there's something I'm missing about the RPM build process.
Lloyd Brown
Systems Administrator
Fulton Supercomputing Lab
Brigham Young University
http://marylou.byu.edu
On 06/26/2012 12:
as to where it's installed.
Lloyd Brown
Systems Administrator
Fulton Supercomputing Lab
Brigham Young University
http://marylou.byu.edu
On 06/27/2012 11:12 AM, Jeff Squyres wrote:
> On Jun 26, 2012, at 2:40 PM, Lloyd Brown wrote:
>
>> Is there an easy way with the .spec file and the
I'm not really familiar enough to know what you mean by "em slaves", but
for general testing of bandwidth and latency, I usually use the "OSU
Micro-benchmarks" (see http://mvapich.cse.ohio-state.edu/benchmarks/).
Lloyd Brown
Systems Administrator
Fulton Supercomputing Lab
That's fine. In that case, you just compile it with your MPI
implementation and do something like this:
mpiexec -np 2 -H masterhostname,slavehostname ./osu_latency
There may be some all-to-all latency tools too. I don't really remember.
Lloyd Brown
Systems Administrator
Fulton Supe
t_lock(&_M_lock);
> ^
>
> "/opt/pgi/linux86-64/12.5/include/CC/stl/_threads.h", line 375: error:
> identifier "omp_set_lock" is undefined
> omp_set_lock(&_M_lock);
> ^
>
> "/opt/pgi/linux86-64/12.5/include/CC/stl/_threads.h"
ch compile just fine with 1.6.1.
Lloyd Brown
Systems Administrator
Fulton Supercomputing Lab
Brigham Young University
http://marylou.byu.edu
On 08/23/2012 04:43 PM, Jeff Squyres wrote:
> This was reported earlier today:
>
> https://svn.open-mpi.org/trac/ompi/ticket/3251
>
>
Thanks for getting this in so quickly.
Yes, the nightly tarball from Aug 25 (a1r27142) seems to get through the
configure and make stages at least.
Thanks,
Lloyd Brown
Systems Administrator
Fulton Supercomputing Lab
Brigham Young University
http://marylou.byu.edu
On 08/25/2012 05:18 AM, Jeff
supply the number of nodes and nodefile,
like this:
NP=$(wc -l < "$PBS_NODEFILE")
mpirun -n "$NP" -hostfile "$PBS_NODEFILE" myprogram
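Note that the line count is the number of slots, not necessarily the number of distinct hosts, assuming the usual Torque layout where the nodefile repeats a hostname once per allocated slot. A small sketch with a made-up nodefile (the hostnames and the NODEFILE and NHOSTS names are hypothetical) shows the difference:

```shell
# Made-up nodefile standing in for $PBS_NODEFILE (hostnames are hypothetical).
NODEFILE=$(mktemp)
printf '%s\n' node01 node01 node02 node02 > "$NODEFILE"

# Total slots: one line per allocated slot.
NP=$(wc -l < "$NODEFILE")
NP=$((NP))   # normalize any padding that some wc implementations add

# Distinct hosts: collapse the repeated hostnames before counting.
NHOSTS=$(sort -u "$NODEFILE" | wc -l)
NHOSTS=$((NHOSTS))

echo "slots=$NP hosts=$NHOSTS"
rm -f "$NODEFILE"
```

Which of the two numbers you pass to mpirun depends on whether you want one process per slot or one per host.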
Lloyd Brown
Systems Administrator
Fulton Supercomputing Lab
Brigham Young University
http://marylou.byu.edu
On 11/19/2012 03:28 PM, Mariana Var
and internal to your
application, choose the application-internal checkpointing.
Lloyd Brown
Systems Administrator
Fulton Supercomputing Lab
Brigham Young University
http://marylou.byu.edu
On 07/19/2013 01:34 PM, Erik Nelson wrote:
> I run mpi on an NSF computer. One of the conditions of use is
general pointers on
mpirun debugging flags to use. I can't find much in the docs yet on
run-time debugging for OpenMPI, as opposed to debugging the application.
Maybe I'm just looking in the wrong place.
Thanks,
--
Lloyd Brown
Systems Administrator
Fulton Supercomputing Lab
Brigham Young University
http://marylou.byu.edu
, does seem to work. I
admit I'm still curious to understand how to get OpenMPI to give me the
details of what's going on. But the immediate problem of getting the
numbers out of osu_bw and osu_latency seems to be solved.
Thanks everyone. I really appreciate it.
--
Lloyd Brown
S
't have to exclude anything,
and it figures out to use em1, and not lo.
Lloyd Brown
Systems Administrator
Fulton Supercomputing Lab
Brigham Young University
http://marylou.byu.edu
On 09/20/2013 10:31 AM, Jeff Squyres (jsquyres) wrote:
> On Sep 20, 2013, at 12:27 PM, Lloyd Brown wrote:
of physical processors on the hosts. Whether
this works for you depends on whether you want this type of
oversubscription to happen all the time, or on a per-job basis, etc.
Lloyd Brown
Systems Administrator
Fulton Supercomputing Lab
Brigham Young University
http://marylou.byu.edu
On 11/22/2013 11:
into a situation where users have a combination of OpenMPI
and OpenMP threads, and the threads get constrained to the same
processor where the OpenMPI process was launched. As far as we can
tell, this started with v1.8.x.
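For illustration only, one way people commonly try to avoid this is to turn off process binding for the hybrid job, so the OpenMP threads are not pinned along with their parent rank. The sketch below just assembles and prints such a command line. The rank count, thread count, and application name are placeholders, and whether disabling binding is acceptable depends on the site's scheduler and cgroup setup.

```shell
# Hypothetical values (assumptions); adjust to the job at hand.
RANKS=2                 # MPI processes
THREADS=4               # OpenMP threads per process
APP=./hybrid_app        # placeholder application name

# --bind-to none leaves each rank (and so its threads) unbound;
# -x exports OMP_NUM_THREADS to the launched processes;
# --report-bindings prints the binding that was actually applied.
CMD="mpirun -np $RANKS --bind-to none --report-bindings -x OMP_NUM_THREADS=$THREADS $APP"

# Print the command instead of running it, so it can be inspected first.
echo "$CMD"
```

Checking the --report-bindings output on a test job is a quick way to confirm whether the ranks really are unbound.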
Lloyd Brown
Systems Administrator
Fulton Supercomputing Lab
Brigham Young Un
n between processors (but within the cgroup) than
we would like, but that's still probably acceptable in this scenario.
If there's a better solution, we'd love to hear it.
Lloyd Brown
Systems Administrator
Fulton Supercomputing Lab
Brigham Young University
http://marylou.byu.edu
On