Alright, everything is identical to Cielito, but it looks like you are getting
bad data from ALPS.
I think we changed some of the ALPS parsing for 1.7.3. Can you give that
version a try and let me know if it resolves your issue? If not, I can add
better debugging to the ras/alps module.
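Something like this would confirm which build mpirun actually resolves to before you re-run (the comments describe what I would expect to see; exact install paths will differ on your system):
$ which mpirun      # should point at the 1.7.3 install, not the old 1.7.2 one
$ mpirun --version  # should report "mpirun (Open MPI) 1.7.3"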
-Nathan
On
Here is what we can see:
knteran@mzlogin01e:~> ls -l /opt/cray/xe-sysroot
total 8
drwxr-xr-x 6 bin bin 4096 2012-02-04 11:05 4.0.36.securitypatch.20111221
drwxr-xr-x 6 bin bin 4096 2013-01-11 15:17 4.1.40
lrwxrwxrwx 1 root root 6 2013-01-11 15:19 default -> 4.1.40
Thanks,
Keita
On 11/2
Hi,
Here is the output of "printenv | grep PBS". It seems that all variables
are set as I expected.
[mishima@manage mpi_demo]$ qsub -I -l nodes=1:ppn=32
qsub: waiting for job 8120.manage.cluster to start
qsub: job 8120.manage.cluster ready
[mishima@node03 ~]$ printenv | grep PBS
PBS_VERSION=TO
??? Alps reports that the two nodes each have one slot. What PE release
are you using? A quick way to find out is ls -l /opt/cray/xe-sysroot on the
external login node (this directory does not exist on the internal login nodes).
-Nathan
On Tue, Nov 26, 2013 at 11:07:36PM +, Teranishi, Keita wrote:
Nathan,
Here it is.
Keita
On 11/26/13 3:02 PM, "Nathan Hjelm" wrote:
>Ok, that sheds a little more light on the situation. For some reason it
>sees 2 nodes apparently with one slot each. One more set of outputs would
>be helpful. Please run with -mca ras_base_verbose 100. That way I ca
Ok, that sheds a little more light on the situation. For some reason it sees 2
nodes apparently with one slot each. One more set of outputs would be helpful.
Please run with -mca ras_base_verbose 100. That way I can see what was read from alps.
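A minimal sketch of the invocation I have in mind (./hello_world and the process count are just placeholders for your test program and job; tee simply keeps a copy of the output you can attach):
$ mpirun -np 4 -mca ras_base_verbose 100 ./hello_world 2>&1 | tee ras.log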
-Nathan
On Tue, Nov 26, 2013 at 10:14:11PM +
Nathan,
I am hoping these files would help you.
Thanks,
Keita
On 11/26/13 1:41 PM, "Nathan Hjelm" wrote:
>Well, no hints as to the error there. Looks identical to the output on my
>XE-6. How about setting -mca rmaps_base_verbose 100? See what is going on
>with the mapper.
>
>-Nathan Hjelm
Well, no hints as to the error there. Looks identical to the output on my XE-6.
How about setting -mca rmaps_base_verbose 100? See what is going on with the
mapper.
-Nathan Hjelm
Application Readiness, HPC-5, LANL
On Tue, Nov 26, 2013 at 09:33:20PM +, Teranishi, Keita wrote:
> Nathan,
>
>
Nathan,
Please see the attached output, obtained from two cases (-np 2 and -np 4).
Thanks,
--
Keita Teranishi
Principal Member of Technical Staff
Scalable Modeling and Analysis Systems
Sandia National Laboratories
Livermore, CA 9
Seems like something is going wrong with processor binding. Can you run with
-mca plm_base_verbose 100? That might shed some light on why it thinks there
are not enough slots.
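If it is more convenient, the same parameter can be exported through the environment instead of being passed on the command line. A minimal sketch, assuming a bash shell and a placeholder executable:
$ export OMPI_MCA_plm_base_verbose=100   # equivalent to -mca plm_base_verbose 100
$ mpirun -np 4 ./a.out 2>&1 | tee plm.log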
-Nathan Hjelm
Application Readiness, HPC-5, LANL
On Tue, Nov 26, 2013 at 09:18:14PM +, Teranishi, Keita wrote:
> Nathan,
Nathan,
Now I removed the strip_prefix stuff, which was applied to the other versions
of Open MPI. I still have the same problem when running under msub:
knteran@mzlogin01:~> msub -lnodes=2:ppn=16 -I
qsub: waiting for job 7754058.sdb to start
qsub: job 7754058.sdb ready
knteran@mzlogin01:~> cd test-ope
Weird. That is the same configuration we have deployed on Cielito and Cielo.
Does it work under an msub allocation?
BTW, with that configuration you should not set
plm_base_strip_prefix_from_node_names to 0. That will confuse orte since the
node hostname will not match what was supplied by alps.
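A sketch of where that setting usually hides, assuming it was placed in the standard per-user MCA parameter file (adjust if you set it on the mpirun command line or in a system-wide file instead):
$ grep strip_prefix $HOME/.openmpi/mca-params.conf
plm_base_strip_prefix_from_node_names = 0
Deleting or commenting out that line (and dropping any matching -mca option from the mpirun command) lets orte keep the hostnames exactly as alps reports them.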
Nathan,
(Please forget about the segfault; it was my mistake.)
I used Open MPI 1.7.2 (built with gcc-4.7.2) to run the program. I used
contrib/platform/lanl/cray_xe6/optimized_lustre and
--enable-mpirun-prefix-by-default for configuration. As I said, it works
fine with aprun, but fails with mpirun.
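For completeness, the configure invocation that setup corresponds to should look roughly like this (the install prefix is just a placeholder; only the platform file and the prefix-by-default flag come from my actual build):
$ ./configure --prefix=$HOME/opt/openmpi-1.7.2 \
      --with-platform=contrib/platform/lanl/cray_xe6/optimized_lustre \
      --enable-mpirun-prefix-by-default
$ make -j 8 && make install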
Here are the results of those two commands:
$ mpic++ -show
g++ -I/Users/meredithk/tools/openmpi/include
-L/Users/meredithk/tools/openmpi/lib -lmpi_cxx -lmpi -lm
$ otool -L /Users/meredithk/tools/openmpi/lib/libmpi_cxx.dylib
/Users/meredithk/tools/openmpi/lib/libmpi_cxx.dylib:
/Users/me
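If it helps, a quick cross-check that a compiled binary actually picks up that same library (hello.cpp stands in for any trivial MPI C++ program):
$ mpic++ hello.cpp -o hello
$ otool -L ./hello | grep mpi   # paths should match the libmpi_cxx.dylib / libmpi.dylib locations above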
Hello,
Just like r29736, I believe that there are some missing tests in
ompi/mca/coll/libnbc/nbc_iscatterv.c and ompi/mca/coll/libnbc/nbc_igatherv.c.
Thoughts?
Pierre
Index: nbc_igatherv.c
===
--- nbc_igatherv.c (revision 29756)
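For anyone who wants to mirror the change, a quick way to see exactly which checks r29736 added before applying the same guards to nbc_iscatterv.c and nbc_igatherv.c (this assumes an svn checkout of the Open MPI trunk):
$ svn log -c 29736 .                        # commit message for r29736
$ svn diff -c 29736 ompi/mca/coll/libnbc    # the checks that revision introduced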
Hi,
I used interactive mode just because it was easy to report the behavior.
I'm sure that submitting a job gives the same result.
Therefore, I think the environment variables are also set in the session.
Anyway, I'm away from the cluster now. Regarding "$ env | grep PBS",
I'll send it later.
Reg
Hi,
On 26.11.2013 at 01:22, tmish...@jcity.maeda.co.jp wrote:
> Thank you very much for your quick response.
>
> I'm afraid to say that I found one more issue...
>
> It's not so serious. Please check it when you have some time.
>
> The problem is cpus-per-proc with -map-by option under T