Joshua,
I am using a job scheduling system so ulimit -v is set by me. Nevertheless the
ulimit -l is always half the value of ulimit -v. This is a bit strange, I am
not sure whether this might be an issue (31GB and 156GB are decent values).
For completeness the output of ulimit -o from one of th
Aleksandar,
Please ensure your system administrator follows the guidelines outlined in
the link printed in the error message
http://www.open-mpi.org/faq/?category=openfabrics#ib-locked-pages
Best,
Josh
On Fri, Jun 20, 2014 at 2:56 PM, Ivanov, Aleksandar (INR) <
aleksandar.iva...@kit.edu> wrot
Hi,
Unfortunately I was not the one updating the machine; however, I can ask my
colleagues for a specific list of the modifications done. If I understand you
correctly, you are referring to the "ulimit" parameters. They are properly set;
in fact we use JMS as job scheduler, therefore the "ulimit -v" is
What was updated? If the OS, did you remember to set the memory registration
limits to max?
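
A quick way to check the limits in question is to print them from inside the job environment. This is only a sketch, assuming a bash shell on a compute node; the variable names are illustrative, and bash reports both values in kilobytes or as "unlimited".

```shell
# Report the locked-memory (-l) and virtual-memory (-v) limits of this shell.
locked=$(ulimit -l)
virt=$(ulimit -v)
echo "max locked memory (ulimit -l): $locked"
echo "virtual memory    (ulimit -v): $virt"

# The FAQ entry linked in the error message is about the locked-memory limit;
# flag the case where it is capped rather than unlimited.
if [ "$locked" != "unlimited" ]; then
  echo "locked memory is capped at $locked kB"
fi
```

Running this inside a batch job, rather than in a login shell, matters because the scheduler may apply its own limits.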
On Jun 20, 2014, at 11:25 AM, Ivanov, Aleksandar (INR)
wrote:
>
> Dear Sir or Madam,
>
> I am using the openmpi 1.6.5 library compiled with IFORT / ICC 13.1.5. Since
> a recent update of our machi
Dear Sir or Madam,
I am using the openmpi 1.6.5 library compiled with IFORT / ICC 13.1.5. Since a
recent update of our machine I have started getting MPI errors. The code crashes
after completing approx. 24% of the total job. The same code and input were
run before on the same machine and no
Put "orte_hetero_nodes=1" in your default MCA param file - users can override by
setting that param to 0
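
As a sketch of what that could look like, assuming a standard Open MPI layout: the site-wide file is normally etc/openmpi-mca-params.conf under the install prefix, and a user can override it with $HOME/.openmpi/mca-params.conf, an OMPI_MCA_* environment variable, or --mca on the mpirun command line. The prefix /opt/openmpi below is a hypothetical placeholder, not the path on any machine in this thread.

```shell
# Hypothetical install prefix; substitute the real one for your site.
PREFIX="${PREFIX:-/opt/openmpi}"
SYSTEM_PARAMS="$PREFIX/etc/openmpi-mca-params.conf"

# Line an administrator would add to the site-wide file to set the default.
SITE_DEFAULT="orte_hetero_nodes = 1"
echo "add to $SYSTEM_PARAMS: $SITE_DEFAULT"

# A user can then override the default for one session via the environment...
export OMPI_MCA_orte_hetero_nodes=0
# ...or for a single run on the command line:
#   mpirun --mca orte_hetero_nodes 0 ./a.out
```

The environment variable and the command-line flag both take precedence over the parameter files, which is what lets the site default stay overridable.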
On Jun 20, 2014, at 10:30 AM, Brock Palen wrote:
> Perfection! That appears to do it for our standard case.
>
> Now I know how to set MCA options by env var or config file. How can I make
Perfection! That appears to do it for our standard case.
Now I know how to set MCA options by env var or config file. How can I make
this the default, that then a user can override?
Brock Palen
www.umich.edu/~brockp
CAEN Advanced Computing
XSEDE Campus Champion
bro...@umich.edu
(734)936-1985
I think I begin to grok at least part of the problem. If you are assigning
different cpus on each node, then you'll need to tell us that by setting
--hetero-nodes; otherwise we won't have any way to report that back to mpirun
for its binding calculation.
Otherwise, we expect that the cpuset of t
Extra data point if I do:
[brockp@nyx5508 34241]$ mpirun --report-bindings --bind-to core hostname
--
A request was made to bind to that would result in binding more
processes than cpus on a resource:
Bind to: CORE
I was able to produce it in my test.
orted affinity set by cpuset:
[root@nyx5874 ~]# hwloc-bind --get --pid 103645
0xc002
This mask (cpus 1, 14, 15), which spans sockets, matches the cpuset set up by
the batch system.
[root@nyx5874 ~]# cat /dev/cpuset/torque/12719806.nyx.engin.umich.edu/cpus
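
To see how the mask 0xc002 maps to cpus 1, 14 and 15, the bits can be decoded by hand. The helper below is a minimal bash sketch; mask_to_cpus is a hypothetical name and not part of hwloc.

```shell
# Decode a hex affinity mask into a comma-separated list of cpu indices.
mask_to_cpus() {
  local mask=$(( $1 ))   # accepts hex such as 0xc002
  local cpu=0 out=""
  while [ "$mask" -gt 0 ]; do
    # If the lowest bit is set, this cpu index is part of the mask.
    if [ $(( mask & 1 )) -eq 1 ]; then
      out="${out:+$out,}$cpu"
    fi
    mask=$(( mask >> 1 ))
    cpu=$(( cpu + 1 ))
  done
  echo "$out"
}

mask_to_cpus 0xc002   # prints 1,14,15
```

0xc002 is binary 1100 0000 0000 0010, so bits 1, 14 and 15 are set.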
Got it,
I have the input from the user and am testing it out.
It probably has less to do with Torque and more to do with cpusets.
I'm working on reproducing it myself as well.
Brock Palen
www.umich.edu/~brockp
CAEN Advanced Computing
XSEDE Campus Champion
bro...@umich.edu
(734)936-1985
On Jun 20, 2014, at
Thanks - I'm just trying to reproduce one problem case so I can look at it.
Given that I don't have access to a Torque machine, I need to "fake" it.
On Jun 20, 2014, at 9:15 AM, Brock Palen wrote:
> In this case they are a single socket, but as you can see they could be
> either/or depending o
In this case they are a single socket, but as you can see they could be
either/or depending on the job.
Brock Palen
www.umich.edu/~brockp
CAEN Advanced Computing
XSEDE Campus Champion
bro...@umich.edu
(734)936-1985
On Jun 19, 2014, at 2:44 PM, Ralph Castain wrote:
> Sorry, I should have been