Upgrading to SoGE 8.1.9 seems to have done the trick.

To summarize, this is what I did to get cgroups working with SoGE 8.1.9
on CentOS 7.x.

On each exec host:

systemctl enable cgconfig
systemctl enable cgred

Around line 423 of the sgeexecd startup script, add a call to the
setup-cgroups-etc script just before sge_execd is started:

      exec 1>/dev/null 2>&1
      $SGE_ROOT/util/resources/scripts/setup-cgroups-etc start
      $bin_dir/sge_execd

* Note that I initially tried to run the setup-cgroups-etc script after
starting sge_execd, but that didn't seem to work consistently.
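
The ordering above can also be written as a small wrapper that runs the
cgroup setup first and only then starts the daemon. This is a minimal
sketch, not part of the stock sgeexecd script: the function name
start_execd_with_cgroups is hypothetical, and refusing to start sge_execd
when the setup step fails is my own assumption, not something the stock
script does.

```shell
# Hypothetical helper, not part of the stock sgeexecd script.
# $1: path to setup-cgroups-etc, $2: path to sge_execd
start_execd_with_cgroups() {
    setup_script=$1
    execd_bin=$2

    if [ ! -x "$setup_script" ]; then
        echo "setup script not executable: $setup_script" >&2
        return 1
    fi

    # Prepare the cgroup hierarchy before the daemon starts.
    # Assumption: a failed setup should block the daemon start.
    "$setup_script" start || return 1

    "$execd_bin"
}

# Possible use inside the startup script:
#   start_execd_with_cgroups \
#       "$SGE_ROOT/util/resources/scripts/setup-cgroups-etc" \
#       "$bin_dir/sge_execd"
```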


On the qmaster, run "qconf -mconf" and set:

execd_params                 ENABLE_BINDING=true USE_CGROUPS
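
To confirm the setting took effect, the global configuration can be
checked from a script. A minimal sketch: has_use_cgroups is a hypothetical
helper name, and it assumes the execd_params entry fits on one line of the
"qconf -sconf" output (long entries may be wrapped, which this does not
handle). It accepts the bare USE_CGROUPS form used above as well as
USE_CGROUPS=true and USE_CGROUPS=1.

```shell
# Hypothetical helper: reads "qconf -sconf" style output on stdin and
# succeeds only if execd_params enables USE_CGROUPS.
has_use_cgroups() {
    awk '
        $1 == "execd_params" {
            gsub(/,/, " ")      # tolerate comma-separated parameters
            for (i = 2; i <= NF; i++)
                if ($i == "USE_CGROUPS" || $i == "USE_CGROUPS=true" || $i == "USE_CGROUPS=1")
                    found = 1
        }
        END { exit !found }
    '
}

# Possible use on the qmaster:
#   qconf -sconf | has_use_cgroups && echo "cgroups enabled"
```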


Change 2 lines in $SGE_ROOT/util/resources/scripts/setup-cgroups-etc:

< adminuser=$(awk '/^admin_user/ {print $2}' $SGE_ROOT/$SGE_CELL/common/bootstrap)
---
> adminuser=sgeadmin


<             /bin/cp $cpuset_mnt/mems $cpuset_mnt/cpus $sge_cpuset
---
>             /bin/cp $cpuset_mnt/cpuset.mems $cpuset_mnt/cpuset.cpus $sge_cpuset
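
The second change is needed because the cpuset control files here carry a
"cpuset." prefix (cpuset.cpus, cpuset.mems), presumably because the
hierarchy is mounted without the noprefix option. Instead of hard-coding
one naming scheme, the script could probe for both. A minimal sketch;
cpuset_file is a hypothetical helper and not part of setup-cgroups-etc:

```shell
# Hypothetical helper: print the path of a cpuset control file in
# directory $1, preferring the "cpuset."-prefixed name and falling
# back to the legacy unprefixed one.
cpuset_file() {
    dir=$1
    name=$2
    if [ -e "$dir/cpuset.$name" ]; then
        printf '%s\n' "$dir/cpuset.$name"
    elif [ -e "$dir/$name" ]; then
        printf '%s\n' "$dir/$name"
    else
        return 1
    fi
}

# Possible use in place of the patched line:
#   /bin/cp "$(cpuset_file "$cpuset_mnt" mems)" \
#           "$(cpuset_file "$cpuset_mnt" cpus)" "$sge_cpuset"
```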


Reboot qmaster and all exec nodes.

So far, tests with single-threaded, SMP, and Open MPI v1.10 jobs seem to
work properly.

As a side note, we also have MUNGE security enabled on the grid.


As a simple test, I submit this script:

echo "--------------"
echo "binding is" $SGE_BINDING
echo "--------------"
cat /proc/self/cgroup
echo "--------------"
cpucon=$(grep cpuset /proc/self/cgroup | cut -f3 -d":")
cat "/dev/cpuset/$cpucon/cpuset.cpus"
echo "--------------"


which returns:

qsub:

--------------
binding is 0
--------------
10:cpuset:/sge/230143.1/12159
9:net_cls:/
8:perf_event:/
7:blkio:/
6:devices:/
5:memory:/memlimit256
4:cpuacct,cpu:/
3:hugetlb:/
2:freezer:/
1:name=systemd:/system.slice/sgeexecd.hpc1.service
--------------
0
--------------


qsub -pe smp 8:

--------------
binding is 8 9 10 11 12 13 14 15
--------------
10:memory:/memlimit256
9:freezer:/
8:hugetlb:/
7:perf_event:/
6:cpuacct,cpu:/
5:devices:/
4:net_cls:/
3:blkio:/
2:cpuset:/sge/230148.1/13222
1:name=systemd:/system.slice/sgeexecd.hpc1.service
--------------
8-15
--------------
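
In both runs the cpuset contents agree with $SGE_BINDING, although one is
a range ("8-15") and the other a space-separated list. The comparison can
be automated inside the job script. A minimal sketch: both function names
are hypothetical, and the /dev/cpuset mount point in the usage comment is
taken from the test script above. The cgroup lookup matches on the
controller field rather than grepping the whole line, since the hierarchy
number differs between runs (10 vs. 2 in the output above).

```shell
# Hypothetical helper: print the cgroup path of the cpuset controller
# from /proc/<pid>/cgroup style input on stdin.
cpuset_cgroup_path() {
    awk -F: '{
        n = split($2, ctrl, ",")
        for (i = 1; i <= n; i++)
            if (ctrl[i] == "cpuset") { print $3; exit }
    }'
}

# Hypothetical helper: expand a kernel CPU list such as "0-2,5"
# into a space-separated list such as "0 1 2 5".
expand_cpulist() {
    printf '%s\n' "$1" | awk -F, '{
        out = ""
        for (i = 1; i <= NF; i++) {
            n = split($i, r, "-")
            lo = r[1]
            hi = (n == 2) ? r[2] : r[1]
            for (c = lo + 0; c <= hi + 0; c++)
                out = out (out == "" ? "" : " ") c
        }
        print out
    }'
}

# Possible use inside a job:
#   path=$(cpuset_cgroup_path < /proc/self/cgroup)
#   cpus=$(cat "/dev/cpuset$path/cpuset.cpus")
#   [ "$(expand_cpulist "$cpus")" = "$SGE_BINDING" ] && echo "binding matches"
```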


Thanks for the help, William!

-Dj

