If anyone is willing to try this out, here are my patches: https://extranet.adestotech.com/soge-8.1.9-patch.tar
Let me know.

From: Daniel Povey <dpo...@gmail.com>
Sent: Saturday, August 24, 2019 6:59 PM
To: Ondrej Valousek <ondrej.valou...@adestotech.com>
Cc: sge-disc...@liverpool.ac.uk
Subject: Re: [SGE-discuss] SGE systemd integration - can I contribute my patches?

When was the last time anyone heard from Dave Love? It's not clear to me that he (or anyone) is still maintaining GridEngine. I started a GitHub repository with the intention of becoming the maintainer myself, here: https://github.com/son-of-gridengine/sge but stopped after Dave Love objected. I also realized that I didn't have the time, and didn't fully understand a lot of aspects of e.g. the build and packaging process. I would be happy to give up control of that if someone else wants to maintain it. The ideal would be a smooth and voluntary handover to a new generation; if someone can persuade Dave to do this, that would be best... perhaps have it be led by someone he trusts.

Dan

On Fri, Aug 23, 2019 at 5:40 AM Ondrej Valousek <ondrej.valou...@adestotech.com> wrote:

Hi,
I just spent a few days on this and have a working proposal.

Functionality:
1. execd_params can now be set to "USE_CGROUPS=systemd"
2. This causes the shepherd to launch jobs via "systemd-run" scope units rather than exec'ing them directly.
3. Jobs are no longer monitored via the PDC (SGE process data collector) but via the cgroup that the systemd-run above creates.
4. Processes are no longer assigned an additional GID (gid-range is ignored), as this is no longer needed.
5. As we are using cgroups, all forked tasks are automatically killed once the job finishes (i.e.
just like ENABLE_ADDGRP_KILL=true)

Advantages:
1. 100% reliable process tracking via the kernel's control groups
2. We can now use tools like systemd-cgls or systemd-cgtop to monitor a job's performance in real time
3. Unlike the classic systemd setup for the exec daemon (where it runs in the foreground), restarting the sgeexecd service does not kill all jobs (because jobs run in separate systemd scopes)
4. Should be backwards compatible with the traditional shepherd functionality (i.e. you just switch this on and see)
5. Easy to implement control group limits for jobs (e.g. MemoryLimit, CPUShares, memory reservation for a job)

Implementation details:
Only two important functions were created:
- start_command_via_systemd(), the counterpart of start_command() in shepherd
- ptf_get_usage_from_systemd(), the counterpart of ptf_get_usage_from_data_collector()
- minor fixes in other places
- systemd does not seem to support setting CPUAffinity via a control group, so the existing code for handling cpusets works the same way it did before.

TODO:
1. I wanted to introduce some new complex attributes (like "cgroup_memmax" etc.) that could be used to enforce cgroup limits, but unfortunately this does not seem to be a trivial task. If anyone knows how to do that, it would be great, as it would make my patches a lot more attractive.
2. Testing: so far everything looks good; interactive and non-interactive jobs are working, the environment is passed, etc. It just needs more testing.

Now, can I send my patches somewhere so that they can possibly be merged into the SoGE main repo?

Thanks,
Ondrej

From: Ondrej Valousek
Sent: Friday, August 9, 2019 1:40 PM
To: 'us...@gridengine.org' <us...@gridengine.org>
Subject: SGE & systemd integration

Hi all,
I am thinking of making SGE (or sge_execd) more systemd friendly.
Right now (as of 8.1.9), there is some support for cgroups via:
USE_CGROUPS=y/n
My proposal is to make it:
USE_CGROUPS=y/n/systemd
When set to systemd, we would not need to detect any cgroups (or set up the cpuset controller) manually. Instead, the shepherd daemon would run the job via the "systemd-run" binary.
https://www.freedesktop.org/software/systemd/man/systemd-run.html
systemd-run can set various cgroup controllers via its "--property" flag, achieving the same thing we now do manually. We would probably also use the "--scope" flag to make the job run synchronously.
Initially, I was thinking about implementing the same via the "starter_method" parameter, but systemd-run needs to be run as root, so it has to be hardcoded into shepherd.c, and the sge_execd daemon also needs to be running with root privileges; I am not sure whether capabilities would help here.
Does this initiative make any sense? I can try to implement it myself, but I am not familiar with SGE internals. I can try...

Ondrej
_______________________________________________
SGE-discuss mailing list
SGE-discuss@liv.ac.uk
https://arc.liv.ac.uk/mailman/listinfo/sge-discuss
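
For anyone trying to picture the proposal above, here is a rough sketch of the kind of command line the shepherd might assemble when USE_CGROUPS=systemd is set. This is not taken from the patches: the helper name build_systemd_run_argv and the "sge-job-<id>" unit naming are made up for illustration, and the real code lives in C inside shepherd. Only the systemd-run options --scope, --unit and --property are actual systemd-run flags.

```python
import shlex


def build_systemd_run_argv(job_id, command, properties=None):
    """Return an argv list that runs `command` inside a transient systemd scope.

    Hypothetical helper: the "sge-job-<id>" unit name is an assumption,
    not the naming used by the actual patches.
    """
    argv = ["systemd-run", "--scope", f"--unit=sge-job-{job_id}"]
    # Each cgroup limit becomes one --property=KEY=VALUE option.
    # Sorting keeps the generated command line deterministic.
    for key, value in sorted((properties or {}).items()):
        argv.append(f"--property={key}={value}")
    # "--" ends option parsing so the job's own arguments are left untouched.
    argv.append("--")
    argv.extend(command)
    return argv


if __name__ == "__main__":
    limits = {"MemoryLimit": "2G", "CPUShares": "512"}
    argv = build_systemd_run_argv(4711, ["/bin/sh", "job_script.sh"], limits)
    # Print a shell-quoted form for inspection; actually running it needs root.
    print(shlex.join(argv))
```

Because the job runs in its own scope unit, tools such as systemd-cgls should then list its processes under that unit, which is what makes the tracking independent of the additional-GID mechanism.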