On Mon, Mar 20, 2017 at 08:39:38PM +0000, juanesteban.jime...@mdc-berlin.de
wrote:
> Hi folks,
>
> I just ran into my first episode of the scheduler crashing because of
> too many submitted jobs. It pegged memory usage at as much as I could
> give it (12 GB at one point) and still crashed while trying to work
> its way through the stack.
How many is "too many"? We routinely have 50,000+ jobs, and there's
nary a blip in RAM usage on the qmaster. I'm not even sure the
sge_qmaster process uses a gigabyte of RAM...
Just checked: with 3,000+ jobs in the queue, it has 550 MB RSS and
2.3 GB of virtual memory in total (including a large mmap of
/usr/lib/locale/locale-archive).
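
If you want to spot-check this on your own qmaster, here's a rough
sketch of what I mean (my own quick hack, assuming a Linux box with
/proc; adjust the process name if yours differs). It just reads
VmRSS/VmSize out of /proc/<pid>/status for any sge_qmaster processes:

  #!/usr/bin/env python3
  """Rough check of sge_qmaster memory use via /proc (Linux only)."""
  import os

  def find_pids(name):
      """Return PIDs whose comm matches the given process name."""
      pids = []
      for entry in os.listdir("/proc"):
          if not entry.isdigit():
              continue
          try:
              with open(f"/proc/{entry}/comm") as f:
                  if f.read().strip() == name:
                      pids.append(int(entry))
          except OSError:
              continue  # process exited between listdir and open
      return pids

  def memory_kb(pid):
      """Return (VmRSS, VmSize) in kB from /proc/<pid>/status."""
      rss = size = None
      with open(f"/proc/{pid}/status") as f:
          for line in f:
              if line.startswith("VmRSS:"):
                  rss = int(line.split()[1])
              elif line.startswith("VmSize:"):
                  size = int(line.split()[1])
      return rss, size

  if __name__ == "__main__":
      for pid in find_pids("sge_qmaster"):
          rss, size = memory_kb(pid)
          print(f"pid {pid}: RSS {rss} kB, VSZ {size} kB")

(Plain old "ps -o pid,rss,vsz -C sge_qmaster" gives you the same
numbers, of course.)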
> I need to figure out how to size a box properly for a dedicated
> sge_qmaster. How do you folks recommend I do this?
12 GB should be plenty, IME. What version are you running, and what
spooling method are you using?
--
Jesse Becker (Contractor)
_______________________________________________
SGE-discuss mailing list
SGE-discuss@liv.ac.uk
https://arc.liv.ac.uk/mailman/listinfo/sge-discuss