Gus Correa <g...@ldeo.columbia.edu> writes:

> On 03/27/2014 05:05 AM, Andreas Schäfer wrote:
>>> >Queue systems won't allow resources to be oversubscribed.

[Maybe that meant that resource managers can, and typically do, prevent
resources being oversubscribed.]

>> I'm fairly confident that you can configure Slurm to oversubscribe
>> nodes: just specify more cores for a node than are actually present.
>>
>
> That is true.
> If you lie to the queue system about your resources,
> it will believe you and oversubscribe.

For what it's worth, oversubscription might be node-wide or limited to
part of a node. We just had a user running some crazy Java program he
refuses to explain, submitted as a serial job but running ~150 threads.
The over-subscription was confined to the core it used, and the effect
on the 127 other cores was mostly due to the small overhead of the node
daemon reading the crazy /proc smaps file to track the memory usage.
The other cores were normally subscribed. Ob-OMPI: the other jobs may
have been OMPI ones!

> Torque has this same feature.
> I don't know about SGE.
> You may choose to set some or all nodes with more cores than they
> actually have, if that is a good choice for the codes you run.
> However, for our applications oversubscribing is bad, hence my mindset.

Right. I don't think there's any question that it's a bad idea on a
general purpose cluster running some OMPI jobs, for instance.
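
As an aside on the ~150-thread "serial" job above: a minimal sketch of
how one might spot that sort of thing, assuming a Linux node where
/proc/<pid>/status carries a "Threads:" field. The function names and
the idea of comparing against the allocated core count are mine, for
illustration only; this is not what any particular resource manager or
node daemon actually does.

```python
def thread_count_from_status(status_text):
    """Return the 'Threads:' value from the text of /proc/<pid>/status."""
    for line in status_text.splitlines():
        if line.startswith("Threads:"):
            # The line looks like "Threads:\t150"
            return int(line.split()[1])
    raise ValueError("no 'Threads:' field found")


def oversubscription_factor(threads, allocated_cores):
    """Threads per allocated core; well above 1 suggests oversubscription."""
    if allocated_cores < 1:
        raise ValueError("allocated_cores must be at least 1")
    return threads / allocated_cores


def looks_oversubscribed(status_text, allocated_cores):
    """True if the process has more threads than cores allocated to it."""
    threads = thread_count_from_status(status_text)
    return oversubscription_factor(threads, allocated_cores) > 1.0
```

On a real node you would pass in the contents of /proc/<pid>/status for
the job's process. Note that a thread count is only a hint: threads
that are mostly idle don't really oversubscribe the core.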