Ray,

I'm also on Ubuntu. I'll try the same test, but with and without swap enabled (e.g. by running the swapoff and swapon commands first). To complicate things, I also don't know whether the swappiness level makes a difference.
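To see whether the stopped process's memory is still resident or has been pushed out to swap, one thing I could look at is the VmRSS and VmSwap fields in /proc/<pid>/status rather than relying on htop. A rough sketch of how I might read them, assuming a Linux kernel that exposes both fields; the helper names status_field_kb and read_status are made up for illustration:

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: scan status-style text (the "Key:   value kB" lines
 * found in /proc/<pid>/status) and return the value for `key` in kB,
 * or -1 if no line starts with "key:". */
long status_field_kb(const char *text, const char *key) {
    size_t keylen = strlen(key);
    const char *line = text;

    while (line != NULL && *line != '\0') {
        /* Only match the key at the start of a line, followed by ':'. */
        if (strncmp(line, key, keylen) == 0 && line[keylen] == ':') {
            /* strtol skips the tabs and spaces before the number. */
            return strtol(line + keylen + 1, NULL, 10);
        }
        line = strchr(line, '\n');
        if (line != NULL) {
            line++; /* step past the newline to the next line */
        }
    }
    return -1;
}

/* Hypothetical helper: read /proc/<pid>/status into buf as a C string.
 * Returns 0 on success, -1 if the file cannot be opened. */
int read_status(long pid, char *buf, size_t buflen) {
    char path[64];
    FILE *fp;
    size_t n;

    assert(buflen > 0);
    snprintf(path, sizeof path, "/proc/%ld/status", pid);
    fp = fopen(path, "r");
    if (fp == NULL) {
        return -1;
    }
    n = fread(buf, 1, buflen - 1, fp);
    buf[n] = '\0';
    fclose(fp);
    return 0;
}
```

The idea would be to send the test program a SIGSTOP, call read_status(pid, buf, sizeof buf) with a buffer of a few kilobytes, and then compare status_field_kb(buf, "VmRSS") with status_field_kb(buf, "VmSwap"), once with swap on and once with it off. My guess is that the pages only move to swap when something else actually needs the RAM, but that is an assumption until I've tried it.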
Thanks
Ashton

On Sun, Sep 23, 2018, 7:48 AM Raymond Wan <rwan.w...@gmail.com> wrote:

> Hi Chris,
>
> On Sunday, September 23, 2018 09:34 AM, Chris Samuel wrote:
> > On Saturday, 22 September 2018 4:19:09 PM AEST Raymond Wan wrote:
> >
> >> SLURM's ability to suspend jobs must be storing the state in a
> >> location outside of this 512 GB. So, you're not helping this by
> >> allocating more swap.
> >
> > I don't believe that's the case. My understanding is that in this mode it's
> > just sending processes SIGSTOP and then launching the incoming job so you
> > should really have enough swap for the previous job to get swapped out to in
> > order to free up RAM for the incoming job.
>
> Hmmmmmm, I'm way out of my comfort zone but I am curious
> about what happens. Unfortunately, I don't think I'm able
> to read kernel code, but someone here
> (https://stackoverflow.com/questions/31946854/how-does-sigstop-work-in-linux-kernel)
> seems to suggest that SIGSTOP and SIGCONT moves a process
> between the runnable and waiting queues.
>
> I'm not sure if I did the correct test, but I wrote a C
> program that allocated a lot of memory:
>
> -----
> #include <stdlib.h>
>
> #define memsize 160000000
>
> int main () {
>    char *foo = NULL;
>
>    foo = (char *) malloc (sizeof (char) * memsize);
>
>    for (int i = 0; i < memsize; i++) {
>      foo[i] = 0;
>    }
>
>    do {
>    } while (1);
> }
> -----
>
> Then, I ran it and sent a SIGSTOP to it. According to htop
> (I don't know if it's correct), it seems to still be
> occupying memory, but just not any CPU cycles.
>
> Perhaps I've done something wrong? I did read elsewhere
> that how SIGSTOP is treated can vary from system to
> system... I happen to be on an Ubuntu system.
>
> Ray