> On 15.08.2016 at 17:03, Ulrich Hiller <hil...@mpia-hd.mpg.de> wrote:
>
> Hello,
>
> thank you for the clarification. I must have misunderstood you.
> Now I did it. The master node in the example I sent now was exec-node01
> (it varied from attempt to attempt). The output is in the master-node
> file. The qstat file is the output of
>
>     qstat -g t -u '*'
>
> That seems to look normal.
>
> Now I created a simple C file with an endless loop:
>
>     #include <stdio.h>
>
>     int main(void)
>     {
>         int x;
>         /* x = 10 in the condition is an assignment, not a comparison,
>            so the loop is intentionally endless */
>         for (x = 0; x = 10; x = x + 1)
>         {
>             puts("Hello");
>         }
>         return 0;
>     }
>
> and compiled it:
>
>     mpicc mpihello.c -o mpihello
>
> and started qsub:
>
>     qsub -pe orte 300 -j yes -cwd -S /bin/bash <<< "mpiexec -n 300 mpihello"
>
> The outputs look the same as for the sleep command above.
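One remark on the test program: it never calls MPI_Init, so `mpiexec -n 300
mpihello` launches 300 independent serial copies rather than one coupled MPI
job. For counting processes per node that is fine, but for reference an
MPI-aware variant could look roughly like the sketch below (untested; it
assumes Open MPI's mpicc, and the volatile counter is only there so the
loop survives optimization):

    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        int rank, size;
        volatile unsigned long n = 0;

        MPI_Init(&argc, &argv);               /* register with the MPI runtime */
        MPI_Comm_rank(MPI_COMM_WORLD, &rank); /* rank of this process */
        MPI_Comm_size(MPI_COMM_WORLD, &size); /* total number of ranks */

        printf("Hello from rank %d of %d\n", rank, size);
        fflush(stdout);

        for (;;)       /* spin forever so the process stays visible in ps */
            n++;

        /* never reached */
        MPI_Finalize();
        return 0;
    }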
> But now I counted the jobs:
>
>     qstat -g t -u '*' | grep -ic slave
>
> This results in the number '300', which I expected.
>
> On the execute nodes I did:
>
>     ps -ef | grep mpihello | grep -v grep | grep -vc mpiexec

f w/o "-": `ps -e f` will list a nice tree of the processes.

> (I counted the 'mpihello' processes.) This is the result:
>
>     exec-node01: 43
>     exec-node02: 82
>     exec-node03: 83
>     exec-node04: 82
>     exec-node05: 82
>     exec-node06: 80
>     exec-node07: 64
>     exec-node08: 64

To investigate this it would be good to post the complete slot allocation
by `qstat -g t -u <your user>`, the master of the MPI application and one
of the slave nodes' `ps -e f --cols=500`. Any "mpihello" in the path?

-- Reuti

> Which gives the sum of 580.
> When I count the number of free slots together (from 'qhost -q') I also
> get 300, which I expect.
> Where do the extra processes on the nodes come from?
>
> This difference is reproducible.
>
> The libgomp.so.1.0.0 library is installed, but apart from that nothing
> with OpenMP is installed.
>
> With kind regards, ulrich
>
>
> On 08/15/2016 02:30 PM, Ulrich Hiller wrote:
>> Hello,
>>
>>> The other issue seems to be that in fact your job is using only one
>>> machine, which means that it is essentially ignoring any granted slot
>>> allocation. While the job is running, can you please execute on the
>>> master node of the parallel job:
>>>
>>> $ ps -e f
>>>
>>> (f w/o -) and post the relevant lines belonging to either sge_execd
>>> or just running as kids of the init process, in case they jumped out
>>> of the process tree. Maybe a good start would be to execute something
>>> like `mpiexec sleep 300` in the jobscript.
>>
>> I invoked
>>
>>     qsub -pe orte 160 -j yes -cwd -S /bin/bash <<< "mpiexec -n 160 sleep 300"
>>
>> The only line ('ps -e f') on the master node was:
>>
>>     55722 ?  Sl  3:42 /opt/sge/bin/lx-amd64/sge_qmaster
>>
>> No other SGE lines, no child processes from it, and no other init
>> processes leading to SGE, while at the same time the sleep processes
>> were running on the nodes (checked with the ps command on the nodes).
>>
>> The qstat command gave:
>>
>>     264 0.60500 STDIN ulrich r 08/15/2016 11:33:02 all.q@exec-node01 MASTER
>>                                                    all.q@exec-node01 SLAVE
>>                                                    all.q@exec-node01 SLAVE
>>                                                    all.q@exec-node01 SLAVE
>>     [...]
>>     264 0.60500 STDIN ulrich r 08/15/2016 11:33:02 all.q@exec-node03 SLAVE
>>                                                    all.q@exec-node03 SLAVE
>>                                                    all.q@exec-node03 SLAVE
>>     [...]
>>     264 0.60500 STDIN ulrich r 08/15/2016 11:33:02 all.q@exec-node05 SLAVE
>>                                                    all.q@exec-node05 SLAVE
>>     [...]
>>
>> Because there was only the master daemon running on the master node,
>> and you were talking about child processes: was this normal behaviour
>> of my cluster, or is there something wrong?
>>
>> Kind regards, ulrich
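Whether the granted allocation is honoured can also be checked from inside
the job itself: let every rank report the host it runs on and compare the
per-host counts with the `qstat -g t` output. A minimal sketch, again
assuming Open MPI (MPI_Get_processor_name returns the name of the
executing node):

    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        int rank, size, len;
        char host[MPI_MAX_PROCESSOR_NAME];

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);
        MPI_Get_processor_name(host, &len);  /* hostname of this node */

        printf("rank %d of %d on %s\n", rank, size, host);

        MPI_Finalize();
        return 0;
    }

Piping its output through `sort | uniq -c` gives the per-host counts
directly.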
>>
>> On 08/12/2016 07:11 PM, Reuti wrote:
>>> Hi,
>>>
>>>> On 12.08.2016 at 18:48, Ulrich Hiller <hil...@mpia-hd.mpg.de> wrote:
>>>>
>>>> Hello,
>>>>
>>>> I have a strange effect where I am not sure whether it is "only" a
>>>> misconfiguration or a bug.
>>>>
>>>> First: I run Son of Grid Engine 8.1.9-1.el6.x86_64 (I installed the
>>>> RHEL RPM on an openSUSE 13.1 machine. This should not matter in this
>>>> case, and it is reported to run on openSUSE).
>>>>
>>>> mpirun and mpiexec are from openmpi-1.10.3 (no other MPI was
>>>> installed, neither on the master nor on the slaves). The
>>>> installation was made with:
>>>>
>>>>     ./configure --prefix=`pwd`/build --disable-dlopen --disable-mca-dso \
>>>>         --with-orte --with-sge --with-x --enable-mpi-thread-multiple \
>>>>         --enable-orterun-prefix-by-default --enable-mpirun-prefix-by-default \
>>>>         --enable-orte-static-ports --enable-mpi-cxx --enable-mpi-cxx-seek \
>>>>         --enable-oshmem --enable-java --enable-mpi-java
>>>>     make
>>>>     make install
>>>>
>>>> I attached the outputs of 'qconf -ap all.q', 'qconf -sconf' and
>>>> 'qconf -sp orte' as text files.
>>>>
>>>> Now my problem:
>>>> I asked for 20 cores, and if I run qstat -u '*' it shows that this
>>>> job is being run on slave07 using 20 cores, but that is not true! If
>>>> I run qstat -f -u '*' I see that this job is only using 3 cores on
>>>> slave07, and there are 17 cores on other nodes allocated to this job
>>>> which are in fact unused!
>>>
>>> qstat will list only the master node of the parallel job and the
>>> number of overall slots. The granted allocation you can check with:
>>>
>>> $ qstat -g t -u '*'
>>>
>>> The other issue seems to be that in fact your job is using only one
>>> machine, which means that it is essentially ignoring any granted slot
>>> allocation. While the job is running, can you please execute on the
>>> master node of the parallel job:
>>>
>>> $ ps -e f
>>>
>>> (f w/o -) and post the relevant lines belonging to either sge_execd
>>> or just running as kids of the init process, in case they jumped out
>>> of the process tree. Maybe a good start would be to execute something
>>> like `mpiexec sleep 300` in the jobscript.
>>>
>>> Next step could be a `mpihello.c` where you put an almost endless
>>> loop inside and switch off all optimizations during compilation, to
>>> check whether these slave processes are distributed in the correct
>>> way.
>>>
>>> Note that some applications will check the number of cores they are
>>> running on and start, via OpenMP (not Open MPI), as many threads as
>>> cores are found. Could this be the case for your application too?
>>>
>>> -- Reuti
>>>
>>>
>>>> Or another example:
>>>> My job took, say, 6 CPUs on slave07 and 14 on slave06, but nothing
>>>> was running on 06, and therefore a waste of resources on 06 and an
>>>> overload on 07 become highly possible (the numbers are made up).
>>>> If I ran many independent 1-CPU jobs that would not be an issue,
>>>> but imagine I now request 60 CPUs on slave07; that would seriously
>>>> overload the node in many cases.
>>>>
>>>> Or another example:
>>>> If I ask for, say, 50 CPUs, the job will start on one node, e.g.
>>>> slave01, but reserve only, say, 15 CPUs out of 64 there and reserve
>>>> the rest on many other nodes (obviously wasting space doing
>>>> nothing). This has the bad consequence of allocating many more CPUs
>>>> than available when many jobs are running; imagine you have 10 jobs
>>>> like this one... some nodes will run maybe 3 of them even if they
>>>> only have 24 CPUs...
>>>>
>>>> I hope that I have made clear what the issue is.
>>>>
>>>> I also see that `qstat` and `qstat -f` are in disagreement. The
>>>> latter is correct; I checked the processes running on the nodes.
>>>>
>>>> Has somebody already encountered such a problem? Does somebody have
>>>> an idea where to look or what to test?
>>>>
>>>> With kind regards, ulrich
>>>>
>>>> <qhost.txt><qconf-sconf.txt><qconf-mp-orte.txt><qconf-all.q>
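Regarding the OpenMP remark above: the effect is easy to demonstrate,
because an OpenMP program started without OMP_NUM_THREADS set will
typically create one thread per core it sees on the node, regardless of
how many slots were granted there. A minimal sketch, assuming GCC with
-fopenmp (which links the libgomp mentioned earlier):

    #include <omp.h>
    #include <stdio.h>

    int main(void)
    {
        /* Without OMP_NUM_THREADS set, most runtimes default to one
           thread per visible core, regardless of the slots SGE granted. */
        #pragma omp parallel
        {
            #pragma omp single
            printf("started %d threads\n", omp_get_num_threads());
        }
        return 0;
    }

Note that a plain `ps -ef` shows only the process itself; `ps -eLf` would
list the individual threads.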
>
> <qstat.txt><master-node.txt>

_______________________________________________
users mailing list
users@gridengine.org
https://gridengine.org/mailman/listinfo/users