# qconf -sc
#name               shortcut   type      relop requestable consumable default  urgency
#--------------------------------------------------------------------------------------
arch                a          STRING    ==    YES         NO         NONE     0
calendar            c          STRING    ==    YES         NO         NONE     0
cpu                 cpu        DOUBLE    >=    YES         NO         0        0
display_win_gui     dwg        BOOL      ==    YES         NO         0        0
h_core              h_core     MEMORY    <=    YES         NO         0        0
h_cpu               h_cpu      TIME      <=    YES         NO         0:0:0    0
h_data              h_data     MEMORY    <=    YES         NO         0        0
h_fsize             h_fsize    MEMORY    <=    YES         NO         0        0
h_rss               h_rss      MEMORY    <=    YES         NO         0        0
h_rt                h_rt       TIME      <=    YES         NO         0:0:0    0
h_stack             h_stack    MEMORY    <=    YES         NO         0        0
h_vmem              h_vmem     MEMORY    <=    YES         NO         0        0
hostname            h          HOST      ==    YES         NO         NONE     0
load_avg            la         DOUBLE    >=    NO          NO         0        0
load_long           ll         DOUBLE    >=    NO          NO         0        0
load_medium         lm         DOUBLE    >=    NO          NO         0        0
load_short          ls         DOUBLE    >=    NO          NO         0        0
m_core              core       INT       <=    YES         NO         0        0
m_socket            socket     INT       <=    YES         NO         0        0
m_thread            thread     INT       <=    YES         NO         0        0
m_topology          topo       STRING    ==    YES         NO         NONE     0
m_topology_inuse    utopo      STRING    ==    YES         NO         NONE     0
mem_free            mf         MEMORY    <=    YES         NO         0        0
mem_total           mt         MEMORY    <=    YES         NO         0        0
mem_used            mu         MEMORY    >=    YES         NO         0        0
min_cpu_interval    mci        TIME      <=    NO          NO         0:0:0    0
np_load_avg         nla        DOUBLE    >=    NO          NO         0        0
np_load_long        nll        DOUBLE    >=    NO          NO         0        0
np_load_medium      nlm        DOUBLE    >=    NO          NO         0        0
np_load_short       nls        DOUBLE    >=    NO          NO         0        0
num_proc            p          INT       ==    YES         NO         0        0
qname               q          STRING    ==    YES         NO         NONE     0
rerun               re         BOOL      ==    NO          NO         0        0
s_core              s_core     MEMORY    <=    YES         NO         0        0
s_cpu               s_cpu      TIME      <=    YES         NO         0:0:0    0
s_data              s_data     MEMORY    <=    YES         NO         0        0
s_fsize             s_fsize    MEMORY    <=    YES         NO         0        0
s_rss               s_rss      MEMORY    <=    YES         NO         0        0
s_rt                s_rt       TIME      <=    YES         NO         0:0:0    0
s_stack             s_stack    MEMORY    <=    YES         NO         0        0
s_vmem              s_vmem     MEMORY    <=    YES         NO         0        0
seq_no              seq        INT       ==    NO          NO         0        0
slots               s          INT       <=    YES         YES        1        1000
swap_free           sf         MEMORY    <=    YES         NO         0        0
swap_rate           sr         MEMORY    >=    YES         NO         0        0
swap_rsvd           srsv       MEMORY    >=    YES         NO         0        0
swap_total          st         MEMORY    <=    YES         NO         0        0
swap_used           su         MEMORY    >=    YES         NO         0        0
tmpdir              tmp        STRING    ==    NO          NO         NONE     0
virtual_free        mem        MEMORY    <=    YES         YES        2        0
virtual_total       vt         MEMORY    <=    YES         NO         0        0
virtual_used        vu         MEMORY    >=    YES         NO         0        0
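
For reference, since virtual_free is defined as a consumable, the intended submission would request it alongside the PE, something like the sketch below (hypothetical values; note that for PE jobs Grid Engine multiplies a consumable request by the slot count, so 7 slots at 2G each needs 14G available on the host):

```
# hypothetical example: request 7 slots plus 2G of virtual_free per slot
qsub -V -b y -cwd -now n -pe cores 7 -l virtual_free=2G -q all.q@ibm038 xclock
```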


Best Regards
John Tai
Design Services
Semiconductor Manufacturing International (Shanghai) Corp.
Tel: 21-3861-0000 ext. 16116
E-Fax: 21-5080-4000 ext. 02906


-----Original Message-----
From: Reuti [mailto:re...@staff.uni-marburg.de]
Sent: Friday, December 16, 2016 4:22
To: John_Tai
Cc: Christopher Heiny; users@gridengine.org; Coleman, Marcus [JRDUS Non-J&J]
Subject: Re: [gridengine users] John's cores pe (Was: users Digest...)


Am 16.12.2016 um 03:54 schrieb John_Tai:

> I have pinpointed the problem, but I don't know how to solve it.
>
> It looks like hosts with the virtual_free complex set cannot run jobs that request a PE, even though the job did not request virtual_free. I set up virtual_free to allow jobs to request RAM; the goal is for each job to request both RAM and a number of CPU cores. Hopefully this helps in figuring out a solution. Thanks.

How does the definition of the complex look like in `qconf -sc`?

-- Reuti


> Here's an example of one host that doesn't work:
>
> # qconf -se ibm038
> hostname              ibm038
> load_scaling          NONE
> complex_values        virtual_free=16G
>
> # qsub -V -b y -cwd -now n -pe cores 7 -q all.q@ibm038 xclock
> Your job 143 ("xclock") has been submitted
> # qstat -j 143
> ==============================================================
> job_number:                 143
> exec_file:                  job_scripts/143
> submission_time:            Fri Dec 16 10:46:02 2016
> owner:                      johnt
> uid:                        162
> group:                      sa
> gid:                        4563
> sge_o_home:                 /home/johnt
> sge_o_log_name:             johnt
> sge_o_path:                 /home/sge/sge8.1.9-1.el5/bin:/home/sge/sge8.1.9-1.el5/bin/lx-amd64:/bin:/usr/bin:/usr/local/bin:/usr/X11R6/bin:/home/johnt/bin:.
> sge_o_shell:                /bin/tcsh
> sge_o_workdir:              /home/johnt/sge8
> sge_o_host:                 ibm005
> account:                    sge
> cwd:                        /home/johnt/sge8
> mail_list:                  johnt@ibm005
> notify:                     FALSE
> job_name:                   xclock
> jobshare:                   0
> hard_queue_list:            all.q@ibm038
> env_list:                   TERM=xterm,DISPLAY=dsls11:3.0,HOME= [..]
> script_file:                xclock
> parallel environment:  cores range: 7
> binding:                    NONE
> job_type:                   binary
> scheduling info:         cannot run in PE "cores" because it only offers 0 slots
>
> Here's an example of a host that does work:
>
> # qconf -se ibm037
> hostname              ibm037
> load_scaling          NONE
> complex_values        NONE
>
> # qsub -V -b y -cwd -now n -pe cores 7 -q all.q@ibm037 xclock
> Your job 144 ("xclock") has been submitted
> # qstat -j 144
> ==============================================================
> job_number:                 144
> exec_file:                  job_scripts/144
> submission_time:            Fri Dec 16 10:49:35 2016
> owner:                      johnt
> uid:                        162
> group:                      sa
> gid:                        4563
> sge_o_home:                 /home/johnt
> sge_o_log_name:             johnt
> sge_o_path:                 /home/sge/sge8.1.9-1.el5/bin:/home/sge/sge8.1.9-1.el5/bin/lx-amd64:/bin:/usr/bin:/usr/local/bin:/usr/X11R6/bin:/home/johnt/bin:.
> sge_o_shell:                /bin/tcsh
> sge_o_workdir:              /home/johnt/sge8
> sge_o_host:                 ibm005
> account:                    sge
> cwd:                        /home/johnt/sge8
> mail_list:                  johnt@ibm005
> notify:                     FALSE
> job_name:                   xclock
> jobshare:                   0
> hard_queue_list:            all.q@ibm037
> env_list:                   TERM=xterm,DISPLAY=dsls11:3.0,HOME=/home/johnt [..]
> script_file:                xclock
> parallel environment:  cores range: 7
> binding:                    NONE
> job_type:                   binary
> usage         1:            cpu=00:00:00, mem=0.00000 GB s, io=0.00000 GB, vmem=N/A, maxvmem=N/A
> binding       1:            NONE
>
>
>
> From: users-boun...@gridengine.org
> [mailto:users-boun...@gridengine.org] On Behalf Of John_Tai
> Sent: Wednesday, December 14, 2016 3:52
> To: Christopher Heiny
> Cc: users@gridengine.org; Coleman, Marcus [JRDUS Non-J&J]
> Subject: Re: [gridengine users] John's cores pe (Was: users Digest...)
>
> I'm actually using sge8.1.9-1 for all. Is there a problem with that? 
> Downloaded here:
>
> http://arc.liv.ac.uk/downloads/SGE/releases/8.1.9/
>
>
>
>
>
> From: Christopher Heiny [mailto:christopherhe...@gmail.com]
> Sent: Wednesday, December 14, 2016 3:26
> To: John_Tai
> Cc: users@gridengine.org; Coleman, Marcus [JRDUS Non-J&J]; Reuti;
> Christopher Heiny
> Subject: Re: [gridengine users] John's cores pe (Was: users Digest...)
>
>
>
> On Dec 13, 2016 7:04 PM, "John_Tai" <john_...@smics.com> wrote:
> I have 3 hosts in all.q. It seems the 2 servers running RHEL5.3 (ibm037, ibm038) do not work with PE, while the server running RHEL6.8 (ibm021) works fine. Their configurations are identical:
>
>
> Hmmmm. Might be a Grid Engine version mismatch issue. If you installed from 
> RH rpms, then I think EL5.3 is on 6.1u4 and EL6.8 is on 6.2u3 or 6.2u5.
>
>
>
>
>
> # qconf -sq all.q@ibm038
> qname                 all.q
> hostname              ibm038
> seq_no                0
> load_thresholds       np_load_avg=1.75
> suspend_thresholds    NONE
> nsuspend              1
> suspend_interval      00:05:00
> priority              0
> min_cpu_interval      00:05:00
> processors            UNDEFINED
> qtype                 BATCH INTERACTIVE
> ckpt_list             NONE
> pe_list               cores
> rerun                 FALSE
> slots                 8
> tmpdir                /tmp
> shell                 /bin/sh
> prolog                NONE
> epilog                NONE
> shell_start_mode      posix_compliant
> starter_method        NONE
> suspend_method        NONE
> resume_method         NONE
> terminate_method      NONE
> notify                00:00:60
> owner_list            NONE
> user_lists            NONE
> xuser_lists           NONE
> subordinate_list      NONE
> complex_values        NONE
> projects              NONE
> xprojects             NONE
> calendar              NONE
> initial_state         default
> s_rt                  INFINITY
> h_rt                  INFINITY
> s_cpu                 INFINITY
> h_cpu                 INFINITY
> s_fsize               INFINITY
> h_fsize               INFINITY
> s_data                INFINITY
> h_data                INFINITY
> s_stack               INFINITY
> h_stack               INFINITY
> s_core                INFINITY
> h_core                INFINITY
> s_rss                 INFINITY
> h_rss                 INFINITY
> s_vmem                INFINITY
> h_vmem                INFINITY
>
>
>
>
>
> -----Original Message-----
> From: Christopher Heiny [mailto:che...@synaptics.com]
> Sent: Wednesday, December 14, 2016 10:21
> To: John_Tai; Reuti
> Cc: Coleman, Marcus [JRDUS Non-J&J]; users@gridengine.org
> Subject: Re: John's cores pe (Was: users Digest...)
>
> On Wed, 2016-12-14 at 02:03 +0000, John_Tai wrote:
> > I switched schedd_job_info to true; these are the outputs you requested:
> >
> >
> >
> > # qstat -j 95
> > ==============================================================
> > job_number:                 95
> > exec_file:                  job_scripts/95
> > submission_time:            Tue Dec 13 08:50:34 2016
> > owner:                      johnt
> > uid:                        162
> > group:                      sa
> > gid:                        4563
> > sge_o_home:                 /home/johnt
> > sge_o_log_name:             johnt
> > sge_o_path:                 /home/sge/sge8.1.9-
> > 1.el5/bin:/home/sge/sge8.1.9-1.el5/bin/lx-
> > amd64:/bin:/usr/bin:/usr/local/bin:/usr/X11R6/bin:/home/johnt/bin:.
> > sge_o_shell:                /bin/tcsh
> > sge_o_workdir:              /home/johnt/sge8
> > sge_o_host:                 ibm005
> > account:                    sge
> > cwd:                        /home/johnt/sge8
> > mail_list:                  johnt@ibm005
> > notify:                     FALSE
> > job_name:                   xclock
> > jobshare:                   0
> > hard_queue_list:            all.q@ibm038
> > env_list:                   TERM=xterm,DISPLAY=dsls11:3.0,HOME=/home/johnt,SHELL=/bin/tcsh,USER=johnt,LOGNAME=johnt,PATH=/home/sge/sge8.1.9-1.el5/bin:/home/sge/sge8.1.9-1.el5/bin/lx-amd64:/bin:/usr/bin:/usr/local/bin:/usr/X11R6/bin:/home/johnt/bin:.,HOSTTYPE=x86_64-linux,VENDOR=unknown,OSTYPE=linux,MACHTYPE=x86_64,SHLVL=1,PWD=/home/johnt/sge8,GROUP=sa,HOST=ibm005,REMOTEHOST=dsls11,MAIL=/var/spool/mail/johnt,LS_COLORS=no=00:fi=00:di=00;36:ln=00;34:pi=40;33:so=01;31:bd=40;33:cd=40;33:or=40;31:ex=00;31:*.tar=00;33:*.tgz=00;33:*.zip=00;33:*.bz2=00;33:*.z=00;33:*.Z=00;33:*.gz=00;33:*.ev=00;41,G_BROKEN_FILENAMES=1,SSH_ASKPASS=/usr/libexec/openssh/gnome-ssh-askpass,KDE_IS_PRELINKED=1,KDEDIR=/usr,LANG=en_US.UTF-8,LESSOPEN=|/usr/bin/lesspipe.sh %s,HOSTNAME=ibm005,INPUTRC=/etc/inputrc,ASSURA_AUTO_64BIT=NONE,EDITOR=vi,TOP=-ores 60,CVSROOT=/home/edamgr/CVSTF,OPERA_PLUGIN_PATH=/usr/java/jre1.5.0_01/plugin/i386/ns7,NPX_PLUGIN_PATH=/usr/java/jre1.5.0_01/plugin/i386/ns7,MANPATH=/home/sge/sge8.1.9-1.el5/man:/usr/share/man:/usr/X11R6/man:/usr/kerberos/man,LD_LIBRARY_PATH=/usr/lib:/usr/local/lib:/usr/lib64:/usr/local/lib64,MGC_HOME=/home/eda/mentor/aoi_cal_2015.3_25.16,CALIBRE_LM_LOG_LEVEL=WARN,MGLS_LICENSE_FILE=1717@ibm004:1717@ibm005:1717@ibm041:1717@ibm042:1717@ibm043:1717@ibm033:1717@ibm044:1717@td156:1717@td158:1717@ATD222,MGC_CALGUI_RELEASE_LICENSE_TIME=0.5,MGC_RVE_RELEASE_LICENSE_TIME=0.5,SOSCAD=/cad,EDA_TOOL_SETUP_ROOT=/cad/toolSetup,EDA_TOOL_SETUP_VERSION=1.0,SGE_ROOT=/home/sge/sge8.1.9-1.el5,SGE_ARCH=lx-amd64,SGE_CELL=cell2,SGE_CLUSTER_NAME=p6444,SGE_QMASTER_PORT=6444,SGE_EXECD_PORT=6445,DRMAA_LIBRARY_PATH=/home/sge/sge8.1.9-1.el5/lib//libdrmaa.so
> > script_file:                xclock
> > parallel environment:  cores range: 1
> > binding:                    NONE
> > job_type:                   binary
> > scheduling info:            cannot run in queue "pc.q" because it is not contained in its hard queue list (-q)
> >                             cannot run in queue "sim.q" because it is not contained in its hard queue list (-q)
> >                             cannot run in queue "all.q@ibm021" because it is not contained in its hard queue list (-q)
> >                             cannot run in PE "cores" because it only offers 0 slots
>
> Hmmmm.  Just a wild idea, but I'm thinking maybe there's something wacky 
> about ibm038's particular configuration.  What does
>     qconf -sq all.q@ibm038
> say?
>
> And what happens if you use this qsub command?
>     qsub -V -b y -cwd -now n -pe cores 2 -q all.q xclock
>
>                                         Cheers,
>                                                 Chris
>
>
> ________________________________
>
> This email (including its attachments, if any) may be confidential and 
> proprietary information of SMIC, and intended only for the use of the named 
> recipient(s) above. Any unauthorized use or disclosure of this email is 
> strictly prohibited. If you are not the intended recipient(s), please notify 
> the sender immediately and delete this email from your computer.
>
> _______________________________________________
> users mailing list
> users@gridengine.org
> https://gridengine.org/mailman/listinfo/users
>
>


