We have found that the behavior that multiplies consumable memory resource 
requests by the number of PE slots can be confusing (and requires extra math in 
automation scripts), so we have the complex's consumable value set to “JOB” 
rather than “YES”. When this is done (at least on SoGE), the memory requested 
is NOT multiplied by the number of slots. We also use h_vmem rather than 
virtual_free.

Best,
Chris
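
The difference between the two consumable settings described above can be
sketched in a few lines of Python. This is only an illustration of the
arithmetic, not Grid Engine code; the function name `reserved_memory_gib` and
its parameters are hypothetical.

```python
def reserved_memory_gib(request_gib, slots, consumable):
    """Return the total memory debited for one job, in GiB.

    consumable="YES": the request is counted once per PE slot.
    consumable="JOB": the request is counted once per job.
    """
    if consumable == "YES":
        return request_gib * slots
    if consumable == "JOB":
        return request_gib
    raise ValueError(f"unsupported consumable setting: {consumable!r}")


# The job from this thread: "-pe cores 7" with a 10G memory request.
print(reserved_memory_gib(10, 7, "YES"))  # 70
print(reserved_memory_gib(10, 7, "JOB"))  # 10
```

With "YES", an automation script has to divide the job's total memory by the
slot count before submitting; with "JOB", it can pass the total directly.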

On 12/20/16, 5:11 AM, "users-boun...@gridengine.org on behalf of Reuti" 
<users-boun...@gridengine.org on behalf of re...@staff.uni-marburg.de> wrote:


    > On 20.12.2016 at 02:45, John_Tai <john_...@smics.com> wrote:
    >
    > I spoke too soon. I can request PE and virtual_free separately, but I
    > cannot request both:
    >
    >
    >
    > # qsub -V -b y -cwd -now n -pe cores 7 -l mem=10G -q all.q@ibm037 xclock

    Above you request "mem" (which is a snapshot of the actual usage and may 
vary over the runtime of other jobs [unless they request the total amount 
already at the beginning of the job and stay with it]).

    > Your job 180 ("xclock") has been submitted
    > # qstat
    > job-ID  prior   name       user         state submit/start at     queue                          slots ja-task-ID
    > -----------------------------------------------------------------------------------------------------------------
    >    180 0.55500 xclock     johnt        qw    12/20/2016 09:43:41                                    7
    > # qstat -j 180
    > ==============================================================
    > job_number:                 180
    > exec_file:                  job_scripts/180
    > submission_time:            Tue Dec 20 09:43:41 2016
    > owner:                      johnt
    > uid:                        162
    > group:                      sa
    > gid:                        4563
    > sge_o_home:                 /home/johnt
    > sge_o_log_name:             johnt
    > sge_o_path:                 /home/sge/sge8.1.9-1.el5/bin:/home/sge/sge8.1.9-1.el5/bin/lx-amd64:/bin:/usr/bin:/usr/local/bin:/usr/X11R6/bin:/home/johnt/bin:.
    > sge_o_shell:                /bin/tcsh
    > sge_o_workdir:              /home/johnt/sge8
    > sge_o_host:                 ibm005
    > account:                    sge
    > cwd:                        /home/johnt/sge8
    > hard resource_list:         virtual_free=10G

    10G times 7 = 70 GB

    Does the node have this amount of memory installed, and is it defined
    that way in `qconf -me ibm037`?

    -- Reuti
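
As a rough sketch of the check suggested above, the following Python snippet
computes how many slots fit on a host when the request is charged per slot.
The host capacity used here is an assumed, illustrative value and is NOT taken
from ibm037; `slots_that_fit` is a hypothetical helper, and memory is only one
possible reason for a PE to offer 0 slots.

```python
GIB = 1024**3


def slots_that_fit(host_capacity_bytes, per_slot_request_bytes):
    """Number of PE slots a host can serve when memory is charged per slot."""
    if per_slot_request_bytes <= 0:
        raise ValueError("per-slot request must be positive")
    return host_capacity_bytes // per_slot_request_bytes


needed = 7 * 10 * GIB             # 10G times 7 slots = 70 GiB
assumed_capacity = 64 * GIB       # hypothetical value, not from the thread

print(slots_that_fit(assumed_capacity, 10 * GIB))  # 6
print(assumed_capacity >= needed)                  # False
```

Under that assumption only 6 of the 7 requested slots would fit, so the job
would stay in "qw" until the request or the host definition changes.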


    > mail_list:                  johnt@ibm005
    > notify:                     FALSE
    > job_name:                   xclock
    > jobshare:                   0
    > hard_queue_list:            all.q@ibm037
    > env_list:                   TERM=xterm,DISPLAY=dsls11:3. [..]
    > script_file:                xclock
    > parallel environment:  cores range: 7
    > binding:                    NONE
    > job_type:                   binary
    > scheduling info:            cannot run in queue "sim.q" because it is not contained in its hard queue list (-q)
    >                            cannot run in queue "pc.q" because it is not contained in its hard queue list (-q)
    >                            cannot run in PE "cores" because it only offers 0 slots
    >
    >
    >
    >
    >
    > -----Original Message-----
    > From: Reuti [mailto:re...@staff.uni-marburg.de]
    > Sent: Saturday, December 17, 2016 10:16
    > To: Reuti
    > Cc: John_Tai; users@gridengine.org; Coleman, Marcus [JRDUS Non-J&J]
    > Subject: Re: [gridengine users] John's cores pe (Was: users Digest...)
    >
    >
    > On 17.12.2016 at 11:34, Reuti wrote:
    >
    >>
    >> On 17.12.2016 at 02:01, John_Tai wrote:
    >>
    >>> It is working!! Thank you to all that replied to me and helped me
    >>> figure this out.
    >>>
    >>> I meant to set the default to 2G so that was my mistake. I changed it
    >>> to:
    >>>
    >>> virtual_free        mem        MEMORY    <=    YES         YES        2G       0
    >>
    >> That's strange. For me, a plain "2" has always meant two bytes. An
    >> "h_vmem" of 2 bytes would crash the job instantly once it got scheduled,
    >> but for "virtual_free" (which is only guidance for SGE on how to
    >> distribute jobs) it shouldn't hinder the scheduling at all.
    >>
    >> `man sge_types` also lists:
    >>
    >>      If no multiplier is present, the value is  just  counted  in bytes.
    >
    > We have set "-w e" in /usr/sge/default/common/sge_request, and with that
    > I even get an "Unable to run job: error: no suitable queues." This happens
    > whether the low 2-byte value is specified in the complex definition
    > (`qconf -mc`) or on the command line as "-l virtual_free=2".
    >
    > It turns out that the minimum value that is accepted is 33.
    >
    > -- Reuti
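
The byte-counting rule quoted from `man sge_types` above explains why "2" and
"2G" differ so much. The sketch below converts a memory value to bytes,
assuming the multiplier letters as I recall them from sge_types (lowercase is
decimal, uppercase is binary); verify the table against your local man page.
`parse_memory` is a hypothetical helper and handles only plain decimal
integers.

```python
import re

# Assumed multiplier table: lowercase = powers of 1000, uppercase = powers
# of 1024. An empty suffix means the value is counted in bytes.
MULTIPLIERS = {
    "": 1,
    "k": 1000, "K": 1024,
    "m": 1000**2, "M": 1024**2,
    "g": 1000**3, "G": 1024**3,
}


def parse_memory(spec):
    """Convert a value such as '2' or '2G' to a number of bytes."""
    match = re.fullmatch(r"(\d+)([kKmMgG]?)", spec.strip())
    if match is None:
        raise ValueError(f"not a supported memory value: {spec!r}")
    value, suffix = match.groups()
    return int(value) * MULTIPLIERS[suffix]


print(parse_memory("2"))   # 2
print(parse_memory("2G"))  # 2147483648
```

A default of "2" in the complex definition is therefore two bytes, not two
gigabytes.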
    >
    >
    >>
    >>> And it's working now. Although I'm not sure why it affected the PE.
    >>>
    >>> Also, I didn't set a global one. What is the purpose of the global
    >>> one? Should I set it?
    >>
    >> No, it was just one of the places I would have checked too. The global
    >> complexes there can be used, for example, to limit the number of licenses
    >> of an application that float across the cluster (although one might
    >> prefer to put such a limit in an RQS).
    >>
    >> If you had set it up there, it would have been the "overall limit of
    >> memory which can be used in the complete cluster at the same time".
    >>
    >> -- Reuti
    >>
    >>
    >>> # qconf -se global
    >>> hostname              global
    >>> load_scaling          NONE
    >>> complex_values        NONE
    >>> load_values           NONE
    >>> processors            0
    >>> user_lists            NONE
    >>> xuser_lists           NONE
    >>> projects              NONE
    >>> xprojects             NONE
    >>> usage_scaling         NONE
    >>> report_variables      NONE
    >>>
    >>>
    >>> -----Original Message-----
    >>> From: Reuti [mailto:re...@staff.uni-marburg.de]
    >>> Sent: Friday, December 16, 2016 7:36
    >>> To: John_Tai
    >>> Cc: Christopher Heiny; users@gridengine.org; Coleman, Marcus [JRDUS Non-J&J]
    >>> Subject: Re: [gridengine users] John's cores pe (Was: users Digest...)
    >>>
    >>>
    >>>> On 16.12.2016 at 09:53, John_Tai <john_...@smics.com> wrote:
    >>>>
    >>>> virtual_free        mem        MEMORY    <=    YES         YES        2        0
    >>>
    >>> This would mean that the default consumption is 2 bytes. I had already
    >>> feared that a high value was set here. A default of 1G or so would be
    >>> more suitable.
    >>>
    >>> Is there any virtual_free complex defined on a global level? Check
    >>> with: qconf -se global
    >>>
    >>> -- Reuti
    >>> ________________________________
    >>>
    >>> This email (including its attachments, if any) may be confidential and 
proprietary information of SMIC, and intended only for the use of the named 
recipient(s) above. Any unauthorized use or disclosure of this email is 
strictly prohibited. If you are not the intended recipient(s), please notify 
the sender immediately and delete this email from your computer.
    >>>
    >>
    >>
    >> _______________________________________________
    >> users mailing list
    >> users@gridengine.org
    >> https://gridengine.org/mailman/listinfo/users
    >





This electronic message is intended for the use of the named recipient only, 
and may contain information that is confidential, privileged or protected from 
disclosure under applicable law. If you are not the intended recipient, or an 
employee or agent responsible for delivering this message to the intended 
recipient, you are hereby notified that any reading, disclosure, dissemination, 
distribution, copying or use of the contents of this message including any of 
its attachments is strictly prohibited. If you have received this message in 
error or are not the named recipient, please notify us immediately by 
contacting the sender at the electronic mail address noted above, and destroy 
all copies of this message. Please note, the recipient should check this email 
and any attachments for the presence of viruses. The organization accepts no 
liability for any damage caused by any virus transmitted by this email.

