I have been hacking at this for a couple of weeks now and I want to confirm
something.
I have 4 pieces of hardware that can be consumed.
Any box in the grid can access one of the pieces of hardware, so I
created PEs that would allow me to select them using wildcard
notation.
You made a comment, perhaps in a later thread, that adding any more
Ixia PEs would drive me crazy. I get your point now.
I created these PEs:
14_14, 1415_14, 141518_14, 14151819_14, 1418_14,
141819_14, 1419_14,
1415_15, 141518_15, 14151819_15, 15_15, 1518_15,
151819_15, 1519_15,
141518_18, 14151819_18, 1418_18, 141819_18, 1518_18,
151819_18, 18_18, 1819_18,
14151819_19, 141819_19, 1419_19, 151819_19, 1519_19,
1819_19, 19_19
And I created some resource quotas:
limit pes {14_14,1415_14,141518_14,14151819_14,1418_14,
141819_14,1419_14} to slots=1
limit pes {1415_15,141518_15,14151819_15,15_15,1518_15,
151819_15,1519_15} to slots=1
limit pes {141518_18,14151819_18,1418_18,141819_18,
1518_18,151819_18,18_18,1819_18} to slots=1
limit pes {14151819_19,141819_19,1419_19,151819_19,
1519_19,1819_19,19_19} to slots=1
How I read these was: limit ANY of the PEs in the list to just one slot,
which I thought would mean only one _14 PE out of the whole list could be
in use at a time.
I think it is really interpreted as: limit ALL of the PEs in the list to
just one slot each, meaning that I could have jobs running on 14_14
and 1415_14 simultaneously.
I then migrated the resource quotas to use "jobs=1" and created
a complex attribute jobs = 99999, but the behavior stayed the same.
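To make the two readings concrete, here is a small toy model in Python. It is
not Grid Engine code; the function name and arguments are made up for
illustration. It assumes, as I understand sge_resource_quota(5), that a list
in curly braces is an "expanded" list (the limit applies to each listed PE
separately) while the same list without braces is a "compact" list (the limit
is shared by the whole list). If that holds, dropping the braces may give the
"ANY" behavior, but that should be verified on a test cluster.

```python
from collections import Counter

def admits(rule_pes, expanded, limit, running, candidate):
    """Return True if a toy RQS rule lets one more one-slot job start in `candidate`.

    rule_pes  -- PE names listed in the rule
    expanded  -- True models `limit pes {a,b} to slots=N` (limit per listed PE),
                 False models `limit pes a,b to slots=N` (limit shared by the list)
    running   -- PE names of jobs already running, one slot each
    """
    if candidate not in rule_pes:
        return True  # the rule does not apply to this PE
    usage = Counter(pe for pe in running if pe in rule_pes)
    if expanded:
        return usage[candidate] + 1 <= limit
    return sum(usage.values()) + 1 <= limit

# The _14 group from the first rule above.
group_14 = {"14_14", "1415_14", "141518_14", "14151819_14",
            "1418_14", "141819_14", "1419_14"}
running = ["14_14"]  # hardware 14 is already busy

print(admits(group_14, True, 1, running, "1415_14"))   # braces: second job is admitted
print(admits(group_14, False, 1, running, "1415_14"))  # no braces: second job waits
```

With the braces, the second job is admitted even though hardware 14 is busy,
which matches the collision described above.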
gridengine certainly knows that if I submit 100 qsubs using -pe
14151819_*, it can only run in one of the 4 environments, and it
knows to run one job per PE. If I also go and run another set of qsubs using
-pe 1415_*, it will only attempt to run 2 jobs at a time using those
parallel environments. However, the combination of qsubs using
14151819_* and 1415_* results in collisions on the _14 and _15 PEs.
It is like I need a _14, _15, _18 and _19 consumable; however, I don't
know which parallel environment I am going to get until after the job
starts, so it's a chicken-and-egg problem. I can't add a consumable switch
to the qsub for an unknown environment.
Any thoughts would be appreciated.
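For reference, a short Python sketch of the naming scheme and of where the
two wildcard requests overlap. The helper names are made up, and Python's
fnmatch is only a stand-in for however Grid Engine matches -pe wildcards.
Note that it generates every subset, so it also produces the 141519_* names,
which are not in my list above.

```python
from fnmatch import fnmatchcase
from itertools import combinations

HARDWARE = ["14", "15", "18", "19"]

def pe_names(hardware):
    """Yield '<subset>_<member>' for every non-empty subset and each of its members."""
    for size in range(1, len(hardware) + 1):
        for subset in combinations(hardware, size):
            for member in subset:
                yield "".join(subset) + "_" + member

def hardware_reachable(pattern, names):
    """Hardware IDs (the suffix after '_') of the PEs matched by a -pe wildcard."""
    return {n.rsplit("_", 1)[1] for n in names if fnmatchcase(n, pattern)}

names = list(pe_names(HARDWARE))
wide = hardware_reachable("14151819_*", names)   # may land on any of the 4
narrow = hardware_reachable("1415_*", names)     # may land on 14 or 15

# Hardware that both requests can be scheduled onto at the same time.
print(sorted(wide & narrow))
```

The two patterns never match the same PE name, but they do reach the same
hardware suffixes, so a per-PE limit cannot see the conflict.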
-----Original Message-----
From: Reuti [mailto:[email protected]]
Sent: Thursday, February 23, 2012 3:57 AM
To: William Hay
Cc: Maes, Richard; [email protected]
Subject: Re: [gridengine users] Tricky consumables problem
Am 23.02.2012 um 10:38 schrieb William Hay:
> On 23 February 2012 09:31, Reuti <[email protected]> wrote:
>> Am 23.02.2012 um 10:01 schrieb William Hay:
>>
>>> On 23 February 2012 00:36, Maes, Richard <[email protected]> wrote:
>>>> Reuti,
>>>> For the example below where you spec which PE to instantiate.
>>>>> $ qsub -pe ixia* 1 job.sh
>>>>
>>>> Can this accept something other than wildcards? Is there a way to make
>>>> it do REGEX? Or ranges?
>>>
>>>
>>>> For a case where I have Ixia1, Ixia2, and Ixia3, let's say I want
>>>> one group of tests to use Ixia1 and Ixia2. I have a second group of
>>>> tests I want to run on Ixia3.
>>>> It would be cool to do
>>>> qsub -pe ixia[12] 1 job.sh
>>> You can transform the pe name into that from a server side JSV and it
>>> works as you would want.
>>> However, it will be rejected on job submission, so you would need some
>>> other way to pass this in if you want to allow the user to set it,
>>> possibly a context variable.
>>> This falls under the heading of undocumented (and I'm sure
>>> unsupported) feature.
>>
>> Why should this be undocumented? -ac is documented.
>>
> That the pe is interpreted as a full pattern (per sge_types) which can
> be set to ixia[12] from the server side JSV is the undocumented part.
> Sorry if I was unclear.
Argh, this was an extension beyond 6.2u5 and the pe_name can be any
object_name also on the command line IIRC. You can request ixia[12] to
get one of them, but not both (unless you transform it by the JSV). The
sge_types man page lists only object_name for it.
For you it's working in the JSV but not on the command line - which
version of SGE are you using?
-- Reuti
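As an illustration of what a bracket pattern would select, here is a minimal
Python sketch. It assumes shell-style glob semantics and uses Python's
fnmatch as a stand-in; it does not call Grid Engine, and whether a given SGE
version accepts such a pattern on the command line is exactly the open
question above. The helper name is made up.

```python
from fnmatch import fnmatchcase

pes = ["ixia1", "ixia2", "ixia3"]

def matching_pes(pattern, names):
    """Return the PE names a glob-style pattern would select."""
    return [n for n in names if fnmatchcase(n, pattern)]

print(matching_pes("ixia*", pes))     # ['ixia1', 'ixia2', 'ixia3']
print(matching_pes("ixia[12]", pes))  # ['ixia1', 'ixia2']
```

When typing such a request at a shell prompt it is prudent to quote the
pattern, e.g. qsub -pe 'ixia[12]' 1 job.sh, so the shell does not expand it
against file names first.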
> William
>
>
>>
>>
>>>> Or even qsub -pe {ixia1,ixia2} 1 job.sh,
>>> Alternatively you might just associate each PE with a separate queue
>>> and restrict the queues that the job is allowed to run in (if they're
>>> on the same hosts this might necessitate additional changes to
>>> ensure they don't get oversubscribed).
>>>>
>>>> But I don't think either of those works.
>>>> Rich
>>>>
>>>>
>>>> -----Original Message-----
>>>> From: Reuti [mailto:[email protected]]
>>>> Sent: Friday, February 17, 2012 5:34 PM
>>>> To: Maes, Richard
>>>> Cc: [email protected]
>>>> Subject: Re: [gridengine users] Tricky consumables problem
>>>>
>>>> Am 18.02.2012 um 02:24 schrieb Maes, Richard:
>>>>
>>>>> To answer your question about layout: all nodes can talk to the
>>>>> single Ixia. There is a diagram below.
>>>>>
>>>>> To confirm what you are saying here
>>>>>> limit pes {ixia1,ixia2,ixia3} to jobs=1
>>>>>> ("jobs" complex setup as a JOB consumable from an arbitrary high
>>>>
>>>> JOB instead of consumable YES
>>>>
>>>>> value in the global configuration)
>>>>>
>>>>> Are you saying I need to create jobs in the cluster configuration ->
>>>>> global host? And set it to 999999 for instance?
>>>>
>>>> Exactly, this way you can limit "jobs" as opposed to "slots" in case it's
>>>> necessary by RQS for users, pes, queues, hosts. As you used only one
>>>> slot in your former setup, I wasn't sure how you would run parallel jobs
>>>> then (besides using just a granted node completely for this one-slot job
>>>> then).
>>>>
>>>> -- Reuti
>>>>
>>>>>  ___________
>>>>> [ WA-GRID   ]
>>>>> [queuemaster]              _____________________
>>>>> [wasim01    ] <-NETWORK-> | WA-IXIA-01          |
>>>>> [wasim02    ]             | slot 1 - ports 1 -4 | <-- Eth x 4 --> DUT 1
>>>>> [wasim03    ]             | slot 2 - ports 1 -4 | <-- Eth x 4 --> DUT 2
>>>>> [wasim04    ]             | slot 3 - ports 1 -4 | <-- Eth x 4 --> DUT 3
>>>>> [wasim05    ]             |_____________________|
>>>>> [wasim06    ]
>>>>> [wasim07    ]
>>>>> [wasim08    ]
>>>>>  ___________
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> -----Original Message-----
>>>>> From: Reuti [mailto:[email protected]]
>>>>> Sent: Friday, February 17, 2012 5:05 PM
>>>>> To: Maes, Richard
>>>>> Cc: [email protected]
>>>>> Subject: Re: [gridengine users] Tricky consumables problem
>>>>>
>>>>> Hi Richard,
>>>>>
>>>>> Am 18.02.2012 um 01:28 schrieb Maes, Richard:
>>>>>
>>>>>> I am using gridengine to load tests on to an Ixia. Only one test can
>>>>>> run at a time, so I configured the queue with a resource quota
>>>>>> configuration rule that says limit users * queues lt_np_13 to slots = 1
>>>>>
>>>>> as it's for all users, the "users *" can be left out here.
>>>>>
>>>>>
>>>>>> This has worked fine for years.
>>>>>>
>>>>>> Now, however, I have many Ixia blades and I still want to use
>>>>>> gridengine to distribute jobs to the cluster nodes, but ultimately,
>>>>>> there can only be one test running per ixia blade.
>>>>>
>>>>> I don't understand the detailed layout of your cluster. There are
>>>>> dedicated nodes connected to each ixia blade?
>>>>>
>>>>> limit hosts @ixia1 to slots=1
>>>>> limit hosts @ixia2 to slots=1
>>>>> limit hosts @ixia3 to slots=1
>>>>>
>>>>> with three hostgroups would do? Or are all nodes connected with IXIA
>>>>> and could use any, but only one as you wrote:
>>>>>
>>>>> Three PEs with a slot limit of one inside the PE could work, and this
>>>>> could be requested by a wildcard:
>>>>>
>>>>> $ qsub -pe ixia* 1 job.sh
>>>>>
>>>>> The PE name you get inside the job and can select the proper config
>>>>> file. If these are already parallel jobs, this could be put in an RQS
>>>>> then:
>>>>>
>>>>> limit pes {ixia1,ixia2,ixia3} to jobs=1
>>>>>
>>>>> ("jobs" complex setup as a JOB consumable from an arbitrary high value
>>>>> in the global configuration)
>>>>>
>>>>> -- Reuti
>>>>>
>>>>>
>>>>>> I think what I need to do is to use a consumable attribute. A command
>>>>>> like qsub -l ixiablade1=1 myjob.sh will take a consumable resource for
>>>>>> ixiablade1.
>>>>>> However, let's say I have three ixia blades, ixiablade1, 2 and 3.
>>>>>> Somehow I need to have a way of saying you can take ixiablade1 or you
>>>>>> can take ixiablade2 or you can take ixiablade3, but you can only take
>>>>>> one. Presumably, tests will have to wait until one of the three
>>>>>> consumables is available.
>>>>>>
>>>>>>
>>>>>> Here is a diagram
>>>>>>
>>>>>> [queuemaster]
>>>>>> [wasim01] <-----NETWORK -----> IXIA[wa-ixia-01]
>>>>>> [wasim02]     [slot 1 - ports 1 -4] <-- Direct Ethernet x 4 --> DUT 1
>>>>>> [wasim03]     [slot 2 - ports 1 -4] <-- Direct Ethernet x 4 --> DUT 2
>>>>>> [wasim04]     [slot 3 - ports 1 -4] <-- Direct Ethernet x 4 --> DUT 3
>>>>>> [wasim05]
>>>>>> [wasim06]
>>>>>> [wasim07]
>>>>>> [wasim08]
>>>>>>
>>>>>> For example, I may have 100 individual tests queued to run on the grid,
>>>>>> across 8 different machines, and because of resource quota or
>>>>>> consumables, only three jobs can run at a time. When a job moves from
>>>>>> waiting to running, it will be because one of the consumables was
>>>>>> available (and now taken again). When the test begins execution, the
>>>>>> test will examine the environment to determine which consumable it was
>>>>>> given and use that to select the appropriate config file (slot1.cfg,
>>>>>> slot2.cfg or slot3.cfg) which defines port connections and other DUT
>>>>>> related configuration information.
>>>>>>
>>>>>> Any thoughts would be appreciated.
>>>>>> Thanks
>>>>>> Rich
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> _______________________________________________
>>>>>> users mailing list
>>>>>> [email protected]
>>>>>> https://gridengine.org/mailman/listinfo/users