Re: [gridengine users] how to reserve all cluster slots for maintenance?

2013-04-25 Thread Reuti
On 25.04.2013 at 00:19, Dave Love wrote: > Reuti writes: > >>> I guess a calendar is the simplest option then? Will SGE refrain from >>> scheduling jobs if I create a "maintenance" calendar and connect it to >>> all queues? >> >> If "max_reservation" is set to a value different from zero: ye

Re: [gridengine users] WALLTIME by qacct?

2013-04-25 Thread Riccardo Murri
Hi, On 25 April 2013 18:56, Sangamesh Banappa wrote: > If a user runs a serial job for 10 hours, and another user runs a > parallel job of 200 cores for 10 hours, then gridengine accounting shows it > as same WALLTIME of 10 hours for both jobs. How? This could be severely > wrong. WAL
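The distinction raised in the question above can be shown with a little arithmetic: wall-clock time is elapsed time, so it is the same for a serial and a parallel job of equal duration, while the resources consumed scale with the number of slots. A minimal sketch, assuming each accounting record provides a wall-clock figure in seconds and a slot count; the function name `core_hours` is hypothetical and not part of any Grid Engine tool:

```python
def core_hours(wallclock_seconds: float, slots: int) -> float:
    """Convert a job's elapsed time and slot count into core-hours.

    Both arguments are assumed to come from one accounting record;
    the function itself is illustrative only.
    """
    return wallclock_seconds * slots / 3600.0


# The two jobs from the question: 10 hours each, 1 slot versus 200 slots.
serial = core_hours(10 * 3600, 1)
parallel = core_hours(10 * 3600, 200)
print(serial, parallel)
```

Both jobs have the same wall-clock time of 10 hours, but the parallel job accounts for 200 times as many core-hours.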

Re: [gridengine users] Monitor GPU memory usage per job

2013-04-25 Thread Dave Love
Stephen Willey writes: > You could use a load sensor to do this. We use one to detect if > people are logged in and suspend/requeue the jobs if someone logs in > while a job's on their workstation. I don't understand how that addresses the question (as I understand it). > http://arc.liv.ac.uk/

Re: [gridengine users] WALLTIME by qacct?

2013-04-25 Thread Sangamesh Banappa
Hi, - Original Message - > Sangamesh Banappa writes: > > > Hi, > > > > > > I need some details on the accounting data captured by GridEngine. > > The > > qacct output for the last 90 days is: > > Please make a bug report with suggestions for improvement if it's not > adequately explaine

Re: [gridengine users] Monitor GPU memory usage per job

2013-04-25 Thread Dave Love
Nicolás Serrano Martínez-Santos writes: > Hi, > > We are currently using SGE6.2u5 in our little cluster (~150 cores) and I am > trying to configure it to manage GPU usage correctly. I have been able to define > multiple slots for each GPU card and also to reserve memory using > consumables. > > H

Re: [gridengine users] power management

2013-04-25 Thread Dave Love
Reuti writes: > Some time ago Fritz mentioned that SCM to control it is abandoned and > Univa integrated something new to control it. SDM is still available if you want it, and the others I'm aware of are on , apart from one funded with my taxes which I could

Re: [gridengine users] suspension / load balancing problem

2013-04-25 Thread Dave Love
Reuti writes: > For serial and parallel SMP jobs there is: > http://wiki.gridengine.info/wiki/index.php/StephansBlog as an > option. Maybe instead of using "slots" a custom complex is necessary > which is called "medium" and all jobs of this type have to request > it. This is similar to your setu

Re: [gridengine users] how to reserve all cluster slots for maintenance?

2013-04-25 Thread Dave Love
Riccardo Murri writes: > The default duration is 3.25 days so that cannot affect ARs one month > from now... > >> But I must admit, that it's strange that it's blocked at a later >> point in time. There is no other AR in the way I assume. > > No other AR at the moment. If this applies to the cur

Re: [gridengine users] how to reserve all cluster slots for maintenance?

2013-04-25 Thread Dave Love
Reuti writes: >> I guess a calendar is the simplest option then? Will SGE refrain from >> scheduling jobs if I create a "maintenance" calendar and connect it to >> all queues? > > If "max_reservation" is set to a value different from zero: yes. Then > it will take the set calendar into account

Re: [gridengine users] orphan tmp directories

2013-04-25 Thread Dave Love
Reuti writes: > On the exechost? I don't do it at all on a per job basis. In case your users > fight for the disk space you can implement a consumable for the disk space in > combination with a load sensor: > > http://gridengine.org/pipermail/users/2012-February/002914.html For what it's worth

[gridengine users] TMPDIR naming change (was: orphan tmp directories)

2013-04-25 Thread Dave Love
Reuti writes: > You mean one level above - in /tmp or alike? The $TMPDIR name is > usually $JOB_ID.${SGE_TASK_ID/undefined/1}.$QUEUE. I've changed that to use the cell name, not the queue for SGE 8.1.4. If anyone thinks that will break more than it fixes, please speak up. -- Community Grid Eng
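The naming pattern quoted above, `$JOB_ID.${SGE_TASK_ID/undefined/1}.$QUEUE`, relies on a shell substitution that turns the literal string `undefined` (the task ID of a non-array job) into `1`. A minimal Python sketch of the same rule, for illustration only; `tmpdir_name` is a hypothetical helper, and the job ID and queue name below are made up:

```python
def tmpdir_name(job_id: str, task_id: str, queue: str) -> str:
    """Build a directory name following the quoted pattern.

    Mirrors the shell substitution by replacing the first occurrence
    of "undefined" in the task ID with "1".
    """
    task = task_id.replace("undefined", "1", 1)
    return f"{job_id}.{task}.{queue}"


# Non-array job: the task ID is the literal string "undefined".
print(tmpdir_name("4711", "undefined", "all.q"))
# Array job, task 7.
print(tmpdir_name("4711", "7", "all.q"))
```

With the change announced for SGE 8.1.4, the third component would be the cell name instead of the queue name.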

Re: [gridengine users] modify host selection algorithm?

2013-04-25 Thread Dave Love
Brett Taylor writes: > Thanks, I already had $pe_slots in my smp definition, so I guess it's > already doing this more or less. It still seems to be a little bit > more random than I'd like, i.e. sometimes it starts them on empty > hosts, sometimes it adds them to hosts that are running things b

[gridengine users] Debian packages (was: Open MPI jobs randomly fail to run)

2013-04-25 Thread Dave Love
Bernard Massot writes: > I'm using gridengine 6.2u5 on Debian Squeeze. I recommend not using that for various reasons. You can build Debian packages from the SGE 8.1.3 distribution (installing into /opt). Unfortunately it looks as if SGE in Debian is dead -- no-one seems able to upload the new

Re: [gridengine users] Execd on Windows 7 using SFU

2013-04-25 Thread Dave Love
Joe Borġ writes: > Hi Guys, > > I was wondering if anyone has any guidance with installing execd onto > Windows 7 using Microsoft's SFU. In what respect? If you want to build it, see source/README.windows in SGE 8.1.3. Otherwise you probably need to look at the Oracle docs for more information

Re: [gridengine users] Threshold "T" State Locks User out of SSH to Execution Host - How to Disable?

2013-04-25 Thread Dave Love
Adam Brenner writes: > In our case, it is very helpful for our users to directly SSH into the > nodes to determine what is wrong with their qsub scripts, etc. This is > a follow-up to the following thread by Joseph and Harry: > https://gridengine.org/pipermail/users/2013-February/005585.html

Re: [gridengine users] WALLTIME by qacct?

2013-04-25 Thread Dave Love
Sangamesh Banappa writes: > Hi, > > > I need some details on the accounting data captured by GridEngine. The > qacct output for the last 90 days is: Please make a bug report with suggestions for improvement if it's not adequately explained under http://arc.liv.ac.uk/SGE/htmlman/ -- Community

Re: [gridengine users] PE only offers 0 slots

2013-04-25 Thread Dave Love
Jesse Becker writes: > This is a problem that has plagued me, and various people time and > again, and every time it gets fixed, the method and cause seems to get > lost in the aether. I've hit it several times over the years, and each > time the problem and solution seem to vanish... I'm afraid

Re: [gridengine users] run gamess under GE

2013-04-25 Thread Dave Love
Mahbube Rustaee writes: > Hi all, > How can I run GAMESS under GE? You need to be specific about which GAMESS, and how it was built, at least. -- Community Grid Engine: http://arc.liv.ac.uk/SGE/ ___ users mailing list users@gridengine.org https://grid

[gridengine users] The newbie is back again and again

2013-04-25 Thread Jacques Foucry
Hello folks, My SGE installation is moving forward (a little bit). I have successfully installed an "Install Master", which allows me to create a distribution tarball (installed from source using aimk and everything that goes with it). I used this tutorial: http://bioteam.net/2012/01/building-open-grid-sc

Re: [gridengine users] suspension / load balancing problem

2013-04-25 Thread Tina Friedrich
Hi Reuti, >> I have a bit of a problem with our job submission. We have a setup with four different 'priority' queues - very low, >> low, medium, and high - with subordination. The setup actually works quite well for our usage pattern - with the highest priority queue being reserved for autom