Re: [gridengine users] qsub -V doesn't set $PATH

2020-01-22 Thread Skylar Thompson
wrote: > On Tue, Jan 21, 2020 at 03:51:01PM +, Skylar Thompson wrote: > > -V strips out PATH and LD_LIBRARY_PATH for security reasons, since prolog > > I don't think this is the case. I've just experimented with one of our 8.1.9 > clusters and I can set arbitrary PAT

Re: [gridengine users] qsub -V doesn't set $PATH

2020-01-21 Thread Skylar Thompson
> qsub -V ${other_options_omitted} ./my_command > > doesn't set $PATH on the remote job. Is that expected behavior? > > Thanks - > Adam Shiel > ___ > users mailing list > users@gridengine.org > https://gridengine.org/mail

Re: [gridengine users] CPU and Mem usage for interactive jobs

2019-12-09 Thread Skylar Thompson
eed something > different in the GE configuration to enable this? -- -- Skylar Thompson (skyl...@u.washington.edu) -- Genome Sciences Department, System Administrator -- Foege Building S046, (206)-685-7354 -- University of Washington School of Medicine _

Re: [gridengine users] What is the easiest/best way to update our servers' domain name?

2019-10-25 Thread Skylar Thompson
g > > https://gridengine.org/mailman/listinfo/users > > > ___ > users mailing list > users@gridengine.org > https://gridengine.org/mailman/listinfo/users -- -- Skylar Thompson (skyl...@u.washington.edu) -- Genome Scien

Re: [gridengine users] limit CPU/slot resource to the number of reserved slots

2019-08-29 Thread Skylar Thompson
obs. > > Best regards, > Mikhail Serkov > > > On Aug 29, 2019, at 10:20 AM, Skylar Thompson wrote: > > > > Load average gets high if the job spawns more processes/threads than > > allocated CPUs, but we haven't seen any problem with node instability. W

Re: [gridengine users] limit CPU/slot resource to the number of reserved slots

2019-08-29 Thread Skylar Thompson
elf. > > > > Dan > > > > > > On Mon, Aug 26, 2019 at 12:46 PM Dietmar Rieder > > wrote: > > Hi, > > > > thanks for your reply. This sounds promising. > > We are using Son of Grid Engine though. Can you point me to the right > >

Re: [gridengine users] limit CPU/slot resource to the number of reserved slots

2019-08-26 Thread Skylar Thompson
should be possible in slurm (which we don't have, > and to which we don't want to switch to currently). > > Thanks > Dietmar -- -- Skylar Thompson (skyl...@u.washington.edu) -- Genome Sciences Department, System Administrator -- Foege Buildi

Re: [gridengine users] Multi-GPU setup

2019-08-14 Thread Skylar Thompson
> Is there something like "cgroups" for gpus? > > Thanks, > > -Dj > > > ___ > users mailing list > users@gridengine.org > https://gridengine.org/mailman/listinfo/users -- -- Skylar

Re: [gridengine users] Sorting qhost and choosing qstat columns

2019-08-01 Thread Skylar Thompson
> > users mailing list > > users@gridengine.org > > https://urldefense.proofpoint.com/v2/url?u=https-3A__gridengine.org_mailman_listinfo_users&d=DwIGaQ&c=mkpgQs82XaCKIwNV8b32dmVOmERqJe4bBOtF0CetP9Y&r=EKk3zFVROsf8w5OyB2T6u55jzploih3y7CaWIlGOLAY&m=m_wX_jPeroRg

Re: [gridengine users] Sorting qhost and choosing qstat columns

2019-08-01 Thread Skylar Thompson
by eliminating the "jclass" column, which > doesn't contain any information, but I can only find ways to add columns, > not take them away. Is there a way to make this column go away? > > _______ > users mailing list > user

Re: [gridengine users] Different ulimit settings given by different compute nodes with the exactly same /etc/security/limits.conf

2019-07-03 Thread Skylar Thompson
rs mailing list > users@gridengine.org > https://gridengine.org/mailman/listinfo/users -- -- Skylar Thompson (skyl...@u.washington.edu) -- Genome Sciences Department, System Administrator -- Foege Building S046, (206)-685-7354 -- University of Washington School of Medicine __

Re: [gridengine users] scripting help with per user job submit restrictions

2019-06-13 Thread Skylar Thompson
*tar.gz > do >qsub -l h_vmem=4G -cwd -j y -b y -N tar -R y -q all.q,gpu.q "tar -xzf $i" > done > > Hope to hear from you soon. > > Regards > Varun > ___ > users mailing list > users@gridengine.org > http

Re: [gridengine users] I need a decoder ring for the qacct output

2019-04-25 Thread Skylar Thompson
uld (UTIME + STIME) >= WALLCLOCK? It isn't in my case and is mainly > why I am confused. Or perhaps process wait time is not included? It's going to depend on your operating system, but STIME tends to include I/O wait. Note, though, that user and system time are measured in terms
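The comparison discussed in this message can be worked through numerically. The following is a minimal sketch, assuming the text of a `qacct -j` record is available and contains the `ru_utime`, `ru_stime`, `ru_wallclock` and `slots` fields; the sample values are invented for illustration, and the optional trailing "s" on time values is an assumption about how some versions print them.

```python
def parse_qacct(text):
    """Turn 'name value' lines from a qacct record into a dict of strings."""
    fields = {}
    for line in text.splitlines():
        parts = line.split(None, 1)
        if len(parts) == 2:
            fields[parts[0]] = parts[1].strip()
    return fields


def seconds(value):
    """Parse a time value, tolerating an optional trailing 's' (assumption)."""
    return float(value.rstrip("s"))


def cpu_efficiency(fields):
    """Return (utime + stime) / (wallclock * slots), or 0.0 if wallclock is 0."""
    cpu = seconds(fields["ru_utime"]) + seconds(fields["ru_stime"])
    wall = seconds(fields["ru_wallclock"])
    slots = int(fields.get("slots", "1"))
    return cpu / (wall * slots) if wall > 0 else 0.0


# Invented sample record, not taken from a real cluster.
SAMPLE = """\
slots        1
ru_wallclock 100
ru_utime     60.5
ru_stime     4.5
"""

if __name__ == "__main__":
    print(cpu_efficiency(parse_qacct(SAMPLE)))
```

A ratio below 1 per slot means the job's processes were off the CPU for part of the wallclock time, so UTIME + STIME does not have to reach WALLCLOCK.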

Re: [gridengine users] I need a decoder ring for the qacct output

2019-04-25 Thread Skylar Thompson
a complete reference (just bits > and pieces here and there). > > Can anyone point me to a complete reference so that I can better > understand the output of qacct? > > Thank you, > > -- > Mun > > ___ > users mailing

Re: [gridengine users] Different GDI version between client and qmaster

2019-02-26 Thread Skylar Thompson
___ > From: users-boun...@gridengine.org on behalf of Skylar Thompson > > Sent: Tuesday, February 26, 2019 7:14 PM > To: users@gridengine.org > Subject: Re: [gridengine users] Different GDI version between client and > qmaster > > Do you have different versions of GE insta

Re: [gridengine users] Different GDI version between client and qmaster

2019-02-26 Thread Skylar Thompson
gt; > Thanks in advance > > Rad > ___ > users mailing list > users@gridengine.org > https://gridengine.org/mailman/listinfo/users -- -- Skylar Thompson (skyl...@u.washington.edu) -- Genome Sciences Department, System Administrator -

Re: [gridengine users] Accessing qacct accounting file from login/compute nodes

2019-02-21 Thread Skylar Thompson
> Cheers, > > On Thu, Feb 21, 2019 at 3:20 AM Skylar Thompson > wrote: > > > We actually don't have a shared $SGE_ROOT, so that in the event of > > network/storage trouble the binaries and libraries are still accessible. We > > do have $SGE_ROOT/$SGE_CELL o

Re: [gridengine users] Accessing qacct accounting file from login/compute nodes

2019-02-20 Thread Skylar Thompson
euti > > > > > > ___ > > users mailing list > > users@gridengine.org > > https://gridengine.org/mailman/listinfo/users > > > > ___ > users mailing list > users@gridengine.org > h

Re: [gridengine users] starting a new gridengine accounting file

2019-01-29 Thread Skylar Thompson
On Tue, Jan 29, 2019 at 12:06:47PM -0500, John Young wrote: > On 1/29/19 11:15 AM, Skylar Thompson wrote: > > Hi John, > > > > Have you looked at using the ${SGE_ROOT}/util/logchecker.sh script? There's > > documentation on setting it up in doc/logfile-trimming.

Re: [gridengine users] starting a new gridengine accounting file

2019-01-29 Thread Skylar Thompson
have looked around in the Gridengine docs > for information on how to close it and start another file > but if it is there, I missed it. > > Does anyone know how to do this? -- -- Skylar Thompson (skyl...@u.washington.edu) -- Genome Sciences Department, System Administrator -- Foege B

Re: [gridengine users] Grid Engine Sluggish

2019-01-28 Thread Skylar Thompson
55m/s APT: 0.0011s/m idle: 99.94% wait: 0.00% time: 76.48s > > > > > > Joseph > > > > > > > > > ___ > > > users mailing list > > > users@gridengine.org > > > https://gridengine.org/mailman/li

Re: [gridengine users] Alternatives to Son of GridEngine

2018-11-13 Thread Skylar Thompson
too new. > > > > Dan > > > > > > _______ > > users mailing list > > users@gridengine.org > > https://gridengine.org/mailman/listinfo/users > > ___

Re: [gridengine users] Odd commlib HOST_NOT_RESOLVABLE error

2018-08-24 Thread Skylar Thompson
New York, NY 10003 > > "In an open world, who needs windows or gates ?" > > ___ > users mailing list > users@gridengine.org > https://gridengine.org/mailman/listinfo/users -- -- Skylar Thompson (skyl...@u.washington.

Re: [gridengine users] different results on terminal and on submission via qsub

2018-07-09 Thread Skylar Thompson
AAGAAGGTAACATGTTTTAAGAAACTATGTAGCATAGTGTCTT > > What is it that I am doing wrong here. > > Thanks > > Regards > Varun > ___ > users mailing list > users@gridengine.org > https://gridengine.org/mailman/listinfo/users -- -- Sky

Re: [gridengine users] Distributed NFS file system for master/slave

2018-07-09 Thread Skylar Thompson
e server fixed this behavior. > > Thanks for sharing, > > Paul. > ___ > users mailing list > users@gridengine.org > https://gridengine.org/mailman/listinfo/users -- -- Skylar Thompson (skyl...@u.washington.edu) -- Genome Sciences Departme

Re: [gridengine users] window size, qlogin and X11 forwarding

2018-05-26 Thread Skylar Thompson
n apply there. > > Stuart Barkley > -- > I've never been lost; I was once bewildered for three days, but never lost! > -- Daniel Boone > ___ > users mailing list > users@gri

Re: [gridengine users] SGE accounting file getting too big...

2018-05-18 Thread Skylar Thompson
p. > > > > > > -Noel Benitez, Salk iT Dept. > > > > ___ > users mailing list > users@gridengine.org > https://gridengine.org/mailman/listinfo/users -- -- Skylar Thompson (skyl...@u.washington.edu) -- Genome Sciences Department, System Administrator

Re: [gridengine users] Debugging crash when running program through GridEngine

2018-05-08 Thread Skylar Thompson
hen see that it was trying to do a memory map at the point where > it died so I knew where to start playing. > > Thanks again for the help. I've learned some new techniques to try for next > time! > > Simon. > > > -Original Message- > From: users-boun.

Re: [gridengine users] Debugging crash when running program through GridEngine

2018-05-04 Thread Skylar Thompson
tem. The contents of this e-mail are the views of the > sender and do not necessarily represent the views of the Babraham Institute. > Full conditions at: www.babraham.ac.uk<http://www.babraham.ac.uk/terms> > _______ > users mailing list >

Re: [gridengine users] mpirun without ssh

2018-03-22 Thread Skylar Thompson
If you are trying to restrict users to only being able to access a node > that they have a job running on, there is a PAM module for that. > > Ian > > On Thu, Mar 22, 2018 at 8:23 AM, Skylar Thompson > wrote: > > > You'll need to make sure that your MPI implementa

Re: [gridengine users] mpirun without ssh

2018-03-22 Thread Skylar Thompson
sshd_config to do that, but now mpi is not working ! > > What can I do to refuse ssh connexion on nodes to my user and to have mpirun > working ??? -- -- Skylar Thompson (skyl...@u.washington.edu) -- Genome Sciences Department, System Administrator -- Foege Building S046, (206)-685-

Re: [gridengine users] Converting from supplemental groups to cgroups for management

2018-02-15 Thread Skylar Thompson
some web pages which stated that for some > versions of gridengine and kernel this was a Bad Idea. > > Sincerely, > > Calvin Dodge > ___ > users mailing list > users@gridengine.org > https://gridengine.org/mailman/listi

Re: [gridengine users] Temporarily stop new job submissions?

2018-01-23 Thread Skylar Thompson
> > > ___ > > users mailing list > > users@gridengine.org > > https://gridengine.org/mailman/listinfo/users > > > ___ > users mailing list > users@gridengin

Re: [gridengine users] Temporarily stop new job submissions?

2018-01-23 Thread Skylar Thompson
to do this? > > > --Chet Langin, SIU > > > ___ > users mailing list > users@gridengine.org > https://gridengine.org/mailman/listinfo/users -- -- Skylar Thompson (skyl...@u.washington.edu) -- Genome Sciences Department, System Admi

Re: [gridengine users] How to clear node (E)rror status?

2017-05-01 Thread Skylar Thompson
"qw" to "r" even though nodes were > available. He also had complained about a disk quota problem, so that may, > or may not be related. (Although we have had other users run into disk > quotas without this happening.) > > > Can someone tell me how to

Re: [gridengine users] default queues

2016-12-20 Thread Skylar Thompson
gridengine.org > https://gridengine.org/mailman/listinfo/users -- -- Skylar Thompson (skyl...@u.washington.edu) -- Genome Sciences Department, System Administrator -- Foege Building S046, (206)-685-7354 -- University of Washington School of Medicine _

Re: [gridengine users] Strange issue with one node

2016-10-24 Thread Skylar Thompson
nvironment: orte range: 8 > version:3 > scheduling info:cannot run in PE "orte" because it only > offers 7 slots > > > I've search on all of the configuration of SGE. I do too the > reinstalation of the 2 nodes. But the same messag

Re: [gridengine users] jobs not running even though resource quotas not met

2016-10-21 Thread Skylar Thompson
situations where a combinations of > > consumables and limits in RQS blocks the scheduling completely and showing > > something like "... offers only (-l none)." > > > > In case you have to limit the usage per user you have to use them for sure. > > > > OK

Re: [gridengine users] jobs not running even though resource quotas not met

2016-10-18 Thread Skylar Thompson
ntinue > debugging? Thanks. > > -M > > > ___ > users mailing list > users@gridengine.org > https://gridengine.org/mailman/listinfo/users -- -- Skylar Thompson (skyl...@u.washington.edu) -- Genome Sciences Department, System Administrator

Re: [gridengine users] Control tmpdir usage on SGE

2016-09-08 Thread Skylar Thompson
dd, make a filesystem on it, and mount it over a loopback device at TMPDIR. A wrapper script executed through sudo would probably be pretty secure to run as the user in prolog. -- -- Skylar Thompson (skyl...@u.washington.edu) -- Genome Sciences Department, System Administrator -- Foege Buil
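The loopback approach described in this message can be sketched as a list of commands. This is a minimal sketch only: the image path, the mount point, the size and the choice of ext4 are all assumptions for illustration, and the commands need root privileges (the message suggests a wrapper run through sudo from the prolog). The function only builds the command lists; it does not run them.

```python
def loopback_tmpdir_commands(image, tmpdir, size_mb):
    """Build commands that create a fixed-size image and mount it at tmpdir.

    image, tmpdir and size_mb are hypothetical inputs chosen by the caller.
    """
    return [
        # Allocate a file of size_mb mebibytes filled with zeros.
        ["dd", "if=/dev/zero", f"of={image}", "bs=1M", f"count={size_mb}"],
        # Make a filesystem on the regular file (-F: do not refuse a non-block device).
        ["mkfs.ext4", "-q", "-F", image],
        # Mount the image through a loop device at the job's TMPDIR.
        ["mount", "-o", "loop", image, tmpdir],
    ]


if __name__ == "__main__":
    for cmd in loopback_tmpdir_commands("/scratch/job.img", "/tmp/job", 1024):
        print(" ".join(cmd))
```

Because the filesystem has a fixed size, a job that fills its TMPDIR gets an out-of-space error instead of filling the node's local disk.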

Re: [gridengine users] Reported memory usage too high

2016-06-02 Thread Skylar Thompson
>> > >> > > -- > > Alex Chekholko ch...@stanford.edu > > > > ___ > > users mailing list > > users@gridengine.org > > https://gridengine.org/mailman/listinfo/users > > > ___

Re: [gridengine users] How to set a minimum free memory limit for any task submission on SGE?

2016-06-01 Thread Skylar Thompson
> it really depends on the size of the dataset being worked on which often > can't be predetermined. > > is there a way to do it? thanks!!! > ___ > users mailing list > users@gridengine.org > https://gridengine.org/mai

Re: [gridengine users] Fwd: dispatching sge task from an sge task - is that a reasonable practice?

2016-02-25 Thread Skylar Thompson
, > checking for files that indicate that a prerequisite job finished, > checking for errors in the prereq, or loop over 'qstat' checking if a > specific jobid has completed. Yep, we recommend folks use -hold_jid whenever possible. Some of our labs also use DRMAA but obviously that int
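The `-hold_jid` recommendation in this message can be sketched as follows. This is a minimal sketch, assuming job IDs of the prerequisite jobs are already known (for example from `qsub -terse`); the script and job names are hypothetical. The function only builds the argument list and does not submit anything.

```python
def qsub_command(script, name, hold_jids=()):
    """Build a qsub argument list; the job waits for every ID in hold_jids."""
    cmd = ["qsub", "-terse", "-N", name]
    if hold_jids:
        # -hold_jid takes a comma-separated list of job IDs (or job names).
        cmd += ["-hold_jid", ",".join(str(j) for j in hold_jids)]
    cmd.append(script)
    return cmd


if __name__ == "__main__":
    # Hypothetical example: the monthly analysis waits for two daily jobs.
    print(" ".join(qsub_command("analyze_month.sh", "analyze_month", [101, 102])))
```

With this, the scheduler holds the dependent job until its prerequisites finish, so no job has to occupy a slot while polling `qstat` or watching for marker files.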

Re: [gridengine users] Fwd: dispatching sge task from an sge task - is that a reasonable practice?

2016-02-25 Thread Skylar Thompson
queue slots and not leaving room for analyze.month tasks which > they will forever wait for), also besides dispatching they also do some > logic so it's a strange animal, this "dispatching" queue.. > > What's the "correct" practice here? > ___

Re: [gridengine users] RoundRobin scheduling among users

2016-01-25 Thread Skylar Thompson
On Mon, Jan 25, 2016 at 10:17:16PM +0100, Reuti wrote: > > Am 25.01.2016 um 20:34 schrieb Skylar Thompson: > > > Yep, we use functional tickets to accomplish this exact goal. Every user > > gets 1000 functional tickets via auto_user_fshare in sge_conf(5), though > > yo

Re: [gridengine users] RoundRobin scheduling among users

2016-01-25 Thread Skylar Thompson
Thanks very much! > Chris > > > > ___ > users mailing list > users@gridengine.org > https://gridengine.org/mailman/listinfo/users -- -- Skylar Thompson (skyl...@u.washington.edu)

Re: [gridengine users] behavior of suspended jobs?

2015-11-02 Thread Skylar Thompson
ped to local disk? Will > it restart (un-suspend) on that node when sufficient resources are free? -- -- Skylar Thompson (skyl...@u.washington.edu) -- Genome Sciences Department, System Administrator -- Foege Building S046, (206)-685-7354 -- University of Wa

Re: [gridengine users] Trying to code C program to process SGE job-status email

2015-10-27 Thread Skylar Thompson
ven queue? Would this data be stored in the head > node mySQL > DB? > > Thanks again Reuti. -- -- Skylar Thompson (skyl...@u.washington.edu) -- Genome Sciences Department, System Administrator -- Foege Building S046, (206)-685-7354 -- University of Washington School of Medicine

Re: [gridengine users] Is qacct 'maxvmem' reporting reliable?

2015-10-26 Thread Skylar Thompson
show this parameter at all, so can I assume it's > false, which means qacct's maxvmem should actually reflect the actual max > vmem usage? What might I be doing wrong or misunderstanding? > > Thanks. > > -M > ___ > users ma

Re: [gridengine users] At what point does the network overhead of adding additional nodes to a queue offset the benefit?

2015-09-25 Thread Skylar Thompson
of nodes in a queue negatively affects the performance of the queue? Is there > any general > rule about how many nodes to have in a queue based on a given network > backbone? -- -- Skylar Thompson (skyl...@u.washington.edu) -- Genome Sciences Department, System Administrator -- Foege

Re: [gridengine users] Dealing with CUDA and virtual memory

2015-08-20 Thread Skylar Thompson
to get around this? Would cgroups support help? > > Thanks, > Brendan > ___ > users mailing list > users@gridengine.org > https://gridengine.org/mailman/listinfo/users -- -- Skylar Thompson (skyl...@u.washington.edu) -- Genome Science

Re: [gridengine users] question about managing queues

2015-08-03 Thread Skylar Thompson
elps. GE scheduling definitely can feel like black magic sometimes. -- -- Skylar Thompson (skyl...@u.washington.edu) -- Genome Sciences Department, System Administrator -- Foege Building S046, (206)-685-7354 -- University of Washington School of Medicine

Re: [gridengine users] error: fopen("/opt/sge/default/common/bootstrap") failed: No such file or directory

2015-06-02 Thread Skylar Thompson
> > And my whole grid went down with the bootstrap message. > > I am asking for a restore from the other group, but I need to understand > maybe what I did, and can I fix it. They can take days to do a restore > and this is a production system arggg > > Thanks, > Dan

Re: [gridengine users] error: fopen("/opt/sge/default/common/bootstrap") failed: No such file or directory

2015-06-02 Thread Skylar Thompson
other group is backing it up as > requested). > > Dan > > On 06/02/2015 11:05 AM, Skylar Thompson wrote: > > Did your SGE_ROOT and/or SGE_CELL environment variable settings change? All > > the GE binaries expect to find the bootstrap file at > > ${SGE_ROOT}/${SGE_CE
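The path in this thread's subject line, /opt/sge/default/common/bootstrap, corresponds to SGE_ROOT=/opt/sge and SGE_CELL=default. A minimal sketch of how that path is derived from the two variables, useful for checking whether a changed environment explains the error; the function name is hypothetical, and falling back to "default" when SGE_CELL is unset is an assumption.

```python
import os
import posixpath


def bootstrap_path(env=None):
    """Return the bootstrap file path implied by SGE_ROOT and SGE_CELL."""
    env = os.environ if env is None else env
    root = env.get("SGE_ROOT")
    if not root:
        raise ValueError("SGE_ROOT is not set")
    # Assumption: use the cell name "default" when SGE_CELL is unset.
    cell = env.get("SGE_CELL", "default")
    return posixpath.join(root, cell, "common", "bootstrap")


if __name__ == "__main__":
    path = bootstrap_path({"SGE_ROOT": "/opt/sge", "SGE_CELL": "default"})
    print(path, "exists" if os.path.exists(path) else "is missing")
```

If the printed path differs from the one in the error message, the environment changed; if it matches and the file is missing, the file itself was removed or its filesystem is not mounted.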

Re: [gridengine users] error: fopen("/opt/sge/default/common/bootstrap") failed: No such file or directory

2015-06-02 Thread Skylar Thompson
strap file does not exist. > > What did I do and how do I recover? -- -- Skylar Thompson (skyl...@u.washington.edu) -- Genome Sciences Department, System Administrator -- Foege Building S046, (206)-685-7354 -- University of Washington School of Medicine __

Re: [gridengine users] Grid queue goes into an error state due to one job

2015-05-18 Thread Skylar Thompson
uting The Wharton School University > > of Pennsylvania > > The information contained in this electronic message and any attachments to > > this message are intended for the exclusive use of the addressee(s) and may > > contain proprietary, confidential or privileged information. If you are not >

Re: [gridengine users] Doubts regarding the h_vmem allocation while submitting the job in parallel environment

2015-04-21 Thread Skylar Thompson
of viruses. The company accepts no liability for any damage caused > by any virus transmitted by this email. www.wipro.com > ___ > users mailing list > users@gridengine.org > https://gridengine.org/mailman/listinfo/users -- -

Re: [gridengine users] "Distribution" of processes to nodes

2015-03-20 Thread Skylar Thompson
number of complete nodes?" > > We are not sure how to achieve this. Could you please give a any hint? > > TI & Kind Regards, > Christian > > -- > No signature available. > _______ > users mailing list > users@gr

Re: [gridengine users] suggestions on setting up queues

2015-01-20 Thread Skylar Thompson
t only so many of these > >> per user > >> > >> Any and all suggestions are welcome. > >> > >> Thank you! > >> > >> Best, > >> -- > >> Stephen Spencer > >> spen...@cs.washington.edu <mailto:spen...@cs.washi

Re: [gridengine users] SGE and NFS

2014-11-12 Thread Skylar Thompson
any of it need to? > Maybe just the var part would need to: /cm/shared/apps/sge/var ? > > Thanks, > Eric > > > > _______ > users mailing list > users@gridengine.org > https://gridengine.org/mailman/listinfo/users -- -- Sk

Re: [gridengine users] wiki is back, now hopefully with far more resiliency

2014-07-25 Thread Skylar Thompson
ge and hypervisor kit. > > Allow me to say a very public "Thanks!" for maintaining such a great > resource. I'd like to second that. I've been scraping by with Google cache and the Wayback Machine. It sounds like this is a volunteer operation too, which makes me doub

Re: [gridengine users] Enforce users to use specific amount of memory/slot

2014-06-30 Thread Skylar Thompson
ble. > > 40GB VIRT vs 100MB RES is a huge difference! I thought I had it bad with > matlab using 4GB VIRT for 100MB RES. -- -- Skylar Thompson (skyl...@u.washington.edu) -- Genome Sciences Department, System Administrator -- Foege Building S046, (206)-685-7354 -- Un

Re: [gridengine users] Alternative to h_vmem?

2014-06-06 Thread Skylar Thompson
h_vmem value because Matlab > won't launch. > > Thanks for any thoughts. > > -M > ___ > users mailing list > users@gridengine.org > https://gridengine.org/mailman/listinfo/users -- -- Skylar Thompson (skyl...@u.washington.edu) -- G

Re: [gridengine users] Algorithm for deciding total number of functional tickets?

2014-06-06 Thread Skylar Thompson
ow many tickets should be in the > functional ticket pool? I've seen figures ranging from 100 to > 1,000,000. > > Any advice and/or replies would be appreciated. -- -- Skylar Thompson (skyl...@u.washington.edu) -- Genome Sciences Departm

Re: [gridengine users] Scheduler

2014-05-06 Thread Skylar Thompson
> Robi > ___ > users mailing list > users@gridengine.org > https://gridengine.org/mailman/listinfo/users -- -- Skylar Thompson (skyl...@u.washington.edu) -- Genome Sciences Department, System Administrator -- Foege Building S046, (206)-685-7354 --

Re: [gridengine users] sge_qmaster uses too much memory and becomes unresponsive

2014-04-02 Thread Skylar Thompson
mit jobs with slot ranges in the PE > request? I'm also a big fan of schedd_job_info, but I'm a bigger fan > of my scheduler not blowing up. > > -- > Joshua Baker-LePain > QB3 Shared Cluster Sysadmin > UCSF > ___ >

Re: [gridengine users] sge_qmaster uses too much memory and becomes unresponsive

2014-04-02 Thread Skylar Thompson
any attachments for the presence of viruses. The organization > accepts no liability for any damage caused by any virus transmitted by this > email. > = > > > ___ > users mailing list > users

Re: [gridengine users] array job / node allocation / 'spread' question

2014-04-02 Thread Skylar Thompson
> of a gotcha). > > I've figured out in the meantime that it can be achieved by > submitting an AR for a one-slot-per-node PE, and then submitting the > array into that. Not sure which option I favour, still. > > Tina > > On 02/04/14 16:04, Skylar Thompson wrote: >

Re: [gridengine users] array job / node allocation / 'spread' question

2014-04-02 Thread Skylar Thompson
the time, but only want one of mine > (network IO intensive tasks, best use of file system would be lots > of them but spread as far and wide as the can). > > I've thought about introducing a consumable - apart from there's no > node-level consumables at the moment - but am un

Re: [gridengine users] (Seemingly) Random failures of OpenMPI jobs

2014-01-08 Thread Skylar Thompson
On Tue, Jan 07, 2014 at 03:19:23PM -0800, Joshua Baker-LePain wrote: > On Tue, 7 Jan 2014 at 3:09pm, Skylar Thompson wrote > > > Quick question - are you limiting memory usage for the job (i.e. h_vmem)? > > No. We have mem_free set to consumable (and the jobs include

Re: [gridengine users] (Seemingly) Random failures of OpenMPI jobs

2014-01-07 Thread Skylar Thompson
erally ends up running anyway. > > Any ideas as to how to track this down? I'm a bit stumped... > > Thanks. > > -- > Joshua Baker-LePain > QB3 Shared Cluster Sysadmin > UCSF > ___ > users mailing list > users@gridengine.or

Re: [gridengine users] Linux OOM killer oom_adj

2012-08-30 Thread Skylar Thompson
Oracle Java is particularly heinous when it comes to virtual memory allocation. The Oracle Java that ships with RHEL6 x86_64 requests around 23GB of memory even when it's run with just "-version". IBM Java is a bit more reasonable, only requesting around 3GB to report its versi

Re: [gridengine users] commlib errors?

2012-07-12 Thread Skylar Thompson
We tried doing that, and it fixed most of our issues, but not the SGE ones. We're seeing very high CPU load on the sge_shepherd and sge_execd processes. We were seeing high load on our sge_qmaster process until we restarted. -- Skylar Thompson (skyl...@u.washington.edu) -- Genome Sci

Re: [gridengine users] commlib errors?

2012-07-12 Thread Skylar Thompson
s knowledgebase article for more info: https://access.redhat.com/knowledge/articles/15145 We're not totally certain this is the issue, but it's highly correlated in time with the leap second. -- Skylar Thompson (skyl...@u.washington.edu) -- Genome Sciences Department, System Administrat

Re: [gridengine users] load sensor for weather forecast?

2012-06-29 Thread Skylar Thompson
We haven't done this (our cooling fails plenty often but is somewhat independent of the outside temperature at this point), but I wonder if you could use an advance reservation to accomplish this? Otherwise you would have hardware that's idling before the forecast comes to pass.