On 15.05.2013 at 13:09, Tina Friedrich wrote:

> Hello list,
>
> have finally decided to look into upgrading our SGE6.2 installation - mainly
> to see if it helps with my job scheduling problem.
>
> I'm trying to build Son of Grid Engine - succeeded actually. Currently trying
> to make it run / import my old configuration. Which mostly worked. Couple of
> niggles.
>
> Our setup is SGE_ROOT on shared NFS file system, SGE running as a non-root
> user. I'd quite like to keep it that way (it worked well for us).

So neither the real nor the effective user is root? Then I wonder how the
daemon can switch to a different user during execution. Often you see
something like this:

    $ ps -e -o user,ruser,group,rgroup,command
    USER     RUSER    GROUP    RGROUP   COMMAND
    ...
    sgeadmin root     gridware root     /usr/sge/bin/lx24-x86/sge_execd

> Managed to build & install, got the qmaster running, managed to start execds.
> However, at least inst_sge.sh -upd-execd simply refuses to work if you're not
> root, if I remember correctly (not helping!).
>
> Script(s) sometimes say 'You are not installing as user >root< - Can't set
> the file owner/group and permissions'. It would help if they'd tell me
> (without digging through them) what files they're trying to chown/chmod and
> what they're trying to chown/chmod it to - so I can fix that, if there is a
> problem. Goes for a lot of these sort of errors (to do with running as
> non-root) - if it fails to do something, it would really help to know what it
> failed to do.
>
> The other thing is that I keep having to run it with -nobincheck, as far as I
> can tell simply because I didn't build qmon. Annoying - should it not just
> check for actually required binaries?
>
> Importing my old installation / upgrading from my old installation didn't
> quite work. Mostly did, it seems, which is something. No error that I'd seen
> during the import/upgrade, but none of my queues are there. Host groups are;
> exec hosts are; complexes look okay; global config looks right. PEs aren't
> there; trying to create the PEs from the config files I originally created
> them from I get 'error: required attribute "qsort_args" is missing'. Assume
> that's the root problem (i.e. did not manage to import PEs, thus can't import
> queues). Anyone else had issues with that? Should the save_config script have
> caught that?

The "qsort_args" attribute is new in the PE definition. Did you dump the old
configuration using $SGE_ROOT/util/upgrade_modules/save_sge_config.sh?
Then it should be enough to add just this one line to each of the generated
text files for the PEs, in the directory that was created for the text files.

> And now for the important question :). My execds currently are a mix of RHEL5
> and RHEL6; SoGE got compiled on RHEL6, doesn't work on RHEL5 execds.

Do you use the old original execds or the newly compiled ones? If you use the
new ones: compiling everything on RHEL5 and running those binaries on RHEL6
might have a better chance of working.

> Also, all nodes and the master/shadow hosts get software upgrades quite
> regularly

I would fear that after updates to the nodes all the software you use also
needs to be revalidated, i.e. by running the test suite for each package.
Otherwise a change to e.g. a mathematical library may lead to different
results after an update.

-- Reuti

> - I would like to avoid having to recompile SoGE whenever I run yum update
> (the old installation is nicely agnostic to all of this, it Just Works(TM) -
> well, at least it worked with RHEL5 and RHEL6.) Plus I've installed hwloc in
> a non-standard location (and currently have to tell the execd process where
> it is). Is there an option for aimk to build statically linked binaries? (I'm
> sort of guessing that that's what the difference is here.)
>
> Apologies for the very long post.
>
> Tina
>
>
> --
> Tina Friedrich, Computer Systems Administrator, Diamond Light Source Ltd
> Diamond House, Harwell Science and Innovation Campus - 01235 77 8442
>
> --
> This e-mail and any attachments may contain confidential, copyright and or
> privileged material, and are for the use of the intended addressee only. If
> you are not the intended addressee or an authorised recipient of the
> addressee please notify us of receipt by returning the e-mail and do not use,
> copy, retain, distribute or disclose the information in or attached to the
> e-mail.
> Any opinions expressed within this e-mail are those of the individual and not
> necessarily of Diamond Light Source Ltd. Diamond Light Source Ltd.
> cannot guarantee that this e-mail or any attachments are free from viruses
> and we cannot accept liability for any damage which you may sustain as a
> result of software viruses which may be transmitted in or with the message.
> Diamond Light Source Limited (company no. 4375679). Registered in England and
> Wales with its registered office at Diamond House, Harwell Science and
> Innovation Campus, Didcot, Oxfordshire, OX11 0DE, United Kingdom
>
> _______________________________________________
> users mailing list
> users@gridengine.org
> https://gridengine.org/mailman/listinfo/users
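
Regarding the missing "qsort_args" attribute discussed above: the fix of adding that one line to each saved PE file could be scripted roughly as below. This is only a sketch under assumptions. It assumes the saved PE definitions are plain text files, one per PE, in a single directory (the path in the usage comment is a placeholder, so check where save_sge_config.sh actually wrote them), and that NONE is the value you want (I believe this is the default given in sge_pe(5), but please verify). The function name add_qsort_args is made up for illustration.

```shell
#!/bin/sh
# Sketch: append a "qsort_args" line to every PE definition file in a
# directory that does not already have one.
# Assumption: one PE per plain text file, each ending in a newline.
# Assumption: NONE is an acceptable value for qsort_args.
add_qsort_args() {
    dir=$1
    for f in "$dir"/*; do
        # Skip subdirectories and the unexpanded glob of an empty directory.
        [ -f "$f" ] || continue
        # Only touch files that lack the attribute, so re-running is harmless.
        if ! grep -q '^qsort_args[[:space:]]' "$f"; then
            printf 'qsort_args         NONE\n' >> "$f"
        fi
    done
}

# Hypothetical usage (placeholder path, adjust to your backup directory):
#   add_qsort_args /path/to/saved_config/pe
```

Afterwards each file could be loaded with "qconf -Ap <file>". Working on a copy of the backup directory first seems prudent, since the function edits the files in place.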