We had a user report that one of their array jobs wasn't scheduling. A bit of poking around showed that qconf -suser knew nothing of the user, despite them having a queued job. However, there was a file in the spool that should have defined the user. Several other users appear to be affected as well.
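For anyone wanting to find which users are affected in the same way, here is a rough sketch rather than a tested procedure. It assumes classic (flat file) spooling with one file per user in a spool directory; SPOOL_USERS and the missing_users helper are names made up for illustration, and the default path is only a guess to be adjusted for your site. It compares the file names in that directory against the output of qconf -suserl.

```shell
#!/bin/sh
# Sketch: list users that have a spool file but are unknown to the qmaster.
# SPOOL_USERS is a hypothetical variable; point it at your own users spool directory.
SPOOL_USERS="${SPOOL_USERS:-/var/opt/sge/shared/qmaster/users}"

# missing_users SPOOLED KNOWN
#   SPOOLED: file with one spooled user name per line
#   KNOWN:   file with one user name per line, as reported by the qmaster
# Prints every name in SPOOLED that does not appear as a whole line in KNOWN.
missing_users() {
    # -F fixed strings, -x whole-line match, -v invert; grep exits 1 when
    # nothing is printed, which is not an error here.
    grep -vxF -f "$2" "$1" || true
}

# Only touch the live system if qconf and the spool directory are available.
if command -v qconf >/dev/null 2>&1 && [ -d "$SPOOL_USERS" ]; then
    tmpdir=$(mktemp -d)
    ls "$SPOOL_USERS" > "$tmpdir/spooled"
    qconf -suserl > "$tmpdir/known" 2>/dev/null || : > "$tmpdir/known"
    missing_users "$tmpdir/spooled" "$tmpdir/known"
    rm -rf "$tmpdir"
fi
```

The comparison is kept in its own function so it can be tried on two hand-made lists before being pointed at a live qmaster.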
I bounced the qmaster in the hope of getting it to reread the users' details from disk, and got several messages like this:

04/16/2018 11:06:53| main|util01|E|error reading file: "/var/opt/sge/shared/qmaster/users/zccag81"
04/16/2018 11:06:53| main|util01|E|unrecognized characters after the attribute values in line 12: "mem"
04/16/2018 11:06:53| main|util01|E|line 12 should begin with an attribute name

I suspect that my next step should be to stop the qmaster, delete the problem files and then restart the qmaster. Hopefully Grid Engine will then recreate the user, or I can create them manually. However, if anyone has a better idea or has seen this before, I'd be glad to hear of it.

Creation of the user object on our cluster is done by means of enforce_user auto:

#qconf -sconf |grep auto
enforce_user                 auto
auto_user_oticket            0
auto_user_fshare             1
auto_user_default_project    none
auto_user_delete_time        0

William
_______________________________________________ users mailing list users@gridengine.org https://gridengine.org/mailman/listinfo/users