We had a user report that one of their array jobs wasn't scheduling.  A
bit of poking around showed that qconf -suser knew nothing of the user
despite them having a queued job.  However, there was a file in the spool
that should have defined the user.  Several other users appear to be
affected as well.

I bounced the qmaster in the hope of getting it to reread the users'
details from disk, and got several messages like this:

04/16/2018 11:06:53| main|util01|E|error reading file: "/var/opt/sge/shared/qmaster/users/zccag81"
04/16/2018 11:06:53| main|util01|E|unrecognized characters after the attribute values in line 12: "mem"
04/16/2018 11:06:53| main|util01|E|line 12 should begin with an attribute name
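
Since several users seem to be affected, a quick scan of the spool
directory for other files with the same problem might save some time.
Below is a rough sketch only: check_user_file is a hypothetical helper,
and the attribute list in KNOWN_ATTRS is an assumption on my part that
should be compared against a healthy user file from the same spool before
trusting the results.

```shell
#!/bin/sh
# Hypothetical helper: report lines in a spooled user file that do not
# begin with an expected attribute name.
# ASSUMPTION: this attribute list is a guess; adjust it to match a known
# good file from your own spool directory.
KNOWN_ATTRS="name oticket fshare delete_time usage usage_time_stamp long_term_usage project default_project debited_job_usage"

check_user_file() {
    awk -v attrs="$KNOWN_ATTRS" '
        BEGIN {
            n = split(attrs, a, " ")
            for (i = 1; i <= n; i++) known[a[i]] = 1
        }
        # A line that follows a trailing backslash is treated as a
        # continuation of the previous attribute, not a new one.
        cont { cont = ($0 ~ /\\$/); next }
        # Ignore blank lines and comment lines.
        /^[ \t]*$/ || /^#/ { next }
        {
            if (!($1 in known))
                printf "%s:%d: %s\n", FILENAME, FNR, $0
            cont = ($0 ~ /\\$/)
        }
    ' "$1"
}
```

Something like

    for f in /var/opt/sge/shared/qmaster/users/*; do check_user_file "$f"; done

would then list each suspect line as file:line: content.  It only reads
the files, so it should be safe to run while the qmaster is up.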

I suspect that my next step should be to stop the qmaster, delete the
problem files and then restart the qmaster.  Hopefully grid engine will
then recreate the user or I can create them manually.
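
Rather than deleting the files outright, moving them aside keeps a copy
for later inspection.  A minimal sketch, assuming the qmaster has already
been stopped by whatever means the site normally uses; quarantine_files
and the backup directory name are hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: move suspect spool files into a backup directory
# instead of deleting them.
# ASSUMPTION: the qmaster is stopped before this runs, so nothing rewrites
# the files while they are being moved.
quarantine_files() {
    backup_dir=$1
    shift
    mkdir -p "$backup_dir" || return 1
    for f in "$@"; do
        # Skip anything that is missing or not a regular file.
        if [ ! -f "$f" ]; then
            echo "skipping $f: not a regular file" >&2
            continue
        fi
        mv -- "$f" "$backup_dir/" && echo "moved $f"
    done
}
```

For example:

    quarantine_files /root/sge-users-backup /var/opt/sge/shared/qmaster/users/zccag81

followed by a qmaster restart and a check with qconf -suser.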

However, if anyone has a better idea or has seen this before, I'd be glad
to hear of it.

User objects on our cluster are created automatically by means of
enforce_user auto:
# qconf -sconf | grep auto
enforce_user                 auto
auto_user_oticket            0
auto_user_fshare             1
auto_user_default_project    none
auto_user_delete_time        0
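
If a user does have to be recreated by hand, a definition mirroring the
auto_user_* defaults above could be generated along these lines.  This is
a sketch only: make_user_file is a hypothetical helper, and the field
layout follows my reading of the user(5) format, so it is worth checking
against the output of qconf -suser for a healthy user first.

```shell
#!/bin/sh
# Hypothetical helper: print a user definition for the given user name.
# ASSUMPTION: the values mirror the auto_user_* settings shown above
# (oticket 0, fshare 1, delete_time 0, no default project).
make_user_file() {
    cat <<EOF
name             $1
oticket          0
fshare           1
delete_time      0
default_project  NONE
EOF
}
```

The result can be written to a file and loaded with qconf -Auser:

    make_user_file zccag81 > /tmp/zccag81.user
    qconf -Auser /tmp/zccag81.user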


William

