Hello, distinguished forum members,

Recently we found that something is wrong with our SGE job IDs: they are 
being reset very quickly, 6-7 times a month.
We do not run nearly enough jobs in such a short period to exhaust the ID range.

We use the job ID (via qacct) as a primary key in several home-made analytic 
tools, and these frequent job ID resets impair the reliability of the tools.
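
To quantify how often the IDs actually wrap, a check along these lines could be run over the accounting data. This is only a sketch: the function name count_id_resets and the (submission_time, job_number) record layout are hypothetical, and extracting those two values from the accounting records is left out.

```python
from typing import Iterable, Tuple


def count_id_resets(records: Iterable[Tuple[int, int]]) -> int:
    """Count how often the job number drops between consecutive jobs.

    Each record is a hypothetical (submission_time, job_number) pair.
    Records are ordered by submission time first; a job number lower
    than the previous one is treated as a reset of the ID counter.
    """
    resets = 0
    previous = None
    for _, job_number in sorted(records):
        if previous is not None and job_number < previous:
            resets += 1
        previous = job_number
    return resets


if __name__ == "__main__":
    # Made-up sample data: the counter wraps once between the 2nd and 3rd job.
    sample = [(100, 9999998), (101, 9999999), (102, 1), (103, 2)]
    print(count_id_resets(sample))
```

Array tasks that share one job number do not count as a reset here, because only a strictly lower number is flagged.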

This started after a full power shutdown, during which we 
halted all our systems, including the SGE master/shadow and its execution hosts.

Perhaps something sets $SGE_ROOT/default/spool/qmaster/jobseqnum to "9999999", 
and then something (related or not) restarts SGE, which picks up that job ID.
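
One way to test this suspicion would be to sample the file periodically and log when the stored value is close to the limit. The sketch below assumes the file holds a single integer; the helper names, the margin and the 9999999 limit (taken from the value mentioned above) are illustrative assumptions, not documented SGE behaviour.

```python
from pathlib import Path


def read_job_seqnum(path: str) -> int:
    """Return the number stored in the jobseqnum file.

    Assumption: the file contains a single integer, optionally followed
    by whitespace or a newline.
    """
    return int(Path(path).read_text().strip())


def near_wraparound(seqnum: int, max_id: int = 9999999, margin: int = 1000) -> bool:
    """Return True when seqnum is within `margin` of the assumed maximum ID."""
    return max_id - seqnum <= margin


if __name__ == "__main__":
    import os

    # Path as given in this post; "default" is the cell name used here.
    seq_file = os.path.join(
        os.environ.get("SGE_ROOT", ""), "default", "spool", "qmaster", "jobseqnum"
    )
    if os.path.exists(seq_file):
        value = read_job_seqnum(seq_file)
        print(value, "near wraparound" if near_wraparound(value) else "ok")
```

Running this from cron and keeping the output next to the qmaster restart times would show whether the jump to a high value happens before or after a restart.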

Any tips or advice on where to look for the root cause will be greatly 
appreciated.
Thank you.



Yuri Burmachenko | Sr. Engineer | IT | Mellanox Technologies Ltd.
Work: +972 74 7236386 | Cell: +972 54 7542188 | Fax: +972 4 959 3245
Follow us on Twitter<http://twitter.com/mellanoxtech> and 
Facebook<http://www.facebook.com/pages/Mellanox-Technologies/223164879116>

