Hello to the distinguished forum members,

Recently we found that something is wrong with our SGE job IDs: they are being reset very quickly, 6-7 times a month. We do not execute nearly enough jobs in such a short period to explain this.
We use the job ID (via qacct) as a primary key for several home-made analytics tools, and this rapid job ID rollover impairs the reliability of those tools. It started after a full power shutdown, during which we halted all our systems, including the SGE master/shadow and its execution hosts. Perhaps something sets $SGE_ROOT/default/spool/qmaster/jobseqnum to "9999999", and then something (related or not) restarts SGE, which picks up that job ID. Any tips or advice on where to look for the root cause will be greatly appreciated.

Thank you.

Yuri Burmachenko | Sr. Engineer | IT | Mellanox Technologies Ltd.
Work: +972 74 7236386 | Cell +972 54 7542188 | Fax: +972 4 959 3245
Follow us on Twitter<http://twitter.com/mellanoxtech> and Facebook<http://www.facebook.com/pages/Mellanox-Technologies/223164879116>
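
For anyone who wants to pin down exactly when the resets happen, so that they can be matched against qmaster restarts in the logs, here is a minimal sketch that scans accounting records for points where the job ID drops. It is only an illustration under assumptions: the field positions are taken from the colon-separated layout described in accounting(5) and should be checked against your Grid Engine version, submission times are treated as plain epoch values, and the function names are hypothetical.

```python
# Assumed 0-based field positions in the colon-separated accounting file,
# as described in accounting(5); verify them for your Grid Engine version.
JOB_NUMBER_FIELD = 5
SUBMISSION_TIME_FIELD = 8


def parse_accounting(lines):
    """Yield (submission_time, job_number) for each parsable record."""
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split(":")
        try:
            yield int(fields[SUBMISSION_TIME_FIELD]), int(fields[JOB_NUMBER_FIELD])
        except (IndexError, ValueError):
            continue  # skip records that do not match the assumed layout


def find_jobid_resets(lines):
    """Return (submission_time, previous_id, new_id) wherever the job ID drops.

    Records are de-duplicated (array tasks share one job number) and ordered
    by submission time, so a lower job ID after a higher one marks a reset.
    """
    records = sorted(set(parse_accounting(lines)))
    resets = []
    for (_, prev_id), (when, job_id) in zip(records, records[1:]):
        if job_id < prev_id:
            resets.append((when, prev_id, job_id))
    return resets
```

Passing an open handle to the accounting file (usually $SGE_ROOT/default/common/accounting, though that location is an assumption about the installation) to `find_jobid_resets` gives one tuple per reset. The submission times it returns can then be compared with the qmaster messages file to see whether each reset lines up with a restart.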
_______________________________________________
users mailing list
users@gridengine.org
https://gridengine.org/mailman/listinfo/users