- Set SGE_ND in the environment.
- At the shell, run "gdb sge_qmaster", and then type "r" at the gdb prompt.
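
If you would rather capture the backtrace non-interactively, the two steps above can be sketched roughly as the script below. Treat it as a sketch under assumptions, not a tested recipe for your site. It assumes the SGE environment (SGE_ROOT, PATH) is already set up, for example by sourcing the cell's settings.sh, and that the value given to SGE_ND does not matter as long as the variable is set. The output file name qmaster-backtrace.txt is made up for illustration.

```shell
#!/bin/sh
# Sketch: run sge_qmaster in the foreground under gdb and save a backtrace.
# Assumes the SGE environment is already set up and no other qmaster is running.

# Assumption: the daemon only checks that SGE_ND is set; the value 1 is arbitrary.
SGE_ND=1
export SGE_ND

# Resolve the binary from PATH; empty if it is not installed on this machine.
QMASTER=$(command -v sge_qmaster || true)

if [ -n "$QMASTER" ] && command -v gdb >/dev/null 2>&1; then
    # -batch makes gdb exit once the listed commands finish.
    # "run" is the long form of "r"; "bt" prints the backtrace after the crash.
    # qmaster-backtrace.txt is a hypothetical file name.
    gdb -batch -ex run -ex bt "$QMASTER" 2>&1 | tee qmaster-backtrace.txt
else
    echo "sge_qmaster or gdb not found on PATH; set up the SGE environment first" >&2
fi
```

Interactively, the equivalent is to type "r" at the (gdb) prompt, wait for the segfault, and then type "bt".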

Rayson

On Fri, Aug 31, 2012 at 3:33 PM, Bob Tupper <[email protected]> wrote:
> Can you please explain in more detail how to launch with the debugger
> enabled?
>
> Thanks
>
>
> On 08/31/2012 12:25 PM, Rayson Ho wrote:
>>
>> Two things you can try:
>>
>> 1) Run the qmaster under a debugger by setting $SGE_ND, and send us a
>> backtrace of the crash.
>>
>> 2) Try the qmaster binary in a newer release (you don't need to
>> upgrade other parts of your cluster, and don't need to drain the
>> jobs), and if it really is the job report issue, then the newer
>> qmaster should be able to handle the job reports without crashing:
>>
>> http://dl.dropbox.com/u/47200624/respin/ge2011.11.tar.gz
>>
>> Of course, you can compile from source if you want:
>> http://gridscheduler.sourceforge.net/
>>
>> Rayson
>>
>>
>> On Fri, Aug 31, 2012 at 3:20 PM, Bob Tupper <[email protected]> wrote:
>>>
>>> Thanks for your help.
>>> I do have PE defined. But it crashes with just a simple job that just
>>> sleeps.
>>> Crashes every time.
>>> -Bob
>>>
>>>
>>> On 08/31/2012 11:59 AM, Rayson Ho wrote:
>>>>
>>>> Do you have parallel (or PE) jobs in your cluster? A bug in SGE 6.2u5
>>>> can cause the qmaster to seg fault when it receives the job reports
>>>> from parallel jobs.
>>>>
>>>> Rayson
>>>>
>>>>
>>>> On Fri, Aug 31, 2012 at 2:52 PM, Bob Tupper <[email protected]>
>>>> wrote:
>>>>>
>>>>> Greetings,
>>>>>
>>>>> Hope someone can help me out.
>>>>> I have a 6.2u5 install on CentOS 5.x
>>>>>
>>>>> Last night the power company shut us down.
>>>>> This morning I cannot get the sge_qmaster daemon to stay running.
>>>>> If I disable all the queues or shut down all the execution host
>>>>> daemons so jobs cannot run, it will stay up.
>>>>>
>>>>> As soon as I enable them and a job attempts to run, the sge_qmaster
>>>>> daemon crashes. Sometimes the job sends an email error, often not,
>>>>> but it always segfaults.
>>>>>
>>>>> I restored from backup and have the same problem.
>>>>>
>>>>> I have a shadow master and it crashes on both the main and backup
>>>>> masters.
>>>>>
>>>>> I'm at a loss. Any help would be most appreciated.
>>>>>
>>>>> Thanks
>>>>> -Bob
>>>>>
>>>>> _______________________________________________
>>>>> users mailing list
>>>>> [email protected]
>>>>> https://gridengine.org/mailman/listinfo/users
