On Thu, Oct 26, 2006 at 10:25:37AM -0400, Matthew Krauss wrote:
> Hi, I haven't been following this too closely so I may be missing
> something, but this message caught my eye.  I'm not sure how experienced
> you are, so I will try to be very explicit -- if I tell you things you
> think are obvious, please forgive me.

Thanks.  No forgiveness needed.  If anything, I appreciate this level of
detail.  I've found that, when explaining things to others, it's hard
to guess the proper level of detail.

>
> [EMAIL PROTECTED] wrote:
> >On Wed, Oct 25, 2006 at 12:34:02PM -0700, Andrew Sackville-West wrote:
> >
> >>[EMAIL PROTECTED] wrote:
> >>
> >>>Four more reboots, one successful.
> >>>It seems to be a problem starting gdm.
> >>>
> Hmm...  It sounds like a race condition, obviously.

I *thought* that was a possibility.

> From what I have
> read in this thread, I would guess that there is a very good probability
> that you have an old startup script lying around from a package that
> has been otherwise removed or upgraded.

Is there any way to search for such stray files?  There were some bugs
in upgrade scripts a few months ago, when X underwent two revolutions
in a row.

> >>it could be an X problem or a gdm problem, but probably I'd guess X.
> >>
> >>>It tells me it's starting gdm,
> >>>then that it's not starting kdm because it's not the default,
> >>>then that it's not starting (presumably another *dm) because it's
> >>>not the default
> >>>
> >>that's normal. X does sanity checks to make sure you're not starting more
> >>than one session manager or whatever.
> >>
> >I know that.  I just thought that the last message before the crash
> >might be a clue to what went wrong -- such as an unfortunate race
> >condition between gdm and whatever thing decides not to start the other.
> >But I admit this is unlikely.

I thought it unlikely because, as far as I know, these *dm startup
scripts check whether they are default *before* they start anything up.

> >
> >>>then the black screen of death, preventing me from reading which other
> >>>*dm it was considering.
> >>>
> >>are you locked up hard at that point or can you switch to a vt?
> >>ctrl-alt-fx?
> >>
> >
> >Locked up hard.  Though I suppose I should try ssh-ing in.
> >
> When you say "black screen of death" I assume you mean a kernel panic?

The screen goes completely black.  No text visible.  If I recall
correctly, a kernel panic usually puts a kernel panic message on the
bottom of the screen.  But of course, perhaps it's not displaying the
kernel logging screen when it dies.

> If so, ssh-ing won't work.

Therefore worth a try.  It would give us a further clue whether it
might be a kernel panic.

> Also, notably, a kernel panic should *never*
> happen (theoretically!) -- it is always the result of either a kernel
> bug or a hardware failure.  No user-space program should be able to
> cause a kernel panic.
>
> What I would try to isolate the problem is:
>
> 1. Reboot into single user mode.

Which I do by specifying "etch 1" at the lilo boot prompt.  It works.
The on-screen messages call it maintenance mode, though.  I presume
that's the same mode.

> 2. Log in as root.

That works.  Will do the rest later in the day when my users are gone.

> 3. Try starting X alone:
>     $ X 2>&1 | less
>   3a. If X starts, you may kill it with ctrl-alt-backspace;
>   3b. If X does not start, you have the output to debug;
>   3c. If you get a kernel panic, you know you have serious X problems.
> 4. Next try starting gdm directly:
>     $ /etc/init.d/gdm start
>   4a. If gdm starts, there is probably a problem in your startup scripts;
>   4b. If gdm does not start, you can check the logs under /var/log/gdm/
>   4c. If you get a kernel panic, you know you have serious gdm problems.
>
> In the case of 4a., where you have a problem in your startup scripts:
>
> 5. Kill gdm -- use ctrl-alt-F1 to return to your terminal, and issue:
>     $ /etc/init.d/gdm stop
> 6. Switch to the default runlevel's rc directory and ls it:
>     $ cd /etc/rc2.d
>     $ ls
>   See all the links named S##*
>     .. where ## is a number
>     .. and * is the rest of the name?
>   At startup, these are all started in the order of the ## numbers.
>   Scripts with the same number as gdm start at the same time.
>   These are good candidates for a race condition.
>   For instance, I have:
>     S99gdm
>     S99rc.local
>     S99rmnologin
>     S99stop-bootlogd
>   You probably have all of these, plus:
>     S99xdm
>     S99kdm
>     .. and others?
> 7. Try starting up the scripts with the same number as gdm in various
> orders.  Consider which ones sound likely to be the problem.  For
> instance, you have guessed that another *dm is your problem, so try
> starting first xdm and then gdm, then the other way around.  If you make
> a crash, congratulations!
>
> Oh, to start a script, e.g. S99gdm, use:
>     $ ./S99gdm start
>
> S99rc.local actually runs /etc/rc.local which might have anything in it,
> so that is worth looking into.  You should probably look at
> /etc/rc.local and see what it is doing.
>
> Scripts with other numbers are possible too -- just less likely -- so
> you may want to try them if you don't find the problem in the "good
> candidates" first.
>
> Hopefully helpful,

I think it will be, when I get the machine to myself again.

>
> Matthew
>
> >
> >>>Could it be that the *dm is interfering with gdm starting up?
> >>>Maybe it's whatever it does *after* trying its hand with the *dm's
> >>>  that is the culprit?  Anyone know what that is?
> >>>Should I try making another *dm the default?
> >>>Should I try purging the other *dm's?
> >>>Should I try purging gdm?
> >>>Should I try running a general update of everything just in case?
> >>>
> >>>
> >>as Andre said, /etc/init.d/gdm stop.
> >>
> >>then I'd get rid of the links for the moment so you can actually work on
> >>the thing: update-rc.d -f gdm remove && update-rc.d -f kdm remove and so
> >>forth.  Then you can use startx as a user and see what happens.
> >>
> >
> >Might be easier just to do this in maintenance mode, which doesn't start
> >the things in the first place.
> >
> >There's a point -- in the two-Debian philosophy of system maintenance,
> >is there any way of using, say, aptitude running on one system to
> >install, uninstall, configure and so forth the other?
> >It suddenly struck me as potentially useful.  Doesn't the installer do
> >something like this, starting from a RAMdisk?
> >
> >-- hendrik
> >
>
> --
> To UNSUBSCRIBE, email to [EMAIL PROTECTED]
> with a subject of "unsubscribe". Trouble? Contact
> [EMAIL PROTECTED]
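
For step 6 of the quoted procedure, picking out the links that share gdm's sequence number can be scripted rather than read off the ls output by eye. What follows is only a rough sketch: the function name same_seq_links is made up for illustration, and it assumes the usual S##name naming of the start links in the rc directory. It only lists the candidates; it cannot tell whether any of them actually races with gdm.

```shell
#!/bin/sh
# same_seq_links (hypothetical helper): print the start links in an rc
# directory that carry the same two-digit sequence number as a service.
# Arguments: rc directory (default /etc/rc2.d), service name (default gdm).
same_seq_links() {
    rcdir=${1:-/etc/rc2.d}
    service=${2:-gdm}
    for link in "$rcdir"/S[0-9][0-9]"$service"; do
        # An unmatched glob stays as literal text, so skip paths that
        # do not exist (also keep dangling symlinks via -L).
        [ -e "$link" ] || [ -L "$link" ] || continue
        name=${link##*/}          # e.g. S99gdm
        seq=${name#S}             # e.g. 99gdm
        seq=${seq%"$service"}     # e.g. 99
        for peer in "$rcdir"/S"$seq"*; do
            peername=${peer##*/}
            # Print every link with that number except the service itself.
            [ "$peername" = "$name" ] || printf '%s\n' "$peername"
        done
    done
}
```

Called as `same_seq_links /etc/rc2.d gdm`, it prints one link name per line, which gives the list of scripts to try in various orders in step 7.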