On 2024-07-23 09:17, Jesse Smith wrote:
On 2024-07-11 05:44, Sven Reschke wrote:

I have a rented root server on which I run a Yocto build that uses sysvinit. After logging in, the /var/log/boot file is empty and the bootlogd process is using up to 95% CPU. The process also doesn't react to SIGINT.

After some debugging, I saw that the process is trying to open the /dev/ttyS0 device, but this fails every time with EIO. There is no limit on the number of retries after an EIO error, so the process is stuck trying to open the unavailable serial console...

I guess the serial console shouldn't return EIO in the first place (at least not all the time), but on the other hand, in my humble opinion, the process should be able to handle this situation.

I am attaching an updated bootlogd.c source file which I think should avoid the issue. The original endless loop doesn't make sense to me. We make one attempt to re-open the file, but if that doesn't work we should probably assume it's not going to become available in the future. So we'll just drop out of the function.

I'm open to thoughts or other ways to improve this situation. (Maybe a maximum of 99 attempts with a sleep() call?)

Sorry for the late response, vacation time... I tested your new bootlogd.c file, but sadly, bootlogd is still trapped in an endless loop :-(

My two cents here: the re-opening of the device will still succeed every time, i.e. the call to open_nb(realcons) in write_err().

If I understand correctly, you're saying the device is opened for writing successfully. However, attempts to write to the device always fail, which drops us into the loop?
I did some debugging. I've attached the patch I applied to the bootlogd.c source file of sysvinit 3.04. It includes your changes plus additional debug printfs.

The process is started with the following command in the init.d script:

    nohup stdbuf -o0 -e0 $DAEMON -r -c -d > /tmp/bootlogd.log 2>&1 &

The log file begins with the following lines, and the TEST1 to TEST4.3 block repeats "forever":

    TEST
    TEST
    TEST
    TEST1
    TEST2 -1
    TEST3 /dev/ttyS0
    TEST4.2
    TEST4.3
    TEST1
    TEST2 -1
    TEST3 /dev/ttyS0
    TEST4.2
    TEST4.3
    TEST1
    TEST2 -1
    TEST3 /dev/ttyS0
    TEST4.2
    TEST4.3
    TEST1
    TEST2 -1
    TEST3 /dev/ttyS0
    TEST4.2
    TEST4.3

So I guess your assumption is correct: opening the device succeeds, but writing always fails.

P.S. Sorry for the spam... I don't get the list emails in my mailbox, so I cannot simply reply but have to create the mails manually. Because of a delay, I feared that the messages were not correctly formatted, so I answered 4 times with slightly different message headers, but in the end all of them got delivered -_-
diff -uraN src/bootlogd.c src_new/bootlogd.c
--- src/bootlogd.c	2022-04-26 19:37:34.000000000 +0200
+++ src_new/bootlogd.c	2024-07-23 21:35:56.517142060 +0200
@@ -519,16 +519,18 @@
 	int	fd;
 
 	if (e != EIO) {
-werr:
+		printf("TEST4.1\n");
 		close(pts);
 		fprintf(stderr, "bootlogd: writing to console: %s\n",
 			strerror(e));
 		return -1;
 	}
+	printf("TEST4.2\n");
 	close(realfd);
 	if ((fd = open_nb(realcons)) < 0)
-		goto werr;
+		return -1;
 
+	printf("TEST4.3\n");
 	return fd;
 }
 
@@ -700,7 +702,7 @@
 	 *	to the real console and the logfile.
 	 */
 	while (!got_signal) {
-
+		printf("TEST\n");
 		/*
 		 *	We timeout after 5 seconds if we still need to
 		 *	open the logfile. There might be buffered messages
@@ -724,7 +726,9 @@
 					m = n;
 					p = inptr;
 					while (m > 0) {
+						printf("TEST1\n");
 						i = write(cons[considx].fd, p, m);
+						printf("TEST2 %d\n", i);
 						if (i >= 0) {
 							m -= i;
 							p += i;
@@ -734,10 +738,12 @@
 						 *	Handle EIO (somebody hung
 						 *	up our filedescriptor)
 						 */
+						printf("TEST3 %s\n", cons[considx].name);
 						cons[considx].fd = write_err(pts,
 							cons[considx].fd,
 							cons[considx].name, errno);
 						if (cons[considx].fd >= 0) continue;
+						printf("TEST4\n");
 						/*
 						 *	If this was the last console,
 						 *	generate a fake signal
@@ -759,6 +765,7 @@
 					inptr = ringbuf;
 				if (outptr >= endptr)
 					outptr = ringbuf;
+				printf("TEST5\n");
 			} /* end of got data from read */
 		} /* end of checking select for new data */
 
@@ -797,3 +804,4 @@
 
 	return 0;
 }
+
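
For reference, the capped-retry idea floated earlier in the thread (a maximum of 99 attempts with a sleep() call) could look roughly like the sketch below. It is not taken from bootlogd.c: write_bounded(), MAX_EIO_RETRIES, the write_fn/reopen_fn hook types and the stub_* helpers are all hypothetical names introduced here for illustration. The stubs only imitate the behaviour seen in the log (re-open succeeds, every write fails with EIO) so the loop can be exercised without a real serial console.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* for nanosleep() */
#endif

#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <time.h>

/* Hypothetical cap, following the "99 attempts" suggestion in the thread. */
#define MAX_EIO_RETRIES 99

/* Hypothetical hooks so the loop can be tested without a real tty. */
typedef ssize_t (*write_fn)(int fd, const void *buf, size_t len);
typedef int (*reopen_fn)(int old_fd, const char *name);

/*
 * Write len bytes to fd. On EIO the console is re-opened and the write is
 * retried, but only MAX_EIO_RETRIES times in a row without progress.
 * Returns the (possibly new) fd on success, or -1 once the console is
 * given up (non-EIO error, failed re-open, or retry budget exhausted).
 */
static int write_bounded(int fd, const char *name, const char *p, size_t len,
                         write_fn do_write, reopen_fn do_reopen, long delay_ms)
{
    int stalls = 0;   /* consecutive attempts that wrote nothing */

    while (len > 0) {
        ssize_t n = do_write(fd, p, len);

        if (n > 0) {
            p += n;
            len -= (size_t)n;
            stalls = 0;               /* progress made, reset the budget */
            continue;
        }
        if (n < 0 && errno != EIO)
            return -1;                /* only EIO is worth retrying */
        if (++stalls > MAX_EIO_RETRIES)
            return -1;                /* stop instead of spinning forever */
        if (n < 0) {
            fd = do_reopen(fd, name);
            if (fd < 0)
                return -1;
        }
        if (delay_ms > 0) {           /* back off so we do not burn CPU */
            struct timespec ts;
            ts.tv_sec = delay_ms / 1000;
            ts.tv_nsec = (delay_ms % 1000) * 1000000L;
            nanosleep(&ts, NULL);
        }
    }
    return fd;
}

/* Stubs imitating the reported situation: open works, write never does. */
static int stub_write_calls;

static ssize_t stub_write_eio(int fd, const void *buf, size_t len)
{
    (void)fd; (void)buf; (void)len;
    stub_write_calls++;
    errno = EIO;
    return -1;
}

static ssize_t stub_write_ok(int fd, const void *buf, size_t len)
{
    (void)fd; (void)buf;
    stub_write_calls++;
    return (ssize_t)len;
}

static int stub_reopen_ok(int old_fd, const char *name)
{
    (void)name;
    return old_fd;                    /* pretend the re-open succeeded */
}
```

With the always-failing stub, write_bounded(3, "/dev/ttyS0", "hello", 5, stub_write_eio, stub_reopen_ok, 0) gives up and returns -1 after MAX_EIO_RETRIES + 1 write attempts rather than spinning. In bootlogd itself the hooks would presumably map to write() and the existing close()/open_nb() pair, but that wiring is an assumption and untested against the real source.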