Re: Antw: [EXT] [systemd-devel] [SPECIFICATION RFC] The firmware and bootloader log specification
On 16/11/2020 08.02, Ulrich Windl wrote:
> Daniel Kiper wrote on 14.11.2020 at 00:52 in
> message <20201113235242.k6fzlwmwm2xqh...@tomti.i.net-space.pl>:
> ...
>> The members of struct bf_log_msg:
>> - size: total size of bf_log_msg struct,
>> - ts_nsec: timestamp expressed in nanoseconds starting from 0,
>
> Who or what defines t == 0?

Some sort of "clapperboard" log entry, stating "the RTC says X, the
cycle counter is Y, the onboard ACME atomic clock says Z, I'm now
starting to count ts_nsec from W" might be useful for some eventual
userspace tool to try to stitch together the log entries from the
various stages.

I have no idea what a formal spec of such an entry would look like, or
if it's even feasible to do formally. But even just such entries in
free-form prose could at least help a human consumer.

Rasmus
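To make the "clapperboard" idea a bit more concrete, here is a rough sketch of what such an entry could carry and how a userspace tool might use it. Everything in it is an assumption for illustration: struct clapperboard, its fields and ts_to_rtc_nsec() are hypothetical names and are not part of the RFC.

```c
#include <stdint.h>

/*
 * Hypothetical "clapperboard" record; none of these fields are in the RFC.
 * It pins a stage's local ts_nsec clock to other clocks that were sampled
 * at the same instant.
 */
struct clapperboard {
	uint64_t rtc_nsec;	/* RTC reading, in ns since the Unix epoch */
	uint64_t cycles;	/* raw cycle counter at the same instant */
	uint64_t ts_nsec;	/* the stage's own ts_nsec at that instant */
};

/*
 * Map a stage-local ts_nsec onto the RTC timeline using one clapperboard.
 * Unsigned arithmetic keeps this correct for entries logged slightly
 * before the clapperboard as well as after it.
 */
uint64_t ts_to_rtc_nsec(const struct clapperboard *cb, uint64_t ts_nsec)
{
	return cb->rtc_nsec - cb->ts_nsec + ts_nsec;
}
```

With one such record per stage, entries from different stages could be sorted on the common RTC timeline, within whatever accuracy the RTC offers.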
Re: Linux 5.13-rc6 regression to 5.12.x: kernel OOM and panic during kernel boot in low memory Xen VM's (256MB assigned memory).
On 17/06/2021 17.01, Linus Torvalds wrote:
> On Thu, Jun 17, 2021 at 2:26 AM Sander Eikelenboom wrote:
>>
>> I just tried to upgrade and test the linux kernel going from the 5.12 kernel
>> series to 5.13-rc6 on my homeserver with Xen, but ran in some trouble.
>>
>> Some VM's boot fine (with more than 256MB memory assigned), but the smaller
>> (memory wise) PVH ones crash during kernel boot due to OOM.
>> Booting VM's with 5.12(.9) kernel still works fine, also when dom0 is
>> running 5.13-rc6 (but it has more memory assigned, so that is not
>> unexpected).
>
> Adding Rasmus to the cc, because this looks kind of like the async
> rootfs population thing that caused some other oom issues too.

Yes, that looks like the same issue.

> Rasmus? Original report here:
>
>    https://lore.kernel.org/lkml/ee8bf04c-6e55-1d9b-7bdb-25e6108e8...@eikelenboom.it/
>
> I do find it odd that we'd be running out of memory so early..

Indeed. It would be nice to know if these also reproduce with
initramfs_async=0 on the command line.

But what is even more curious is that in the other report
(https://lore.kernel.org/lkml/20210607144419.GA23706@xsang-OptiPlex-9020/),
it seemed to trigger with _more_ memory - though I may be misreading
what Oliver was telling me:

> please be noted that we use 'vmalloc=512M' for both parent and this commit.
> since it's ok on parent but oom on this commit, we want to send this report
> to show the potential problem of the commit on some cases.
>
> we also tested by changing to use 'vmalloc=128M', it will succeed.

Those tests were done in a VM with 16G memory, and then he also wrote

> we also tried to follow exactly above steps to test on
> some local machine (8G memory), but cannot reproduce.

Are there some special rules for what memory pools PID1 versus the
kworker threads can dip into?

Side note: I also had a ppc64 report with different symptoms (the
initramfs was corrupted), but that turned out to also reproduce with
e7cb072eb98 reverted, so that is likely unrelated. But just FTR that
thread is here:

https://lore.kernel.org/lkml/CA+QYu4qxf2CYe2gC6EYnOHXPKS-+cEXL=mnuvqrfan7w1i6...@mail.gmail.com/

Rasmus
Re: Linux 5.13-rc6 regression to 5.12.x: kernel OOM and panic during kernel boot in low memory Xen VM's (256MB assigned memory).
On 18/06/2021 03.06, Sander Eikelenboom wrote:
> On 17/06/2021 21:39, Sander Eikelenboom wrote:
>
> OK, done some experimentation and it seems with 256M assigned to the VM
> it was almost at the edge of OOM with the 5.12 kernel as well in the
> config I am using it.
> With v5.12 when I assign 240M it boots, with 230M it doesn't. With 5.13
> the tipping point seems to be around 265M and 270M, so my config was
> already quite close to the edge.
>
> The "direct kernel boot" feature I'm using just seems somewhat memory
> hungry, but using another compression algorithm for the kernel and
> initramfs already helped in my case.
>
> So sorry for the noise, clearly user-error.

Hm, perhaps, but I'm still a bit nervous about that report from Oliver
Sang/kernel test robot, which was for a VM equipped with 16G of memory.
But despite quite a few attempts, I haven't been able to reproduce that
locally, so unfortunately I have no idea what's going on.

Rasmus
Re: [PATCH v3 2/4] vsscanf(): Integer overflow is a conversion failure
On 10/06/2023 22.40, Demi Marie Obenour wrote:
> sscanf() and friends currently ignore integer overflow, but this is a
> bad idea. It is much better to detect integer overflow errors and
> consider this a conversion failure.

Perhaps. And maybe I even agree. But not like this:

>  	while (*fmt) {
>  		/* skip any white space in format */
> @@ -3464,6 +3474,9 @@ int vsscanf(const char *buf, const char *fmt, va_list args)
>  			break;
>  		++fmt;
>
> +		allow_overflow = *fmt == '!';
> +		fmt += (int)allow_overflow;
> +

You can't do that. Or, at least, you won't be able to actually use %!d
anywhere, because the compiler will yell at you:

lib/vsprintf.c: In function ‘foobar’:
lib/vsprintf.c:3727:26: error: unknown conversion type character ‘!’ in format [-Werror=format=]
 3727 | ret = sscanf("12345", "%!d", &val);
      |                         ^

So NAK.

Also, when you make significant changes to the sscanf implementation,
I'd expect the diffstat for the patch series to contain lib/test_scanf.c.

Rasmus
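For reference, the behaviour the patch description asks for (overflow counts as a conversion failure) can be illustrated without any new format syntax. This is only a userspace sketch built on strtol(); parse_int_checked() is a hypothetical helper name, and this is not how lib/vsprintf.c is or would be implemented.

```c
#include <errno.h>
#include <limits.h>
#include <stdbool.h>
#include <stdlib.h>

/*
 * Hypothetical helper (not part of lib/vsprintf.c): parse a decimal int
 * from the start of s and treat overflow as a conversion failure.
 * Returns true and stores the value in *out on success.
 */
bool parse_int_checked(const char *s, int *out)
{
	char *end;
	long v;

	errno = 0;
	v = strtol(s, &end, 10);
	if (end == s)
		return false;	/* no digits were consumed */
	if (errno == ERANGE || v < INT_MIN || v > INT_MAX)
		return false;	/* value does not fit in an int */
	*out = (int)v;
	return true;
}
```

A caller that does want the old wrapping behaviour would simply keep using a separate, unchecked path, so nothing here needs a format character that -Wformat does not know about.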
Re: [PATCH v3 3/4] vsscanf(): do not skip spaces
On 10/06/2023 22.40, Demi Marie Obenour wrote:
> Passing spaces before e.g. an integer is usually
> not intended.

Maybe, maybe not. But it's mandated by POSIX/C99.

And of course we are free to ignore that and implement our own semantics
(though within the constraints that we really want -Wformat to help us),
but there seems to be existing code in-tree that relies on this
behavior. For example I think this will break
fsl_sata_intr_coalescing_store() which uses a scanf format of "%u%u".

Sure, that could just say "%u %u" instead, but the point is that
currently it doesn't. So without some reasonably thorough analysis
across the tree, and updates of affected callers, NAK.

Rasmus
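To illustrate the POSIX/C99 behaviour that such a caller depends on, here is a small sketch using the userspace C library's sscanf() as a stand-in for the kernel's vsscanf(). parse_two() is a hypothetical wrapper; only the "%u%u" format is taken from the example above.

```c
#include <stdio.h>

/*
 * Hypothetical wrapper: parse two unsigned values with the format "%u%u".
 * Per C99/POSIX, each %u skips leading white space in the input, so the
 * second conversion succeeds even though the format contains no space.
 * Returns the number of successful conversions.
 */
int parse_two(const char *buf, unsigned int *a, unsigned int *b)
{
	return sscanf(buf, "%u%u", a, b);
}
```

For input such as "10 20" this yields two conversions. If the second %u stopped skipping white space, it would see the space, fail, and leave the caller with only one conversion.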