Re: Antw: [EXT] [systemd-devel] [SPECIFICATION RFC] The firmware and bootloader log specification

2020-11-16 Thread Rasmus Villemoes
On 16/11/2020 08.02, Ulrich Windl wrote:
> Daniel Kiper wrote on 14.11.2020 at 00:52 in
> message <20201113235242.k6fzlwmwm2xqh...@tomti.i.net-space.pl>:
> ...
>> The members of struct bf_log_msg:
>>   ‑ size: total size of bf_log_msg struct,
>>   ‑ ts_nsec: timestamp expressed in nanoseconds starting from 0,
> 
> Who or what defines t == 0?

Some sort of "clapperboard" log entry, stating "the RTC says X, the
cycle counter is Y, the onboard ACME atomic clock says Z, I'm now
starting to count ts_nsec from W" might be useful for some eventual
userspace tool to try to stitch together the log entries from the
various stages. I have no idea what a formal spec of such an entry would
look like or if it's even feasible to do formally. But even just such
entries in free-form prose could at least help a human consumer.

Rasmus



Re: Linux 5.13-rc6 regression to 5.12.x: kernel OOM and panic during kernel boot in low memory Xen VM's (256MB assigned memory).

2021-06-17 Thread Rasmus Villemoes
On 17/06/2021 17.01, Linus Torvalds wrote:
> On Thu, Jun 17, 2021 at 2:26 AM Sander Eikelenboom  
> wrote:
>>
>> I just tried to upgrade and test the linux kernel going from the 5.12 kernel 
>> series to 5.13-rc6 on my homeserver with Xen, but ran into some trouble.
>>
>> Some VM's boot fine (with more than 256MB memory assigned), but the smaller 
>> (memory wise) PVH ones crash during kernel boot due to OOM.
>> Booting VM's with 5.12(.9) kernel still works fine, also when dom0 is 
>> running 5.13-rc6 (but it has more memory assigned, so that is not 
>> unexpected).
> 
> Adding Rasmus to the cc, because this looks kind of like the async
> rootfs population thing that caused some other oom issues too.

Yes, that looks like the same issue.

> Rasmus? Original report here:
> 
>
> https://lore.kernel.org/lkml/ee8bf04c-6e55-1d9b-7bdb-25e6108e8...@eikelenboom.it/
> 
> I do find it odd that we'd be running out of memory so early..

Indeed. It would be nice to know if these also reproduce with
initramfs_async=0 on the command line.

But what is even more curious is that in the other report
(https://lore.kernel.org/lkml/20210607144419.GA23706@xsang-OptiPlex-9020/),
it seemed to trigger with _more_ memory - though I may be misreading
what Oliver was telling me:

> please be noted that we use 'vmalloc=512M' for both parent and this commit.
> since it's ok on parent but oom on this commit, we want to send this report
> to show the potential problem of the commit on some cases.
>
> we also tested by changing to use 'vmalloc=128M', it will succeed.

Those tests were done in a VM with 16G memory, and then he also wrote

> we also tried to follow exactly above steps to test on
> some local machine (8G memory), but cannot reproduce.

Are there some special rules for what memory pools PID1 versus the
kworker threads can dip into?


Side note: I also had a ppc64 report with different symptoms (the
initramfs was corrupted), but that turned out to also reproduce with
e7cb072eb98 reverted, so that is likely unrelated. But just FTR that
thread is here:
https://lore.kernel.org/lkml/CA+QYu4qxf2CYe2gC6EYnOHXPKS-+cEXL=mnuvqrfan7w1i6...@mail.gmail.com/

Rasmus



Re: Linux 5.13-rc6 regression to 5.12.x: kernel OOM and panic during kernel boot in low memory Xen VM's (256MB assigned memory).

2021-06-21 Thread Rasmus Villemoes
On 18/06/2021 03.06, Sander Eikelenboom wrote:
> On 17/06/2021 21:39, Sander Eikelenboom wrote:

> 
> OK, done some experimentation and it seems with 256M assigned to the VM
> it was almost at the edge of OOM with the 5.12 kernel as well in the
> config I am using it.
> With v5.12 when I assign 240M it boots, with 230M it doesn't. With 5.13
> the tipping point seems to be around 265M and 270M, so my config was
> already quite close to the edge.
> 
> The "direct kernel boot" feature I'm using just seems somewhat memory
> hungry, but using another compression algorithm for the kernel and
> initramfs already helped in my case.
> 
> So sorry for the noise, clearly user-error.

Hm, perhaps, but I'm still a bit nervous about that report from Oliver
Sang/kernel test robot, which was for a VM equipped with 16G of memory.
But despite quite a few attempts, I haven't been able to reproduce that
locally, so unfortunately I have no idea what's going on.

Rasmus



Re: [PATCH v3 2/4] vsscanf(): Integer overflow is a conversion failure

2023-06-12 Thread Rasmus Villemoes
On 10/06/2023 22.40, Demi Marie Obenour wrote:
> sscanf() and friends currently ignore integer overflow, but this is a
> bad idea.  It is much better to detect integer overflow errors and
> consider this a conversion failure.

Perhaps. And maybe I even agree. But not like this:

>   while (*fmt) {
>   /* skip any white space in format */
> @@ -3464,6 +3474,9 @@ int vsscanf(const char *buf, const char *fmt, va_list args)
>   break;
>   ++fmt;
>  
> + allow_overflow = *fmt == '!';
> + fmt += (int)allow_overflow;
> +

You can't do that. Or, at least, you won't be able to actually use %!d
anywhere, because the compiler will yell at you:

lib/vsprintf.c: In function ‘foobar’:
lib/vsprintf.c:3727:26: error: unknown conversion type character ‘!’ in format [-Werror=format=]
 3727 |  ret = sscanf("12345", "%!d", &val);
  |  ^

So NAK.
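
For comparison, a caller that wants overflow reported can already get that without any new format syntax. The following is a userspace sketch only, not a proposal for lib/vsprintf.c: the helper name is hypothetical, and it uses strtol() plus errno, in roughly the spirit of the kernel's kstrto*() helpers, which return -ERANGE on overflow.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/*
 * Hypothetical helper: parse a decimal int from the whole string.
 * Returns 0 on success, -ERANGE on overflow, -EINVAL on malformed input.
 * Note that strtol() itself skips leading white space.
 */
static int parse_int_checked(const char *s, int *out)
{
	char *end;
	long v;

	assert(s != NULL && out != NULL);
	errno = 0;
	v = strtol(s, &end, 10);
	if (end == s || *end != '\0')
		return -EINVAL;		/* no digits, or trailing garbage */
	if (errno == ERANGE || v < INT_MIN || v > INT_MAX)
		return -ERANGE;		/* does not fit in an int */
	*out = (int)v;
	return 0;
}
```

Because no format string is involved, -Wformat has nothing to object to.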

Also, when you make significant changes to the sscanf implementation,
I'd expect the diffstat for the patch series to contain lib/test_scanf.c.

Rasmus



Re: [PATCH v3 3/4] vsscanf(): do not skip spaces

2023-06-12 Thread Rasmus Villemoes
On 10/06/2023 22.40, Demi Marie Obenour wrote:
> Passing spaces before e.g. an integer is usually
> not intended. 

Maybe, maybe not. But it's mandated by POSIX/C99.

And of course we are free to ignore that and implement our own semantics
(though within the constraints that we really want -Wformat to help us),
but there seems to be existing code in-tree that relies on this
behavior. For example I think this will break
fsl_sata_intr_coalescing_store() which uses a scanf format of "%u%u".

Sure, that could just say "%u %u" instead, but the point is that
currently it doesn't. So without some reasonably thorough analysis
across the tree, and updates of affected callers, NAK.
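
To make the dependency concrete, here is a small userspace sketch of the standard behavior that a "%u%u" format relies on. The wrapper name is made up for illustration; it is not the code in fsl_sata_intr_coalescing_store().

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical wrapper around the "%u%u" pattern. Per C99/POSIX, each %u
 * skips leading white space, so the space between the two numbers is
 * consumed by the second conversion.
 */
static int parse_two_uints(const char *buf, unsigned int *a, unsigned int *b)
{
	assert(buf != NULL && a != NULL && b != NULL);
	return sscanf(buf, "%u%u", a, b);	/* number of successful conversions */
}
```

With standard semantics, parse_two_uints("3 7", &a, &b) converts both values. If %u stopped skipping white space, the second conversion would fail on the space and only one value would be stored.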

Rasmus