On Wed, Jan 18, 2023, at 1:18 PM, Bruno Wolff III wrote:
> On Tue, Jan 17, 2023 at 18:37:20 -0600,
>   Bruno Wolff III <br...@wolff.to> wrote:
>>Today one of three machines failed to boot 6.2.0-0.rc4. The machine 
>>that failed had not been rebooted in a week and now won't boot any 
>>kernel and it appears grub is aborting with a pointer out of range 
>>error. All three machines use ext4 and luks, but only the failing 
>>machine uses mdraid. I haven't recovered the failing machine yet, but 
>>plan to downgrade grub tomorrow and hope that confirms a grub bug by 
>>allowing it to boot. If so, I'll file a bug report.
>
> Running grub2-install in a live image fixed this. 

Is the system firmware BIOS or UEFI? Either way it's kinda confusing...
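
For what it's worth, a quick way to answer the firmware question from a running 
system is to check whether the kernel exposes /sys/firmware/efi. Below is a 
minimal Python sketch of that check; firmware_type is just a name I made up for 
illustration. Note that when run from a live image it reports how the live 
image was booted, which is not necessarily how the installed system boots.

```python
import os


def firmware_type(sysfs_efi_dir="/sys/firmware/efi"):
    """Return "UEFI" if the given sysfs EFI directory exists, else "BIOS".

    The kernel only creates /sys/firmware/efi when it was booted via UEFI.
    The parameter exists only so the check can be pointed at another path.
    """
    return "UEFI" if os.path.isdir(sysfs_efi_dir) else "BIOS"
```

Calling firmware_type() with no arguments checks the running system.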

On BIOS, updating the GRUB RPM does not rewrite the embedded core.img (in the 
MBR gap or the GPT BIOS Boot partition), nor does it touch any of the modules 
in /boot/grub2. So when the RPM version changes, the GRUB that actually boots 
doesn't change. A grub bug triggering boot failure on BIOS is ... not expected.

On UEFI, updating the GRUB RPM does replace the grubx64.efi OSLoader found on 
the EFI System partition, so a grub bug could trigger boot failure here. But 
I'm surprised grub2-install even executes. It's not recommended on UEFI because 
of well-known issues: the EFI file it produces behaves rather differently from 
the EFI file built in Fedora infra and included in the GRUB RPM.

So either way it's kind of a weird result. I will speculate: if UEFI, the 
problem could be that /boot/grub2 contained an older version of some modules 
that your specific use case requires. (In the typical Fedora case, modules 
aren't installed in either /usr or the EFI System partition on UEFI systems; 
everything needed is baked into the pre-built grubx64.efi OSLoader.) Updating 
the GRUB RPM installed a new grubx64.efi but did not update the modules in 
/boot/grub2, and that mismatch could cause a problem. grub2-install would 
resolve it because, if the command does execute, it brings grubx64.efi and the 
modules in /boot/grub2 to the same version. But again, on UEFI grub2-install 
really should not be used. It's not a great situation, but right now there is 
no agreement between upstream GRUB and the distros using Secure Boot on how 
grub-install should behave any differently than it does today.
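
If you want to test that speculation, here is a rough Python sketch that flags 
the case where module files under /boot/grub2 are all older than the EFI 
binary. It is only a heuristic: it compares file modification times, not real 
GRUB versions, and the function names are hypothetical. The paths you would 
pass in (/boot/efi/EFI/fedora/grubx64.efi and /boot/grub2) are my assumption 
about a default Fedora layout, and reading them normally requires root.

```python
import os


def find_modules(grub_dir):
    """Collect *.mod files from the subdirectories of grub_dir.

    GRUB keeps modules in a per-platform subdirectory such as x86_64-efi
    or i386-pc; other subdirectories (fonts, themes) simply contribute nothing.
    """
    mods = []
    if not os.path.isdir(grub_dir):
        return mods
    for entry in sorted(os.listdir(grub_dir)):
        sub = os.path.join(grub_dir, entry)
        if os.path.isdir(sub):
            for name in sorted(os.listdir(sub)):
                if name.endswith(".mod"):
                    mods.append(os.path.join(sub, name))
    return mods


def modules_look_stale(efi_binary, grub_dir):
    """Return True if modules exist and every one is older than efi_binary.

    Heuristic only: modification times stand in for versions here.
    Returns False when there are no modules or the EFI binary is missing.
    """
    mods = find_modules(grub_dir)
    if not mods or not os.path.exists(efi_binary):
        return False
    newest_module = max(os.path.getmtime(m) for m in mods)
    return newest_module < os.path.getmtime(efi_binary)
```

A True result would be consistent with the mismatch described above, but it 
doesn't prove it; a False result with no modules present matches the typical 
Fedora case.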

I pretty much see two options. If you want to use Fedora's grub package, you 
should "reset" it by following these instructions:
https://fedoraproject.org/wiki/GRUB_2#Instructions_for_UEFI-based_systems

If you have a special use case for GRUB that Fedora's package doesn't meet: 
(a) I'd file an RFE bug in RHBZ explaining that use case, so the bootloader 
team is at least aware of it, and (b) I'd build GRUB from upstream source, so 
that you can use grub-install as expected. (It won't work out of the box with 
Secure Boot enabled and Fedora's shim, but I assume you're not using Secure 
Boot or else grub2-install wouldn't have fixed the problem.)

But after writing all that, maybe UEFI doesn't apply to your use case :D


> I'm not sure if this is a bug or if people are expected to do this 
> themselves (before rebooting) 

On BIOS, yes, it's expected to require manual user intervention to update the 
embedded GRUB binary and its modules found on the /boot volume. 
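
Since the failing machine uses mdraid, the manual step on BIOS would presumably 
need to cover every member disk of the array holding /boot. Here is a dry-run 
Python sketch that reads /proc/mdstat-style text and lists the grub2-install 
commands one might run. It only builds the commands, it never executes them. 
The function names are hypothetical, and the partition-to-disk mapping (sda1 to 
sda, nvme0n1p2 to nvme0n1) is a naming heuristic I'm assuming, not something 
GRUB or mdadm provides.

```python
import re


def whole_disk(member):
    """Map a partition name to its parent disk by naming convention only."""
    m = re.match(r"^(nvme\d+n\d+|mmcblk\d+)p\d+$", member)
    if m:
        return m.group(1)
    m = re.match(r"^([a-z]+)\d+$", member)
    if m:
        return m.group(1)
    return member  # already looks like a whole disk


def member_disks(mdstat_text, array):
    """Return sorted /dev paths of the disks backing the named md array."""
    for line in mdstat_text.splitlines():
        if line.startswith(array + " :"):
            # Members appear as name[index], e.g. sda1[0] or sdb1[1](F)
            members = re.findall(r"(\S+)\[\d+\]", line)
            return sorted({"/dev/" + whole_disk(m) for m in members})
    return []


def install_commands(disks):
    """Build (but do not run) one grub2-install command per disk."""
    return [["grub2-install", disk] for disk in disks]
```

For example, feeding it the contents of /proc/mdstat and the array name (say 
"md0", as an assumption) gives a list you can review before running anything 
by hand.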


> because there are issues trying to automate this for legacy systems. It 
> might be that the scripts to do updates don't pull the correct devices 
> when /boot is on raid. It is very annoying though when this happens.

I don't know whether https://github.com/coreos/bootupd is doing this 
automatically on BIOS firmware systems now, or is planning to. At one time they 
were. I think the focus from this point forward is on UEFI Secure Boot 
workflows rather than inherently insecurable BIOS scenarios, and that ship has 
sailed.



-- 
Chris Murphy
_______________________________________________
test mailing list -- test@lists.fedoraproject.org
