On 05/06/2024 16.48, Jared Rossi wrote:
diff --git a/pc-bios/s390-ccw/s390-ccw.h b/pc-bios/s390-ccw/s390-ccw.h
index c977a52b50..de3d1f0d5a 100644
--- a/pc-bios/s390-ccw/s390-ccw.h
+++ b/pc-bios/s390-ccw/s390-ccw.h
@@ -43,6 +43,7 @@ typedef unsigned long long u64;
#include "iplb.h"
/* start.s */
+extern char _start[];
void disabled_wait(void) __attribute__ ((__noreturn__));
void consume_sclp_int(void);
void consume_io_int(void);
@@ -88,6 +89,11 @@ __attribute__ ((__noreturn__))
static inline void panic(const char *string)
{
sclp_print(string);
+ if (load_next_iplb()) {
+ sclp_print("\nTrying next boot device...");
+ jump_to_IPL_code((long)_start);
+ }
+
disabled_wait();
}
Honestly, I am unsure whether this is a really cool idea or a very ugly
hack ... but I think I tend towards the latter, sorry. Jumping back to the
startup code might cause various problem, e.g. pre-initialized variables
don't get their values reset, causing different behavior when the s390-ccw
bios runs a function a second time this way. Thus this sounds very
fragile. Could we please try to get things cleaned up correctly, so that
functions return with error codes instead of panicking when we can
continue with another boot device? Even if its more work right now, I
think this will be much more maintainable in the future.
Thomas
Thanks Thomas, I appreciate your insight. Your hesitation is perfectly
understandable as well. My initial design was like you suggest, where the
functions return instead of panic, but the issue I ran into is that netboot
uses a separate image, which we jump in to at the start of IPL from a
network device (see zipl_load() in pc-bios/s390-ccw/bootmap.c). I wasn't
able to come up with a simple way to return to the main BIOS code if a
netboot fails other than by jumping back. So, it seems to me that netboot
kind of throws a monkeywrench into the basic idea of reworking the panics
into returns.
I'm open to suggestions on a better way to recover from a failed netboot,
and it's certainly possible I've overlooked something, but as far as I can
tell a jump is necessary in that particular case at least. Netboot could
perhaps be handled as a special case where the jump back is permitted
whereas other device types return, but I don't think that actually solves
the main issue.
What are your thoughts on this?
Yes, I agree that jumping is currently required to get back from the netboot
code. So if you could rework your patches in a way that limits the jumping
to a failed netboot, that would be acceptable, I think.
Apart from that: We originally decided to put the netboot code into a
separate binary since the required roms/SLOF module might not always have
been checked out (it needed to be done manually), so we were not able to
compile it in all cases. But nowadays, this is handled in a much nicer way,
the submodule is automatically checked out once you compile the
s390x-softmmu target and have a s390x compiler available, so I wonder
whether we should maybe do the next step and integrate the netboot code into
the main s390-ccw.img now? Anybody got an opinion on this?
Thomas