On 05.07.2018 19:25, Jason J. Herne wrote:
> From: "Jason J. Herne" <jjhe...@linux.vnet.ibm.com>
>
> Allows guest to boot from a vfio configured real dasd device.
>
> Signed-off-by: Jason J. Herne <jjhe...@linux.vnet.ibm.com>
> Signed-off-by: Jason J. Herne <jjhe...@linux.ibm.com>
> ---
>  docs/devel/s390-dasd-ipl.txt | 132 +++++++++++++++++++++++
>  pc-bios/s390-ccw/Makefile    |   2 +-
>  pc-bios/s390-ccw/dasd-ipl.c  | 249 +++++++++++++++++++++++++++++++++++++++++++
>  pc-bios/s390-ccw/dasd-ipl.h  |  16 +++
>  pc-bios/s390-ccw/main.c      |   4 +
>  pc-bios/s390-ccw/s390-arch.h |  13 +++
>  6 files changed, 415 insertions(+), 1 deletion(-)
>  create mode 100644 docs/devel/s390-dasd-ipl.txt
>  create mode 100644 pc-bios/s390-ccw/dasd-ipl.c
>  create mode 100644 pc-bios/s390-ccw/dasd-ipl.h
>
> diff --git a/docs/devel/s390-dasd-ipl.txt b/docs/devel/s390-dasd-ipl.txt
> new file mode 100644
> index 0000000..87aecb9
> --- /dev/null
> +++ b/docs/devel/s390-dasd-ipl.txt
> @@ -0,0 +1,132 @@
> +*****************************
> +***** s390 hardware IPL *****
> +*****************************
> +
> +The s390 hardware IPL process consists of the following steps.
> +
> +1. A READ IPL ccw is constructed in memory location 0x0.
> +    This ccw, by definition, reads the IPL1 record which is located on the disk
> +    at cylinder 0 track 0 record 1. Note that the chain flag is on in this ccw
> +    so when it is complete another ccw will be fetched and executed from memory
> +    location 0x08.
> +
> +2. Execute the Read IPL ccw at 0x00, thereby reading IPL1 data into 0x00.
> +    IPL1 data is 24 bytes in length and consists of the following pieces of
> +    information: [psw][read ccw][tic ccw]. When the machine executes the Read
> +    IPL ccw it read the 24-bytes of IPL1 to be read into memory starting at
> +    location 0x0. Then the ccw program at 0x08 which consists of a read
> +    ccw and a tic ccw is automatically executed because of the chain flag from
> +    the original READ IPL ccw.
> +    The read ccw will read the IPL2 data into memory
> +    and the TIC (Tranfer In Channel) will transfer control to the channel
> +    program contained in the IPL2 data. The TIC channel command is the
> +    equivalent of a branch/jump/goto instruction for channel programs.
> +    NOTE: The ccws in IPL1 are defined by the architecture to be format 0.
> +
> +3. Execute IPL2.
> +    The TIC ccw instruction at the end of the IPL1 channel program will begin
> +    the execution of the IPL2 channel program. IPL2 is stage-2 of the boot
> +    process and will contain a larger channel program than IPL1. The point of
> +    IPL2 is to find and load either the operating system or a small program that
> +    loads the operating system from disk. At the end of this step all or some of
> +    the real operating system is loaded into memory and we are ready to hand
> +    control over to the guest operating system. At this point the guest
> +    operating system is entirely responsible for loading any more data it might
> +    need to function. NOTE: The IPL2 channel program might read data into memory
> +    location 0 thereby overwriting the IPL1 psw and channel program. This is ok
> +    as long as the data placed in location 0 contains a psw whose instruction
> +    address points to the guest operating system code to execute at the end of
> +    the IPL/boot process.
> +    NOTE: The ccws in IPL2 are defined by the architecture to be format 0.
> +
> +4. Start executing the guest operating system.
> +    The psw that was loaded into memory location 0 as part of the ipl process
> +    should contain the needed flags for the operating system we have loaded. The
> +    psw's instruction address will point to the location in memory where we want
> +    to start executing the operating system. This psw is loaded (via LPSW
> +    instruction) causing control to be passed to the operating system code.
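
As an aside for readers following along, the 24-byte IPL1 layout and the initial "Read IPL" ccw from the steps above can be sketched in C roughly as below. This is only an illustration, not code from the patch: the names `Ccw0`, `Ipl1`, `ccw0_make`, `ccw0_addr` and `read_ipl_ccw` are made up here, and the numeric values (command codes 0x02 and 0x08, chain-command flag 0x40, and the byte positions of a format-0 ccw) come from my reading of the architecture, so treat them as assumptions.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values, see the note above. */
#define CCW_CMD_READ_IPL 0x02
#define CCW_CMD_TIC      0x08
#define CCW_FLAG_CC      0x40  /* chain flag: fetch the next ccw when done */

/* A format-0 ccw kept as raw big-endian bytes so the sketch is host-independent. */
typedef struct {
    uint8_t b[8];
} Ccw0;

/* Build a format-0 ccw: 8-bit command, 24-bit address, flags, 16-bit count. */
Ccw0 ccw0_make(uint8_t cmd, uint32_t addr, uint8_t flags, uint16_t count)
{
    Ccw0 c;

    assert(addr <= 0xffffff);          /* format 0 only has a 24-bit address */
    c.b[0] = cmd;
    c.b[1] = (uint8_t)(addr >> 16);
    c.b[2] = (uint8_t)(addr >> 8);
    c.b[3] = (uint8_t)addr;
    c.b[4] = flags;
    c.b[5] = 0;                        /* unused in this sketch */
    c.b[6] = (uint8_t)(count >> 8);
    c.b[7] = (uint8_t)count;
    return c;
}

/* Extract the 24-bit data address again. */
uint32_t ccw0_addr(const Ccw0 *c)
{
    return ((uint32_t)c->b[1] << 16) | ((uint32_t)c->b[2] << 8) | c->b[3];
}

/* IPL1 as described in step 2: [psw][read ccw][tic ccw], 24 bytes in total. */
typedef struct {
    uint8_t psw[8];
    Ccw0 read_ccw;   /* lands at 0x08, reads the IPL2 data */
    Ccw0 tic_ccw;    /* lands at 0x10, jumps into the IPL2 channel program */
} Ipl1;

/* The initial ccw from step 1: read IPL1 to location 0x0, chain flag on. */
Ccw0 read_ipl_ccw(void)
{
    return ccw0_make(CCW_CMD_READ_IPL, 0x0, CCW_FLAG_CC, (uint16_t)sizeof(Ipl1));
}
```

Since the ccw at 0x0 is 8 bytes long and has the chain flag set, the next ccw is fetched from 0x08, which is exactly where `read_ccw` ends up once IPL1 has been read.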
> +
> +In a non-virtualized environment this process, handled entirely by the hardware,
> +is kicked off by the user initiating a "Load" procedure from the hardware
> +management console. This "Load" procedure crafts a special "Read IPL" ccw in
> +memory location 0x0 that reads IPL1. It then executes this ccw thereby kicking
> +off the reading of IPL1 data. Since the channel program from IPL1 will be
> +written immediately after the special "Read IPL" ccw, the IPL1 channel program
> +will be executed immediately (the special read ccw has the chaining bit turned
> +on). The TIC at the end of the IPL1 channel program will cause the IPL2 channel
> +program to be executed automatically. After this sequence completes the "Load"
> +procedure then loads the psw from 0x0.
> +
> +*****************************************
> +***** How this all pertains to Qemu *****
> +*****************************************
> +
> +In theory we should merely have to do the following to IPL/boot a guest
> +operating system from a DASD device:
> +
> +1. Place a "Read IPL" ccw into memory location 0x0 with chaining bit on.
> +2. Execute channel program at 0x0.
> +3. LPSW 0x0.
> +
> +However, our emulation of the machine's channel program logic is missing one key
> +feature that is required for this process to work: non-prefetch of ccw data.
> +
> +When we start a channel program we pass the channel subsystem parameters via an
> +ORB (Operation Request Block). One of those parameters is a prefetch bit. If the
> +bit is on then Qemu is allowed to read the entire channel program from guest
> +memory before it starts executing it. This means that any channel commands that
> +read additional channel commands will not work as expected because the newly
> +read commands will only exist in guest memory and NOT within Qemu's channel
> +subsystem memory. Qemu's channel subsystem's implementation currently requires
> +this bit to be on for all channel programs.
> +This is a problem because the IPL
> +process consists of transferring control from the "Read IPL" ccw immediately to
> +the IPL1 channel program that was read by "Read IPL".
> ++
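
To make the prefetch problem described above concrete, here is a small toy model in C. It is purely a sketch and has nothing to do with QEMU's actual channel subsystem code: `ToyCcw`, `toy_run`, `toy_ipl` and the command values are invented for illustration, guest memory is reduced to four ccw-sized slots, and "prefetch" is modelled as taking a snapshot of the whole program before running it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SLOTS 4

/* Hypothetical toy commands; a zeroed slot counts as an invalid ccw. */
enum { TOY_INVALID = 0, TOY_READ = 1, TOY_NOP = 2 };

typedef struct {
    uint8_t cmd;
    uint8_t chain;  /* 1: continue with the next slot after this ccw */
    uint8_t dst;    /* TOY_READ: first memory slot to overwrite */
    uint8_t count;  /* TOY_READ: number of slots to copy from disk */
} ToyCcw;

/*
 * Run the program starting at slot 0. With prefetch, every ccw is taken from
 * a snapshot made before execution; without it, each ccw is fetched from
 * guest memory right before it is executed.
 * Returns the number of ccws executed, or -1 on an invalid ccw.
 */
int toy_run(ToyCcw mem[SLOTS], const ToyCcw disk[SLOTS], int prefetch)
{
    ToyCcw snapshot[SLOTS];
    int executed = 0;

    assert(mem != NULL && disk != NULL);
    memcpy(snapshot, mem, sizeof(snapshot));

    for (int pc = 0; pc < SLOTS; pc++) {
        ToyCcw cur = prefetch ? snapshot[pc] : mem[pc];

        switch (cur.cmd) {
        case TOY_READ:
            /* Copy records from disk into guest memory only. */
            for (int i = 0; i < cur.count && cur.dst + i < SLOTS; i++) {
                mem[cur.dst + i] = disk[i];
            }
            break;
        case TOY_NOP:
            break;
        default:
            return -1;  /* stale or missing ccw */
        }
        executed++;
        if (!cur.chain) {
            break;
        }
    }
    return executed;
}

/* Toy "IPL": slot 0 holds a chained read that pulls two slots from disk. */
int toy_ipl(int prefetch)
{
    ToyCcw mem[SLOTS] = { { TOY_READ, 1, 0, 2 } };
    const ToyCcw disk[SLOTS] = {
        { TOY_INVALID, 0, 0, 0 },  /* stand-in for the psw (plain data) */
        { TOY_NOP, 0, 0, 0 },      /* stand-in for the IPL1 channel program */
    };

    return toy_run(mem, disk, prefetch);
}
```

In this model `toy_ipl(1)` fails because the chained ccw in slot 1 is taken from the snapshot, which was made before the read filled it in, while `toy_ipl(0)` picks up the freshly read ccw and completes.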

I have way too little insight into channel devices and how QEMU implements
them. However, I wonder what hinders us from implementing support for
!prefetch in QEMU?

What you tailored here seems impressive :) I just want to know what the
technical background of this prefetch thingy in QEMU is.

-- 

Thanks,

David / dhildenb