Hi Alexandre, sorry for not getting back to you earlier; I didn't have the time for a more in-depth look into this matter until now.
On 4/20/19 03:00, Alexandre Bencz wrote:
Frank, what PCI cards does your 9111-520 have?
I don't have additional cards in my 9111-520. In the meantime I could confirm that my 9131-52A is also affected by this issue. This machine also has no additional PCI cards installed. As it has more processing power, I wanted to debug on this system instead of the 9111-520. But the battery of its SP was depleted and it took me some time to get it back into a usable state (i.e. remote reset possible via the ASMI web interface, to safely recover from those hangs). To be clear, from what I saw, this problem will hit you in any case, with or without additional PCI cards. It is triggered by loading or operating the `ipr` module, and unless that module is blacklisted it will always be loaded on these machines because of the built-in SCSI controller ("IBM Power Linux RAID SCSI HBA" according to the driver).
Does it have a RAID controller too?
I believe the built-in controller (used with the `ipr` driver) is RAID capable, though I don't use those in RAID mode. Actually I don't use them at all, as my systems boot from the network most of the time. Nevertheless, this issue also hangs my machines when booting from the network. A safe workaround here, again, is to blacklist the `ipr` module. But this is of course not suitable when the intention is to boot from disk.

From my testing with the Debian kernel (4.19.0-4-powerpc64), not blacklisting the module during kernel boot always triggers the hangs. Loading the module after the system has come up seems to succeed if some time (15 minutes or so) has passed since boot, but not shortly after the system has fully booted into the OS. Removing and reloading the module afterwards might also hang the machine. So you could try to boot with the `ipr` module blacklisted (`modprobe.blacklist=ipr`) and set a break at the premount state (`break=premount`), which will drop you to a shell in the initramfs. Wait there for some time, then try to load the `ipr` module; if that succeeds, exit the shell to continue the boot process.

I have now also compiled a 5.1.0-rc7 kernel, and with that I cannot reproduce the hangs I experienced with the Debian kernel on the 9131-52A, neither during kernel boot nor when loading the module on the fully booted system (network booted both times!). The "initialization" of the `ipr` driver can take up to 40 seconds, so be patient. I haven't yet tested this on my 9111-520 or 9111-285 but expect the issue to be gone there, too. So, similar to the problems of last year (see [1]), the issue seems to have resolved itself with a later kernel version.

[1]: https://lists.debian.org/debian-powerpc/2018/10/msg00002.html

In addition to that I also tested with the current kernel in the experimental suite (5.0.0-trunk-powerpc64). I'm not sure we should invest more time in this, then. Just install the kernel from the experimental suite and wait for 5.0.x to reach unstable.
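In case it helps with the 4.19 kernel, the blacklist-and-load-later workaround described above could look roughly like this. This is only a sketch: the two function names are made up for illustration, and the 900 seconds simply reflect the "15 minutes or so" from my observations, not a verified minimum.

```shell
#!/bin/sh
# Sketch only; the helper names are hypothetical.

# Kernel parameters to append at the boot loader prompt:
# keep ipr from being auto-loaded and stop in the initramfs before mounting root.
ipr_workaround_params() {
    printf '%s\n' 'modprobe.blacklist=ipr break=premount'
}

# To be typed in the initramfs shell: wait, then try to load ipr by hand.
# $1 = seconds to wait before loading (assumption: about 900).
load_ipr_after_wait() {
    sleep "$1"
    modprobe ipr
}

# Print the parameters so they can be copied to the kernel command line.
ipr_workaround_params

# In the initramfs shell one would then run:
#   load_ipr_after_wait 900 && exit
```

If `modprobe ipr` returns successfully, `exit` leaves the initramfs shell and the boot continues; if the machine hangs at that point, it needs a reset via the ASMI as before.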
Can you confirm that this works for you, too?

Cheers,
Frank