Thanks for looking into this patch, Andrew.

Andrew Donnellan <andrew.donnel...@au1.ibm.com> writes:
> On 08/03/18 21:05, Vaibhav Jain wrote:
>> It is possible for a CXL card to have a valid PSL but no valid
>> AFUs. When this happens we have a valid instance of 'struct cxl'
>> representing the adapter but with its member 'struct cxl_afu *cxl[]'
>> as empty. Unfortunately at many placed within cxl code (especially
>> during an EEH) the elements of this array are passed on to various
>> other cxl functions. Which may result in kernel oops/panic when this
>> 'struct cxl_afu *' is dereferenced.
>>
>> So this patch puts a NULL check at the beginning of various cxl
>> functions that accept 'struct cxl_afu *' as a formal argument and are
>> called from with a loop of the form:
>>
>> 	for (i = 0; i < adapter->slices; i++) {
>> 		afu = adapter->afu[i];
>> 		/* call some function with 'afu' */
>> 	}
>
> Surely in this case adapter->slices should be 0?
Not necessarily, as adapter->slices doesn't take into account AFUs that
fail to init. I saw this issue in one specific case where the only slice
on the card had issues with its AFU descriptor, which caused CXL init of
that AFU to fail.

>
> We might still need to harden for other cases...
Yes, we may need some more hardening, especially in our AFU descriptor
parsing code.

-- 
Vaibhav Jain <vaib...@linux.vnet.ibm.com>
Linux Technology Center, IBM India Pvt. Ltd.
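
For illustration, the case discussed above (adapter->slices is non-zero
while the matching afu[] entry is NULL) can be sketched in plain
userspace C. This is only a minimal sketch under stated assumptions, not
the cxl driver code: 'struct afu', 'struct adapter', MAX_SLICES,
afu_disable() and adapter_disable_all() are hypothetical stand-ins for
the kernel structures and helpers.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_SLICES 4	/* hypothetical bound, for this sketch only */

/* Simplified stand-ins for 'struct cxl_afu' and 'struct cxl'. */
struct afu {
	int enabled;
};

struct adapter {
	int slices;			/* slices the card advertises */
	struct afu *afu[MAX_SLICES];	/* NULL where AFU init failed */
};

/* Hypothetical per-AFU helper: tolerate NULL instead of dereferencing it. */
static int afu_disable(struct afu *afu)
{
	if (afu == NULL)
		return 0;	/* slice never initialised, nothing to do */
	afu->enabled = 0;
	return 1;
}

/* Walk every advertised slice; return how many AFUs were disabled. */
static int adapter_disable_all(struct adapter *adapter)
{
	int i, count = 0;

	assert(adapter != NULL);
	for (i = 0; i < adapter->slices; i++)
		count += afu_disable(adapter->afu[i]);
	return count;
}
```

With slices == 1 and afu[0] == NULL, the loop still runs once, and the
NULL check in the helper is what keeps it from dereferencing a missing
AFU.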