On Wed, Jan 21, 2015 at 10:53:38AM +1100, Gavin Shan wrote: >On Wed, Jan 21, 2015 at 09:56:07AM +1100, Gavin Shan wrote: >>On Tue, Jan 20, 2015 at 10:28:16AM +0100, Benjamin Herrenschmidt wrote: >>>On Mon, 2015-01-19 at 09:47 +1100, Gavin Shan wrote: >>>> On pseries platform, the EEH reset backend pseries_eeh_reset() can >>>> be called in atomic context as follows. For this case, we should >>>> call udelay() instead of msleep() to avoid context switching. >>>> >>>> drivers/scsi/ipr.c::ipr_reset_slot_reset_done() >>>> drivers/pci/pci.c::pci_set_pcie_reset_state() >>>> arch/powerpc/kernel/eeh.c::pcibios_set_pcie_reset_state() >>>> arch/powerpc/platforms/pseries/eeh_pseries.c::pseries_eeh_reset() >>> >>>It's not acceptable to introduce multi-millisecond delays at interrupt >>>time. In fact, we should generally not use udelay in such context. >>> >>>I understand that this is an exceptional error handling case but it's >>>still not right. >>> >> >>Yes, I agree it's unsafe to udelay for multi-milliseconds as the queued >>works in atomic context is expected to be completed as soon as possible. >> >>>Are there many other users of pci_set_pcie_reset_state() at interrupt >>>time ? Can we have a discussion with the PCI folks as to whether that >>>should be legal or not ? >>> >>>I'm tempted to require that it's made illegal. >> >>Currently, there are 2 drivers calling this function: IPR and misc/genwqe. >>Also, VFIO would call this function for IBM and Mellanox adapters in PowerKVM >>repository. For now, IPR driver is the only one call this function in atomic >>context. >> >>Sure, I'll send one email to confirm with PCI folks. I guess it's illegal >>to call pci_set_pcie_reset_state() in atomic context. If it's the case, >>I'm afraid Wendy has to change IPR driver to replace the reset timer with >>something else (e.g. workqueue). >> > >Another way is to drop the hold/settle delays for >pcibios_set_pcie_reset_state() >and IPR relies on the timer interval to cover them. Wendy, could you please >let me know if it would work for you or not? > > Start reset timer; > Timer expires, assert the reset. Restart the timer with assert delay; > Timer expires, deassert the reset. Restart the timer with settle delay; > Timer expires, ready for subsequent works; >
Brian King is the author of pci_set_pcie_reset_state() and as he said, it was expected to be called in atomic context. So I'm going to come up with another approach as above, caller of pci_set_pcie_reset_state() should take corresponding delays as what IPR driver is doing. Messages from Brian for reference: | The API has changed. I wrote the pci_set_pcie_reset_state API originally. | When this API was put in place initially, it was perfectly legal to call it | from an atomic context. Can you clarify why we have to have the delay in the | pci_set_pcie_reset_state function? Shouldn't it be the responsibility of the | callers to ensure a proper delay is used? This was always the case until | recently. So please ignore this patch and I'll send another one, which is implemented in above approach. Thanks, Gavin >>>> Signed-off-by: Gavin Shan <gws...@linux.vnet.ibm.com> >>>> Tested-by: Wen Xiong<wenxi...@linux.vnet.ibm.com> >>>> --- >>>> arch/powerpc/platforms/pseries/eeh_pseries.c | 12 ++++++++---- >>>> 1 file changed, 8 insertions(+), 4 deletions(-) >>>> >>>> diff --git a/arch/powerpc/platforms/pseries/eeh_pseries.c >>>> b/arch/powerpc/platforms/pseries/eeh_pseries.c >>>> index a6c7e19..67623a3 100644 >>>> --- a/arch/powerpc/platforms/pseries/eeh_pseries.c >>>> +++ b/arch/powerpc/platforms/pseries/eeh_pseries.c >>>> @@ -503,8 +503,7 @@ static int pseries_eeh_get_state(struct eeh_pe *pe, >>>> int *state) >>>> */ >>>> static int pseries_eeh_reset(struct eeh_pe *pe, int option) >>>> { >>>> - int config_addr; >>>> - int ret; >>>> + int config_addr, delay, ret; >>>> >>>> /* Figure out PE address */ >>>> config_addr = pe->config_addr; >>>> @@ -528,9 +527,14 @@ static int pseries_eeh_reset(struct eeh_pe *pe, int >>>> option) >>>> /* We need reset hold or settlement delay */ >>>> if (option == EEH_RESET_FUNDAMENTAL || >>>> option == EEH_RESET_HOT) >>>> - msleep(EEH_PE_RST_HOLD_TIME); >>>> + delay = EEH_PE_RST_HOLD_TIME; >>>> + else >>>> + delay = EEH_PE_RST_SETTLE_TIME; >>>> + >>>> + if (in_atomic()) >>>> + udelay(delay * 1000); >>>> else >>>> - msleep(EEH_PE_RST_SETTLE_TIME); >>>> + msleep(delay); >>>> >>>> return ret; >>>> } >>> >>> _______________________________________________ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev