On Fri, Mar 03, 2017 at 03:47:18PM +1100, Russell Currey wrote:
>eeh_handle_special_event() is called when an EEH event is detected but
>can't be narrowed down to a specific PE. This function looks through
>every PE to find one in an erroneous state, then calls the regular event
>handler eeh_handle_normal_event() once it knows which PE has an error.
>
>However, if eeh_handle_normal_event() found that the PE cannot possibly
>be recovered, it will remove the PE and associated devices. This leads
>to a use after free in eeh_handle_special_event() as it attempts to clear
>the "recovering" state on the PE after eeh_handle_normal_event() returns.
>
>Thus, make sure the PE is valid when attempting to clear state in
>eeh_handle_special_event().
>

From the changelog, I don't see how the PE is freed. Could you explain
that a bit?

>Cc: <sta...@vger.kernel.org> #3.10+
>Reported-by: Alexey Kardashevskiy <a...@ozlabs.ru>
>Signed-off-by: Russell Currey <rus...@russell.cc>
>---
> arch/powerpc/kernel/eeh_driver.c | 13 +++++++++++++
> 1 file changed, 13 insertions(+)
>
>diff --git a/arch/powerpc/kernel/eeh_driver.c
>b/arch/powerpc/kernel/eeh_driver.c
>index b94887165a10..492397298a2a 100644
>--- a/arch/powerpc/kernel/eeh_driver.c
>+++ b/arch/powerpc/kernel/eeh_driver.c
>@@ -983,6 +983,19 @@ static void eeh_handle_special_event(void)
> 		if (rc == EEH_NEXT_ERR_FROZEN_PE ||
> 		    rc == EEH_NEXT_ERR_FENCED_PHB) {
> 			eeh_handle_normal_event(pe);
>+
>+			/*
>+			 * eeh_handle_normal_event() can free the PE if it
>+			 * determines that the PE cannot possibly be recovered.
>+			 * Make sure the PE still exists before changing its
>+			 * state.
>+			 */
>+			if (!pe || (pe->type & EEH_PE_INVALID)
>+			    || (pe->state & EEH_PE_REMOVED)) {
>+				pr_warn("EEH: not clearing state on bad PE\n");
>+				continue;
>+			}
>+

This doesn't look correct. @pe has already been set to a valid PE at this
point, so isn't !pe always false? And if the PE has been freed, how can we
safely access @pe->type here, and how can we be sure the EEH_PE_INVALID
and EEH_PE_REMOVED flags weren't overwritten by somebody else?

> 			eeh_pe_state_clear(pe, EEH_PE_RECOVERING);
> 		} else {
> 			pci_lock_rescan_remove();

Cheers,
Gavin
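
To illustrate the concern about checking a PE after it may have been freed,
here is a minimal user-space sketch. It is not kernel code: struct fake_pe,
FAKE_PE_RECOVERING and the fake_* functions are hypothetical stand-ins, and
having the handler return a "still alive" result is only one possible
approach, assumed here for illustration and not what the patch does. The
point is that a callee cannot NULL the caller's pointer, and reading flags
out of freed memory is itself a use after free, so the caller needs some
signal that does not live inside the freed object.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct eeh_pe; not the kernel definition. */
struct fake_pe {
	unsigned int state;
	bool recoverable;
};

#define FAKE_PE_RECOVERING 0x1u	/* hypothetical flag value */

/* Allocate a PE that starts out in the "recovering" state. */
static struct fake_pe *fake_pe_new(bool recoverable)
{
	struct fake_pe *pe = malloc(sizeof(*pe));

	if (pe) {
		pe->state = FAKE_PE_RECOVERING;
		pe->recoverable = recoverable;
	}
	return pe;
}

/*
 * Hypothetical stand-in for eeh_handle_normal_event().
 * Returns true if the PE still exists afterwards, false if it was freed.
 * The caller's copy of the pointer is unchanged either way, so a
 * "!pe" test in the caller can never detect the free.
 */
static bool fake_handle_normal_event(struct fake_pe *pe)
{
	assert(pe != NULL);

	if (!pe->recoverable) {
		free(pe);	/* caller's pointer is now dangling */
		return false;
	}
	return true;
}

/*
 * Hypothetical stand-in for the loop body in eeh_handle_special_event().
 * Returns true if the recovering state was cleared, false if the PE
 * is gone. The PE is only dereferenced when the handler reports that
 * it is still alive.
 */
static bool fake_handle_special_event(struct fake_pe *pe)
{
	if (!fake_handle_normal_event(pe))
		return false;	/* do not touch pe again */

	pe->state &= ~FAKE_PE_RECOVERING;
	return true;
}
```

With this shape, the liveness information travels in the return value, so it
cannot be overwritten by whoever reuses the freed memory.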