On 03/03/17 15:47, Russell Currey wrote:
> eeh_handle_special_event() is called when an EEH event is detected but
> can't be narrowed down to a specific PE.  This function looks through
> every PE to find one in an erroneous state, then calls the regular event
> handler eeh_handle_normal_event() once it knows which PE has an error.
>
> However, if eeh_handle_normal_event() found that the PE cannot possibly
> be recovered, it will remove the PE and associated devices.  This leads
> to a use after free in eeh_handle_special_event() as it attempts to clear
> the "recovering" state on the PE after eeh_handle_normal_event() returns.
>
> Thus, make sure the PE is valid when attempting to clear state in
> eeh_handle_special_event().
>
> Cc: <sta...@vger.kernel.org> #3.10+
> Reported-by: Alexey Kardashevskiy <a...@ozlabs.ru>
> Signed-off-by: Russell Currey <rus...@russell.cc>
> ---
>  arch/powerpc/kernel/eeh_driver.c | 13 +++++++++++++
>  1 file changed, 13 insertions(+)
>
> diff --git a/arch/powerpc/kernel/eeh_driver.c
> b/arch/powerpc/kernel/eeh_driver.c
> index b94887165a10..492397298a2a 100644
> --- a/arch/powerpc/kernel/eeh_driver.c
> +++ b/arch/powerpc/kernel/eeh_driver.c
> @@ -983,6 +983,19 @@ static void eeh_handle_special_event(void)
>  		if (rc == EEH_NEXT_ERR_FROZEN_PE ||
>  		    rc == EEH_NEXT_ERR_FENCED_PHB) {
>  			eeh_handle_normal_event(pe);
> +
> +			/*
> +			 * eeh_handle_normal_event() can free the PE if it
> +			 * determines that the PE cannot possibly be recovered.
> +			 * Make sure the PE still exists before changing its
> +			 * state.
> +			 */
> +			if (!pe || (pe->type & EEH_PE_INVALID)
> +				|| (pe->state & EEH_PE_REMOVED)) {

The bug is that pe is stale once eeh_handle_normal_event() has returned, so
dereferencing it afterwards, as this check does, is broken.

> +				pr_warn("EEH: not clearing state on bad PE\n");
> +				continue;
> +			}
> +
>  			eeh_pe_state_clear(pe, EEH_PE_RECOVERING);
>  		} else {
>  			pci_lock_rescan_remove();
>

-- 
Alexey