> -----Original Message----- > From: Intel-gfx [mailto:intel-gfx-boun...@lists.freedesktop.org] On Behalf > Of Bloomfield, Jon > Sent: Wednesday, December 6, 2017 9:01 AM > To: Chris Wilson <ch...@chris-wilson.co.uk>; intel-gfx@lists.freedesktop.org > Cc: Daniel Vetter <daniel.vet...@ffwll.ch> > Subject: Re: [Intel-gfx] [PATCH v2] drm/i915: Prevent machine hang from > Broxton's vtd w/a and error capture > > > -----Original Message----- > > From: Chris Wilson [mailto:ch...@chris-wilson.co.uk] > > Sent: Wednesday, December 6, 2017 7:38 AM > > To: intel-gfx@lists.freedesktop.org > > Cc: Chris Wilson <ch...@chris-wilson.co.uk>; Bloomfield, Jon > > <jon.bloomfi...@intel.com>; Harrison, John C > <john.c.harri...@intel.com>; > > Ursulin, Tvrtko <tvrtko.ursu...@intel.com>; Joonas Lahtinen > > <joonas.lahti...@linux.intel.com>; Daniel Vetter <daniel.vet...@ffwll.ch> > > Subject: [PATCH v2] drm/i915: Prevent machine hang from Broxton's vtd > w/a > > and error capture > > > > Since capturing the error state requires fiddling around with the GGTT > > to read arbitrary buffers and is itself run under stop_machine(), it > > deadlocks the machine (effectively a hard hang) when run in conjunction > > with Broxton's VTd workaround to serialize GGTT access. > > > > v2: Store the ERR_PTR in first_error so that the error can be reported > > to the user via sysfs. > > > > Fixes: 0ef34ad6222a ("drm/i915: Serialize GTT/Aperture accesses on BXT") > > Signed-off-by: Chris Wilson <ch...@chris-wilson.co.uk> > > Cc: Jon Bloomfield <jon.bloomfi...@intel.com> > > Cc: John Harrison <john.c.harri...@intel.com> > > Cc: Tvrtko Ursulin <tvrtko.ursu...@intel.com> > > Cc: Joonas Lahtinen <joonas.lahti...@linux.intel.com> > > Cc: Daniel Vetter <daniel.vet...@ffwll.ch> > > It's a real shame to lose error capture on BXT. Can we wrap stop_machine to > make it recursive ? > > Something like... > > static cpumask_t sm_mask; > > struct sm_args { > cpu_stop_fn_t *fn; > void *data; > }; > > void do_recursive_stop(void *sm_arg_data) > { > struct sm_arg *args = sm_arg_data; > > /* We're stopped - flag the fact to prevent recursion */ > cpumask_set_cpu(smp_processor_id(), &sm_mask); > > args->fn(args->data); > > /* Re-enable recursion */ > cpumask_clear_cpu(smp_processor_id(), &sm_mask); > } > > void recursive_stop_machine(cpu_stop_fn_t fn, void *data) > { > if (cpumask_test_cpu(smp_processor_id(), &sm_mask)) { > /* We were already stopped, so can just call directly */ > fn(data); > } > else { > /* Our CPU is not currently stopped */ > struct sm_args *args = {fn, data}; > stop_machine(do_recursive_stop, args, NULL); > } > }
... I think a single bool is sufficient in place of the cpumask, since it is set and cleared within stop_machine - I started out trying to set/clear outside. _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx