On Mon, Apr 15, 2019 at 06:20:01PM -0700, Cong Wang wrote: > cec_timer_fn() is a timer callback which reads ce_arr.array[] > and updates its decay values. Elements could be added to or > removed from this global array in parallel, although the array > itself will not grow or shrink. del_lru_elem_unlocked() uses > FULL_COUNT() as a key to find a right element to remove, > which could be affected by the parallel timer. > > Fix this by converting the mutex to a spinlock and holding it > inside the timer. > > Fixes: 011d82611172 ("RAS: Add a Corrected Errors Collector") > Cc: Tony Luck <tony.l...@intel.com> > Cc: Borislav Petkov <b...@alien8.de> > Cc: Thomas Gleixner <t...@linutronix.de> > Signed-off-by: Cong Wang <xiyou.wangc...@gmail.com> > --- > drivers/ras/cec.c | 16 +++++++++------- > 1 file changed, 9 insertions(+), 7 deletions(-) > > diff --git a/drivers/ras/cec.c b/drivers/ras/cec.c > index 61332c9aab5a..a82c9d08d47a 100644 > --- a/drivers/ras/cec.c > +++ b/drivers/ras/cec.c > @@ -117,7 +117,7 @@ static struct ce_array { > }; > } ce_arr; > > -static DEFINE_MUTEX(ce_mutex); > +static DEFINE_SPINLOCK(ce_lock);
Nah, pls keep the simplicity here by retaining the mutex. Use a workqueue instead to queue the spring cleaning from IRQ context and then do it when back in preemptible context. Thx. -- Regards/Gruss, Boris. Good mailing practices for 400: avoid top-posting and trim the reply.