----- Original Message -----
> From: "Michael Ellerman" <m...@ellerman.id.au>
> To: "Timothy Pearson" <tpear...@raptorengineering.com>, "linuxppc-dev" 
> <linuxppc-dev@lists.ozlabs.org>
> Cc: "Jens Axboe" <ax...@kernel.dk>
> Sent: Tuesday, November 14, 2023 6:14:37 AM
> Subject: Re: [PATCH] powerpc: Fix data corruption on IPI

> Hi Timothy,
> 
> Thanks for debugging this, but I'm unclear why this is helping because
> we should already have a full barrier (hwsync) on both the sending and
> receiving side.
> 
> More below.

So it looks like we might be dealing with a couple of potentially separate 
issues here, not sure.  This is probably the most infuriating bug I've run 
across in my career, so please bear with me -- with the amount of active test 
kernels I have installed at any one time I'm occassionally infecting one with 
the tests from another. ;)

First, I'm not 100% convinced the code in try_to_wake_up() is race-free on 
ppc64, since adding a memory barrier between it and the call to kick_process() 
significantly reduces the frequency of the on-disk corruption.  Can you take a 
look and see if anything stands out as to why that would be important?

The second part of this though is that the barrier only reduces the frequency 
of the corruption, it does not eliminate the corruption.  After some 
consolidation, what completely eliminates the corruption is a combination of:
 * Switching to TWA_SIGNAL_NO_IPI in task_work_add() * 
 * Adding udelay(1000) in io_uring/rw.c:io_write(), right before the call to 
io_rw_init_file()

Adding a memory barrier instead of the udelay() doesn't work, nor does adding 
the delay without switching to NO_IPI.

[1] Keeping TWA_SIGNAL and adding the barrier instruction also works, but this 
is conceptually simpler to understand as to why it would have an effect at all

Reply via email to