Hi Brian,

On Mon, Mar 18, 2019 at 04:59:21PM +0000, Brian Starkey wrote:
> Hi Laurent,
>
> Sorry for the delay, I was travelling last week and didn't find a
> chance to digest your diagrams (thanks for all the detail!)

No worries, and thank you for the time you're spending on this. In the
meantime I've found a solution that solves the race, using an entirely
different mechanism. It's all explained in v6 of the patch series.

> On Fri, Mar 08, 2019 at 02:24:40PM +0200, Laurent Pinchart wrote:
> > On Thu, Mar 07, 2019 at 12:28:08PM +0000, Brian Starkey wrote:
> > > On Wed, Mar 06, 2019 at 08:22:44PM +0200, Laurent Pinchart wrote:
> > >
> > > [snip]
> > >
> > > > I can always queue a new one, but I have no way of telling if the
> > > > newly queued list raced with the frame end interrupt.
> > > >
> > > > There's another register I'm not using that contains a shadow copy
> > > > of the registers list DMA address. It mirrors the address
> > > > programmed by the driver when there is no DMA of the registers
> > > > list in progress, and contains the address of the registers list
> > > > being DMA'ed when a DMA is in progress. I don't think I can do
> > > > much with it, as reading it either before or after reprogramming
> > > > (to check if I can reprogram or if the reprogram has been taken
> > > > into account) is racy.
> > >
> > > When you say it mirrors the address programmed, is that latched when
> > > the update is accepted, or will it update the moment you program the
> > > address register?
> >
> > It is latched when the update is processed, at the next vblank
> > following the address programming. The timing diagram below shows
> > vblank, the UPD bit I use for synchronization, the registers list
> > address register exposed to the CPU, and the internal address used to
> > DMA the registers list. The address register is written four times to
> > show how the hardware reacts, but in practice there will be a single
> > write per frame, right after the beginning of vblank.
> >
> >                                      DMA starts
> >                                      |         DMA ends
> >                                      |         |
> >                                      V         V
> >                                      ___________
> > VBLANK        ______________________|           |________
> >                     ________________     ________________
> > UPD           _____|                |___|
> >               _____ ______ _____________ ________ _______
> > ADDR register __A__X__B___X______C______X___D____X___E___
> >               ______________________ ____________________
> > ADDR internal ___________A__________X__________C_________
> >
> > I can reprogram the address any number of times I want before the
> > vblank, but the update bit mechanism only lets me protect the race
> > related to the first write. For any subsequent write I won't be able
> > to tell whether it was complete before the hardware started the
> > frame, or arrived too late.

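To make the mechanism above a bit more concrete, here is a rough sketch of
how the UPD bit can be used for the first write of a frame. The register
names, offsets and accessors are made up purely for illustration and don't
correspond to the real hardware:

  #include <stdbool.h>
  #include <stdint.h>

  /*
   * Hypothetical register layout, for illustration only. STATUS_UPD is
   * set when the address register is written and cleared by the hardware
   * when it latches the address at the next vblank.
   */
  #define REG_LIST_ADDR   0x00       /* registers list address (CPU view) */
  #define REG_STATUS      0x04
  #define STATUS_UPD      (1u << 0)

  static volatile uint32_t *mmio;    /* assumed mapped elsewhere */

  static uint32_t read_reg(unsigned int offset)
  {
          return mmio[offset / 4];
  }

  static void write_reg(unsigned int offset, uint32_t value)
  {
          mmio[offset / 4] = value;
  }

  /*
   * Queue a registers list for the next frame. This is only race-free for
   * the first write of a frame: UPD == 0 beforehand, so after the write
   * UPD == 1 means "not latched yet" and UPD == 0 means "latched". For a
   * second write in the same frame UPD already reads 1, so its value no
   * longer tells us on which side of the vblank the write landed.
   */
  static bool queue_registers_list(uint32_t list_dma_addr)
  {
          if (read_reg(REG_STATUS) & STATUS_UPD)
                  return false;      /* a previous write is still pending */

          write_reg(REG_LIST_ADDR, list_dma_addr);
          return true;
  }
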
> > > I assume the latter or you would have thought of this yourself (that
> > > seems like a really strange/not-useful behaviour!). But if it is the
> > > former you could:
> > >
> > > - In writeback start-of-frame, create a copy of the register list,
> > >   disabling writeback.
> > > - Write the address of this copy to the register
> > >
> > > If/when an atomic commit comes in before you service the next
> > > end-of-frame:
> > >
> > > - Write the address of the new register list
> > > - Check the status register. If the "pending" bit is zero, you know
> > >   you had a potential race.
> > > - Check the DMA address register. If it corresponds to the new
> > >   scene, the new commit won the race, otherwise it's been delayed
> > >   by a frame.
> >
> > The issue here is that there's a potential race if the pending (UPD)
> > bit is one too. If the commit arrives just before vblank, but the
> > address is written just after vblank, after the UPD bit has been
> > reset to 0 by the hardware, but before the vblank interrupt is
> > processed, then the commit won't be applied yet, and I will have no
> > way to know. Compare the two following scenarios:
> >
> >                   [1]       [2] [3]           [4]       [5]
> >                    |         |   |             |         |
> >                    |         V   V             |         V
> >                    V          __________       V          __________
> > VBLANK        _______________|          |________________|          |__
> >                     _________     _______________________
> > UPD           _____|         |___|                       |_____________
> >               _____ _____________ _____________ _______________________
> > ADDR register __A__X______B______X______C______X___________D___________
> >               ___________________________________________ _____________
> > ADDR internal _______A_______X______B____________________X______D______
> >
> > [1] Atomic commit, registers list address write, with writeback enabled
> > [2] DMA starts for first atomic commit
> > [3] Writeback disable registers list address write
> > [4] Next atomic commit, with writeback disabled
> > [5] DMA starts for second atomic commit
> >
> >                   [1]       [2] [3]                     [4][5]
> >                    |         |   |                       |  |
> >                    |         V   V                       V  V
> >                    V          __________                  __________
> > VBLANK        _______________|          |________________|          |__
> >                     _________     _______________________    __________
> > UPD           _____|         |___|                       |__|
> >               _____ _____________ __________________________ __________
> > ADDR register __A__X______B______X_____________C____________X_____D____
> >               ___________________________________________ _____________
> > ADDR internal _______A_______X______B____________________X______C______
> >
> > [1] Atomic commit, registers list address write, with writeback enabled
> > [2] DMA starts for first atomic commit
> > [3] Writeback disable registers list address write
> > [4] DMA starts for writeback disable registers list (3)
> > [5] Next atomic commit, with writeback disabled, performed right after
> >     vblank but before the vblank interrupt is serviced
> >
> > The UPD bit is 1 after writing the ADDR register the second time in
> > both cases. Furthermore, if [4] and [5] are very close in the second
> > case, the UPD bit may read 1 just before [5] if the read comes before
> > [4]:
> >
> >     read UPD bit;
> >     /* VBLANK [4] */
> >     write ADDR register;
> >
> > I thus can't rely on UPD = 1 before the write meaning that the write
> > was performed before vblank, and I can't rely either on the UPD bit
> > after the write, as it's 1 in both cases.
>
> My mistake, I got the UPD bit the wrong way around. I'm still not
> entirely sure why you can't use "ADDR internal" to determine which
> side won the race. It shows 'B' in the first case, and 'C' in the
> second.

Because ADDR internal isn't available to the CPU :-(

> When a new commit comes, unconditionally:
> - Write new address
> - Read status
>
> if status.UPD == 0 --> You know for sure your new commit was just
> latched.
> if status.UPD == 1 --> You need to check ADDR internal to see which
> of these three happened:
>
> 1) Your first case happened. We're somewhere in the middle of the
>    frame. ADDR internal will show 'B', and you know commit 'D' is
>    going on-screen at the next vblank.
>
> 2) Your second case happened. The new commit raced with the
>    latching of writeback-disable and "lost". ADDR internal will
>    show 'C', and the new commit is delayed by a frame
>
> 3) (Teeny tiny small window) In-between reading status and ADDR
>    internal, the new commit was latched. ADDR internal will show
>    'D'. You know the new commit "won" so treat it the same as if
>    UPD == 0 (which it will be, now).

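Just to spell out the decision logic you describe (a sketch only, reusing
the made-up helpers and defines from the sketch earlier in this mail, and
keeping in mind that on this hardware the internal address has no
CPU-visible shadow register):

  #define REG_ADDR_INTERNAL  0x08  /* hypothetical shadow of ADDR internal */

  enum commit_outcome {
          COMMIT_ON_TIME,    /* the new list goes out at the next vblank */
          COMMIT_DELAYED,    /* lost the race, delayed by one frame */
  };

  /*
   * prev_list is the list currently being scanned out ('B' in the
   * diagrams), wb_disable_list is the writeback-disable copy ('C'), and
   * new_list is the list for the new commit ('D').
   */
  static enum commit_outcome queue_new_commit(uint32_t prev_list,
                                              uint32_t wb_disable_list,
                                              uint32_t new_list)
  {
          uint32_t internal;

          write_reg(REG_LIST_ADDR, new_list);

          if (!(read_reg(REG_STATUS) & STATUS_UPD))
                  return COMMIT_ON_TIME;            /* latched right away */

          internal = read_reg(REG_ADDR_INTERNAL);

          if (internal == prev_list)                /* case 1: mid-frame */
                  return COMMIT_ON_TIME;
          if (internal == wb_disable_list)          /* case 2: lost the race */
                  return COMMIT_DELAYED;

          /* case 3: latched between the two reads, same as UPD == 0 */
          return COMMIT_ON_TIME;
  }

The read of REG_ADDR_INTERNAL in the middle is exactly what this hardware
doesn't give us, unfortunately.
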
>
> Anyway, it's all moot now that you've found the chained lists thing -
> that sounds ideal. I'll take a look at the new series shortly.

It's a neat hardware feature, yes. We were already using it for a
different purpose; I should have thought about it for writeback too from
the very beginning.

Please note I've sent a pull request for v7 as it has been fully
reviewed. Nonetheless, if you find issues, I can fix them on top.

> > I initially thought I could protect against the race using the
> > following procedure.
> >
> > - Single atomic commit, no IRQ delay
> >
> >                   [1]       [2]         [3]             [4]
> >                    |         |           |               |
> >                    |         V           V               V
> >                    V          __________                  __________
> > VBLANK        _______________|          |________________|          |__
> >                     _________
> > UPD           _____|         |_________________________________________
> >               _____ ___________________________________________________
> > ADDR register __A__X___________________B_______________________________
> >               _______________ _________________________________________
> > ADDR internal _______A_______X______B__________________________________
> >
> > [1] Atomic commit, registers list address write, with writeback enabled
> > [2] DMA starts for first atomic commit
> > [3] Frame start, disable writeback in-place in registers list B
> > [4] DMA starts for "patched" registers list, disables writeback
> >
> > - Two atomic commits, no IRQ delay
> >
> >                   [1]       [2]  [3]    [4]             [5]
> >                    |         |    |      |               |
> >                    |         V    V      V               V
> >                    V          __________                  __________
> > VBLANK        _______________|          |________________|          |__
> >                     _________      ______________________
> > UPD           _____|         |____|                      |_____________
> >               _____ ______________ ____________________________________
> > ADDR register __A__X______B_______X_________________C__________________
> >               _______________ ___________________________ _____________
> > ADDR internal _______A_______X_____________B_____________X______C______
> >
> > [1] Atomic commit, registers list address write, with writeback enabled
> > [2] DMA starts for first atomic commit
> > [3] Next atomic commit, registers list address write, with writeback
> >     disabled
> > [4] Frame start, disable writeback in-place in registers list B
> > [5] DMA starts for second atomic commit, disables writeback
> >
> > [3] and [4] can happen in any order, as long as they both come before
> > [5]. If [3] comes after [5], we're back to the previous case (Single
> > atomic commit, no IRQ delay).
> >
> > - Single atomic commit, IRQ delay
> >
> >                   [1]       [2]         [3]             [4] [5]
> >                    |         |           |               |   |
> >                    |         V           V               V   V
> >                    V          __________                  __________
> > VBLANK        _______________|          |________________|          |__
> >                     _________
> > UPD           _____|         |_________________________________________
> >               _____ ___________________________________________________
> > ADDR register __A__X___________________B_______________________________
> >               _______________ _________________________________________
> > ADDR internal _______A_______X______B__________________________________
> >
> > [1] Atomic commit, registers list address write, with writeback enabled
> > [2] DMA starts for first atomic commit
> > [3] Frame start, IRQ is delayed to [5]
> > [4] DMA starts for unmodified registers list, writeback still enabled
> > [5] Disable writeback in-place in registers list B, too late
> >
> > Here I need to detect that [5] was delayed after [4], and thus delay
> > the completion of the writeback job by one frame. This could be done
> > by checking the vblank interrupt status bit: if it is set, then vblank
> > occurred and raced [5]. However, the same condition can also occur
> > when no race happened, if processing of the vblank interrupt for [2]
> > is delayed until [3]. Both the vblank interrupt and the frame start
> > interrupt status bits will then be set, indicating a potential race.
> >
> > The time between [2] and [3] is very short compared to the time
> > between [3] and [4] and to interrupt latency in general, so we would
> > have lots of false positives.
> >

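For reference, the detection step that procedure would have needed looks
roughly like the sketch below (made-up names again, on top of the helpers
from the first sketch in this mail, and with an obviously made-up registers
list layout). The problem is the final check: the vblank status bit being
set can mean a real race with the next DMA, or merely that the vblank
interrupt for [2] hasn't been serviced yet.

  #define STATUS_VBLANK    (1u << 1)   /* vblank interrupt status, made up */

  #define WB_ENABLE_INDEX  42          /* made-up index of the writeback
                                          enable word within the list */
  #define WB_ENABLE_FLAG   (1u << 0)   /* made-up writeback enable bit */

  /*
   * Called when the frame start interrupt is serviced ([3] in the
   * diagrams above, or [5] when the interrupt is delayed). Returns true
   * if the writeback disable is known to have been applied in time, false
   * if it may have raced with the next DMA of the list and the writeback
   * completion must be delayed by one frame.
   */
  static bool disable_writeback_in_place(uint32_t *list_cpu)
  {
          /* Patch the registers list in place so that the next DMA of the
           * same list no longer enables writeback. */
          list_cpu[WB_ENABLE_INDEX] &= ~WB_ENABLE_FLAG;

          /*
           * If the vblank status bit is set, the next vblank may already
           * have occurred and the unpatched list may have been fetched.
           * But the bit may also still be set from the previous vblank
           * [2], whose interrupt simply hasn't been serviced yet, hence
           * the false positives described above.
           */
          return !(read_reg(REG_STATUS) & STATUS_VBLANK);
  }
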
> > > >> You don't happen to have a DMA engine trigger or something you
> > > >> could use to do the register list modification at a guaranteed
> > > >> time do you?
> > > >
> > > > Not that I know of, unfortunately.
> > > >
> > > >> Are you always going to be protected by an IOMMU, preventing the
> > > >> writeback from trashing physical memory? If that's not the case,
> > > >> then the race can have pretty dire consequences.
> > > >
> > > > If the IOMMU is enabled in DT, yes. It's a system-level decision.
> > >
> > > Well, it's your driver at the end of the day. But for me, a known
> > > race-condition which would cause random memory corruption sounds like
> > > a really Bad Thing. Halving frame-rate on systems with no IOMMU seems
> > > preferable to me.
> >
> > It is a really bad thing. I think the decision should be taken by
> > Renesas though.

--
Regards,

Laurent Pinchart

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel