On Mon, Feb 25, 2019 at 10:32:27AM -0800, Kenneth Graunke wrote: > On Monday, February 25, 2019 6:33:11 AM PST Pohjolainen, Topi wrote: > > On Thu, Nov 01, 2018 at 08:04:21PM -0700, Kenneth Graunke wrote: > > > This implements virtually all documented PIPE_CONTROL restrictions > > > in a centralized helper. You now simply ask for the operations you > > > want, and the pipe control "brain" will figure out exactly what pipe > > > controls to emit to make that happen without tanking your system. > > > > > > The hope is that this will fix some intermittent flushing issues as > > > well as GPU hangs. However, it also has a high risk of causing GPU > > > hangs and other regressions, as this is a particularly sensitive > > > area and poking the bear isn't always advisable. > > > > First I checked I could find all the things in bspec. There was one that I > > couldn't, noted further down. > > > > Second I checked that all the rules earlier were implemented. Found one > > exception, noted further down as well. > > > > I didn't check if the rules still miss something in bspec. That would be > > another exercise... > > [snip] > > > + /* Recursive PIPE_CONTROL workarounds -------------------------------- > > > + * (http://knowyourmeme.com/memes/xzibit-yo-dawg) > > > + * > > > + * We do these first because we want to look at the original > > > operation, > > > + * rather than any workarounds we set. > > > + */ > > > + if (GEN_GEN == 6 && (flags & PIPE_CONTROL_RENDER_TARGET_FLUSH)) { > > > + /* Hardware workaround: SNB B-Spec says: > > > + * > > > + * "[Dev-SNB{W/A}]: Before a PIPE_CONTROL with Write Cache Flush > > > + * Enable = 1, a PIPE_CONTROL with any non-zero post-sync-op is > > > + * required." > > > + */ > > > + brw_emit_post_sync_nonzero_flush(brw); > > > + } > > > + > > > + if (GEN_GEN == 9 && (flags & PIPE_CONTROL_VF_CACHE_INVALIDATE)) { > > > + /* The PIPE_CONTROL "VF Cache Invalidation Enable" bit description > > > + * lists several workarounds: > > > + * > > > + * "Project: SKL, KBL, BXT > > > + * > > > + * If the VF Cache Invalidation Enable is set to a 1 in a > > > + * PIPE_CONTROL, a separate Null PIPE_CONTROL, all bitfields > > > + * sets to 0, with the VF Cache Invalidation Enable set to 0 > > > + * needs to be sent prior to the PIPE_CONTROL with VF Cache > > > + * Invalidation Enable set to a 1." > > > + */ > > > + genX(emit_raw_pipe_control)(brw, 0, NULL, 0, 0); > > > + } > > > + > > > + if (GEN_GEN == 9 && IS_COMPUTE_PIPELINE(brw) && post_sync_flags) { > > > + /* Project: SKL / Argument: LRI Post Sync Operation [23] > > > + * > > > + * "PIPECONTROL command with “Command Streamer Stall Enable” must > > > be > > > + * programmed prior to programming a PIPECONTROL command with "LRI > > > + * Post Sync Operation" in GPGPU mode of operation (i.e when > > > + * PIPELINE_SELECT command is set to GPGPU mode of operation)." > > > + * > > > + * The same text exists a few rows below for Post Sync Op. > > > + */ > > > + genX(emit_raw_pipe_control)(brw, PIPE_CONTROL_CS_STALL, bo, > > > offset, imm); > > > > Are bo, offset, imm needed here as well? > > No, I don't think so. We should pass NULL, 0, 0 here - we're just doing > a plain CS stall separately. We'll use them for the actual write later. > > Good catch! > > [snip] > > > - if (GEN_GEN >= 9) { > > > - /* THE PIPE_CONTROL "VF Cache Invalidation Enable" docs > > > continue: > > > - * > > > - * "Project: BDW+ > > > - * > > > - * When VF Cache Invalidate is set “Post Sync Operation” > > > must > > > - * be enabled to “Write Immediate Data” or “Write PS > > > Depth > > > - * Count” or “Write Timestamp”." > > > - * > > > - * If there's a BO, we're already doing some kind of write. > > > - * If not, add a write to the workaround BO. > > > - * > > > - * XXX: This causes GPU hangs on Broadwell, so restrict it to > > > - * Gen9+ for now...see this bug for more information: > > > - * https://bugs.freedesktop.org/show_bug.cgi?id=103787 > > > > In "Flush Types" workarounds later on you apply this for gen8 as well. > > Yes, that's intentional - we're supposed to according to the docs. > I re-tested the Piglit test from bug 103787 on my Broadwell, and it > works fine - no GPU hangs. I think we were just doing it wrong before. > > Trying to figure out an ordering for the workarounds is awful... :(
What would you think about another patch just before this to enable that for gen8? Just in case it causes problems it would bisect to much smaller patch. > > > > - */ > > > - if (!bo) { > > > - flags |= PIPE_CONTROL_WRITE_IMMEDIATE; > > > - bo = brw->workaround_bo; > > > - } > > > - } > > > + if (GEN_IS_HASWELL) { > > > + /* From the PIPE_CONTROL page itself: > > > + * > > > + * "HSW - Programming Note: PIPECONTROL with RO Cache > > > Invalidation: > > > + * Prior to programming a PIPECONTROL command with any of the > > > RO > > > + * cache invalidation bit set, program a PIPECONTROL flush > > > command > > > + * with “CS stall” bit and “HDC Flush” bit set." > > > + * > > > + * TODO: Actually implement this. What's an HDC Flush? > > > > There is bit 9 HDC Flush but that is defined for ICL, for HSW I couldn't > > find > > anything either. > > Yeah...I figured I'd call it out, but...I don't think we're doing > anything today either, so this is at least no worse. > > > > + */ > > > + } > > > + > > > + if (flags & PIPE_CONTROL_FLUSH_LLC) { > > > + /* From the PIPE_CONTROL instruction table, bit 26 (Flush LLC): > > > + * > > > + * "Project: ALL > > > + * SW must always program Post-Sync Operation to "Write > > > Immediate > > > + * Data" when Flush LLC is set." > > > + * > > > + * For now, we just require the caller to do it. > > > + */ > > > + assert(flags & PIPE_CONTROL_WRITE_IMMEDIATE); > > > + } > > > + > > > + /* "Post-Sync Operation" workarounds -------------------------------- > > > */ > > > + > > > + /* Project: All / Argument: Global Snapshot Count Reset [19] > > > + * > > > + * "This bit must not be exercised on any product. > > > + * Requires stall bit ([20] of DW1) set." > > > + * > > > + * We don't use this, so we just assert that it isn't used. The > > > + * PIPE_CONTROL instruction page indicates that they intended this > > > + * as a debug feature and don't think it is useful in production, > > > + * but it may actually be usable, should we ever want to. > > > + */ > > > + assert((flags & PIPE_CONTROL_GLOBAL_SNAPSHOT_COUNT_RESET) == 0); > > > + > > > + if (flags & (PIPE_CONTROL_MEDIA_STATE_CLEAR | > > > + PIPE_CONTROL_INDIRECT_STATE_POINTERS_DISABLE)) { > > > + /* Project: All / Arguments: > > > + * > > > + * - Generic Media State Clear [16] > > > + * - Indirect State Pointers Disable [16] > > > + * > > > + * "Requires stall bit ([20] of DW1) set." > > > + * > > > + * Also, the PIPE_CONTROL instruction table, bit 16 (Generic Media > > > + * State Clear) says: > > > + * > > > + * "PIPECONTROL command with “Command Streamer Stall Enable” > > > must be > > > + * programmed prior to programming a PIPECONTROL command with > > > "Media > > > + * State Clear" set in GPGPU mode of operation" > > > + * > > > + * This is a subset of the earlier rule, so there's nothing to do. > > > + */ > > > + flags |= PIPE_CONTROL_CS_STALL; > > > + } > > > + > > > + if (flags & PIPE_CONTROL_STORE_DATA_INDEX) { > > > + /* Project: All / Argument: Store Data Index > > > + * > > > + * "Post-Sync Operation ([15:14] of DW1) must be set to something > > > other > > > + * than '0'." > > > + * > > > + * For now, we just assert that the caller does this. We might > > > want to > > > + * automatically add a write to the workaround BO... > > > + */ > > > + assert(non_lri_post_sync_flags != 0); > > > + } > > > + > > > + if (flags & PIPE_CONTROL_SYNC_GFDT) { > > > + /* Project: All / Argument: Sync GFDT > > > + * > > > + * "Post-Sync Operation ([15:14] of DW1) must be set to something > > > other > > > + * than '0' or 0x2520[13] must be set." > > > + * > > > + * For now, we just assert that the caller does this. > > > + */ > > > + assert(non_lri_post_sync_flags != 0); > > > + } > > > + > > > + if (IS_GENx10_BETWEEN(60, 75) && (flags & > > > PIPE_CONTROL_TLB_INVALIDATE)) { > > > + /* Project: SNB, IVB, HSW / Argument: TLB inv > > > + * > > > + * "{All SKUs}{All Steppings}: Post-Sync Operation ([15:14] of DW1) > > > + * must be set to something other than '0'." > > > + * > > > + * For now, we just assert that the caller does this. > > > + */ > > > + assert(non_lri_post_sync_flags != 0); > > > + } > > > + > > > + if (GEN_GEN >= 7 && (flags & PIPE_CONTROL_TLB_INVALIDATE)) { > > > + /* Project: IVB+ / Argument: TLB inv > > > + * > > > + * "Requires stall bit ([20] of DW1) set." > > > + * > > > + * Also, from the PIPE_CONTROL instruction table: > > > + * > > > + * "Project: SKL+ > > > + * Post Sync Operation or CS stall must be set to ensure a TLB > > > + * invalidation occurs. Otherwise no cycle will occur to the > > > TLB > > > + * cache to invalidate." > > > + * > > > + * This is not a subset of the earlier rule, so there's nothing to > > > do. > > > + */ > > > + flags |= PIPE_CONTROL_CS_STALL; > > > + } > > > + > > > + if (GEN_GEN == 9 && devinfo->gt == 4) { > > > + /* TODO: The big Skylake GT4 post sync op workaround */ > > > + } > > > + > > > + /* "GPGPU specific workarounds" (both post-sync and flush) > > > ------------ */ > > > + > > > + if (IS_COMPUTE_PIPELINE(brw)) { > > > + if (GEN_GEN >= 9 && (flags & > > > PIPE_CONTROL_TEXTURE_CACHE_INVALIDATE)) { > > > + /* Project: SKL+ / Argument: Tex Invalidate > > > + * "Requires stall bit ([20] of DW) set for all GPGPU > > > Workloads." > > > + */ > > > + flags |= PIPE_CONTROL_CS_STALL; > > > } > > > > > > - if (GEN_GEN == 10) > > > - gen10_add_rcpfe_workaround_bits(&flags); > > > - } else if (GEN_GEN >= 6) { > > > - if (GEN_GEN == 6 && > > > - (flags & PIPE_CONTROL_RENDER_TARGET_FLUSH)) { > > > - /* Hardware workaround: SNB B-Spec says: > > > + if (GEN_GEN == 8 && (post_sync_flags || > > > + (flags & (PIPE_CONTROL_NOTIFY_ENABLE | > > > + PIPE_CONTROL_DEPTH_STALL | > > > + PIPE_CONTROL_RENDER_TARGET_FLUSH | > > > + PIPE_CONTROL_DEPTH_CACHE_FLUSH | > > > + PIPE_CONTROL_DATA_CACHE_FLUSH)))) { > > > + /* Project: BDW / Arguments: > > > * > > > - * [Dev-SNB{W/A}]: Before a PIPE_CONTROL with Write Cache > > > Flush > > > - * Enable = 1, a PIPE_CONTROL with any non-zero post-sync-op > > > is > > > - * required. > > > + * - LRI Post Sync Operation [23] > > > + * - Post Sync Op [15:14] > > > + * - Notify En [8] > > > + * - Depth Stall [13] > > > + * - Render Target Cache Flush [12] > > > + * - Depth Cache Flush [0] > > > + * - DC Flush Enable [5] > > > + * > > > + * "Requires stall bit ([20] of DW) set for all GPGPU and > > > Media > > > + * Workloads." > > > > This I couldn't find. > > Ah, sorry, this could probably be clearer. > > On the "Programming Restrictions for PIPE_CONTROL" subpages ("Post-Sync > Operation", "Flush Types"), there are separate table rows for each bit, > with the same workaround text: > > +--------------------------------------------------------------------+ > | LRI Post | 23 | Requires stall bit ([20] of DW) set for all | BDW | > | Sync | | GPGPU and Media Workloads. | | > | Operation | | | | > +--------------------------------------------------------------------+ > > I just grouped them together because they're all the same workaround... > just documented in 8 different table rows. > > I'm definitely open to suggestions on how to improve the comment :) Ah, now I got it. Makes sense. Perhaps just that it combines the rules for BDW from the two tables. > > Thanks so much for looking at this! I owe you! No problem, writing this patch in the first place must have not been that easy :) Reviewed-by: Topi Pohjolainen <topi.pohjolai...@intel.com> _______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev