On 15/02/2017 16:33, Chris Wilson wrote:
On Wed, Feb 15, 2017 at 04:06:34PM +0000, Tvrtko Ursulin wrote:
+static inline u32 *gen8_emit_pipe_control(u32 *batch, u32 flags, u32 offset)
+{
+       static const u32 pc6[6] = { GFX_OP_PIPE_CONTROL(6), 0, 0, 0, 0, 0 };
+
+       memcpy(batch, pc6, sizeof(pc6));
+
+       batch[1] = flags;
+       batch[2] = offset;
+
+       return batch + 6;

godbolt would seem to say it is best to use
static inline u32 *gen8_emit_pipe_control(u32 *batch, u32 flags, u32 offset)
{
        batch[0] = GFX_OP_PIPE_CONTROL(6);
        batch[1] = flags;
        batch[2] = offset;
        batch[3] = 0;
        batch[4] = 0;
        batch[5] = 0;

        return batch + 6;
}

Yeah, agreed, it was a bit silly. I misremembered it having quite a good effect on the optimisation gcc was able to do, but I couldn't reproduce that.

How about replacing the last three assignments with memset(&batch[3], 0, 3 * sizeof(u32)), though? That does indeed help on 64-bit.
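
For concreteness, a minimal userspace sketch of that variant. The u32 typedef and the GFX_OP_PIPE_CONTROL definition here are stand-ins assumed for illustration so the snippet compiles outside the kernel; they are not taken from this patch, and whether the memset actually produces better code will depend on the compiler and target.

```c
#include <stdint.h>
#include <string.h>

typedef uint32_t u32; /* userspace stand-in for the kernel's u32 */

/* Stand-in for the driver's macro; the opcode value is assumed for illustration only. */
#define GFX_OP_PIPE_CONTROL(len) (0x7a000000u | ((len) - 2))

static inline u32 *gen8_emit_pipe_control(u32 *batch, u32 flags, u32 offset)
{
        batch[0] = GFX_OP_PIPE_CONTROL(6);
        batch[1] = flags;
        batch[2] = offset;
        /* Zero the three trailing dwords with one call instead of three stores. */
        memset(&batch[3], 0, 3 * sizeof(u32));

        return batch + 6;
}
```

The caller-visible behaviour should be the same as with the explicit assignments: six dwords are written and the returned pointer is advanced past them.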

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
