I'm surprised this made such a big difference. It should not take a
long time to flush the L2T -- no memory access should be involved, it
should just be a matter of the L2T iterating over its lines clearing
the tags. This should take thousands of V3D cycles, not milliseconds.
So the 3-4ms stall see
According to Dave, once you've started an L2T flush, all L2T accesses
will be blocked until the flush completes, so the driver does not need
to wait for it explicitly. Dropping the wait fixes a consistent 3-4ms
stall between the ioctl and running the job, and 3DMMES Taiji goes
from 27fps to 110fps.

Signed-off-by: Eric Anholt
Fixes: 57692c94dcbe ("drm/v3d: Introd