Hello,

When profiling my workload on an AMD E-350 (PALM GPU) to see why it still
wasn't performing well with Jerome's WIP macrotiling patches, I noticed that
r600_fence_finish was taking 10% of my CPU time. I determined experimentally
that changing from sched_yield() to os_time_sleep(10) fixed this and
resolved my last performance issue on AMD Fusion as compared to Intel Atom,
but felt that this was hacky.
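
To make the difference concrete, here is a minimal userspace sketch of the two polling strategies. It is not the Mesa code: fence_signalled, signal_fence_later, wait_for_fence and run_once are hypothetical names, a helper thread stands in for the GPU, and I'm assuming the argument to os_time_sleep is in microseconds.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical stand-in for a fence that the GPU would signal. */
static atomic_bool fence_signalled;

/* Sleep for the given number of microseconds. */
static void sleep_us(long us)
{
    struct timespec ts = {
        .tv_sec = us / 1000000,
        .tv_nsec = (us % 1000000) * 1000,
    };
    nanosleep(&ts, NULL);
}

/* Plays the part of the GPU: signals the fence after about 20 ms. */
static void *signal_fence_later(void *arg)
{
    (void)arg;
    sleep_us(20000);
    atomic_store(&fence_signalled, true);
    return NULL;
}

/*
 * Poll until the fence is signalled and return how many polls it took.
 * With use_sleep false the loop yields between polls, so it keeps
 * getting rescheduled whenever nothing else is runnable; with
 * use_sleep true it sleeps 10 us between polls and leaves the CPU idle.
 */
static unsigned long wait_for_fence(bool use_sleep)
{
    unsigned long polls = 0;

    while (!atomic_load(&fence_signalled)) {
        polls++;
        if (use_sleep)
            sleep_us(10);
        else
            sched_yield();
    }
    return polls;
}

/* Reset the fence, start the fake GPU and wait for it with one strategy. */
static unsigned long run_once(bool use_sleep)
{
    pthread_t gpu;
    int rc;
    unsigned long polls;

    atomic_store(&fence_signalled, false);
    rc = pthread_create(&gpu, NULL, signal_fence_later, NULL);
    assert(rc == 0);
    polls = wait_for_fence(use_sleep);
    pthread_join(gpu, NULL);
    return polls;
}
```

Comparing run_once(false) with run_once(true) on an otherwise idle machine should show the yield variant polling far more often for the same wait, which would be consistent with the CPU time I saw in the profile.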

I've therefore tried to use INT_SEL of 0b10 in the EVENT_WRITE_EOP in Mesa,
combined with a new ioctl to wait for a changed value, but it's not working
the way I would expect. I'll be sending patches as replies to this message,
so that you can see exactly what I've done, but in brief, I have an ioctl
that uses wait_event to wait for a chosen offset in a BO to change
value. I've added a suitable waitqueue, and made radeon_fence_process call
wake_up_all.
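
The wait-for-change semantics I'm after can be sketched in userspace as below. This is only an analogue, not the kernel code from the patches: a pthread condition variable stands in for the waitqueue, pthread_cond_broadcast for wake_up_all, and bo_word, wait_for_change, fence_process, gpu_thread and demo are hypothetical names. Both threads share ordinary coherent memory here, so the sketch cannot reproduce any GPU/CPU coherency problem.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdint.h>
#include <time.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t waitq = PTHREAD_COND_INITIALIZER; /* waitqueue stand-in */
static uint32_t bo_word; /* stand-in for the watched offset in the BO */

/*
 * Block until the watched word differs from 'old' and return the new
 * value. The condition is re-checked after every wake-up, in the same
 * way that wait_event re-evaluates its condition.
 */
static uint32_t wait_for_change(uint32_t old)
{
    uint32_t seen;

    pthread_mutex_lock(&lock);
    while (bo_word == old)
        pthread_cond_wait(&waitq, &lock);
    seen = bo_word;
    pthread_mutex_unlock(&lock);
    return seen;
}

/* Interrupt-path stand-in: publish the new value and wake all waiters. */
static void fence_process(uint32_t new_value)
{
    pthread_mutex_lock(&lock);
    bo_word = new_value;
    pthread_mutex_unlock(&lock);
    pthread_cond_broadcast(&waitq);
}

/* Plays the part of the GPU: writes the value after about 5 ms. */
static void *gpu_thread(void *arg)
{
    struct timespec ts = { .tv_sec = 0, .tv_nsec = 5000000 };

    nanosleep(&ts, NULL);
    fence_process(*(uint32_t *)arg);
    return NULL;
}

/* Start the fake GPU and wait until the watched word changes. */
static uint32_t demo(uint32_t new_value)
{
    pthread_t gpu;
    uint32_t old = bo_word; /* no other thread is running yet */
    uint32_t seen;
    int rc;

    rc = pthread_create(&gpu, NULL, gpu_thread, &new_value);
    assert(rc == 0);
    seen = wait_for_change(old);
    pthread_join(gpu, NULL);
    return seen;
}
```

Calling demo(7) blocks until the helper thread has written 7 and then returns it. In this model a late return can only mean a late wake-up, which is the distinction I'm trying to draw on the real hardware.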

I'm seeing behaviour from this that I can't explain. As you'll see in the
patches, I've moved some IRQ prints from DRM_DEBUG to printk(KERN_INFO), and
it looks as though I don't get the EOP interrupt in a timely fashion. I can
see two possible causes: either memory is not as coherent between the GPU
and CPU as I would like, so I'm reading stale data when I call wait_event,
or the interrupt is genuinely delayed.
