Re: [Mesa-dev] [RFC] dynamic IB size tuning for radeonsi

Nicolai Hähnle Sun, 17 Apr 2016 11:13:46 -0700

On 15.04.2016 12:50, Grigori Goronzy wrote:

> apps that cause a lot of synchronization benefit from small IB
> sizes. The current IB size is a bit on the large side for this class
> of apps. On the other hand, if there isn't much synchronization going
> on, increasing the IB size can slightly improve performance, too.
>
> Here's a quick hack that tunes the IB size based on feedback from
> buffer_wait_time. What do you think? I see good results with Unigine
> Heaven (no synchronization, benefits from larger IB size), Metro Last
> Light (lots of synchronization, benefits from small IBs) as well as
> OpenArena and Xonotic (same).

Interesting, and thanks for poking at this issue. I've been thinking
about tuning IB sizes as well. I'd like for us to get this right, so I
wonder: What's your theory for _why_ your change helps?

I'll be honest with you: Right now, I think your approach contains too
much unexplained "magic". What's the theory that explains using buffer
wait averages in this way?

My theory for why your change helps is about CPU/GPU parallelism. When
we wait for buffer idle, this most likely means the GPU becomes idle.[1]
If you use a large IB to start the GPU up again, you'll wait a longer
time before the GPU starts doing work again. Basically, in ASCII art:


                GPU idle
GPU =========+..............+=====
      |      |              |
CPU ==+......+==============+=====
       buffer
        wait

By reducing the size of the IB, the picture changes like this:

             GPU idle
GPU =========+......+=====
      |      |      |
CPU ==+......+======+=====
       buffer
       wait

It takes a shorter amount of CPU time before the GPU gets new work, the
GPU is utilized more fully and the program runs faster.
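
As a rough illustration of the two timelines above, here is a toy model
in Python. It is only a sketch, not driver code: the function name, the
per-unit cost parameters and the optional per-submission overhead are
all assumptions made up for the example. It starts with the GPU idle
(as right after a buffer wait) and counts how long the GPU sits without
work while the CPU builds each IB.

```python
def simulate(ib_sizes, cpu_cost, gpu_cost, submit_cost=0):
    """Toy CPU/GPU timeline; returns (finish_time, gpu_idle_time).

    ib_sizes    -- sizes of the IBs, built and submitted in this order
    cpu_cost    -- hypothetical CPU time to build one unit of IB
    gpu_cost    -- hypothetical GPU time to execute one unit of IB
    submit_cost -- hypothetical fixed CPU overhead per submitted IB
    """
    cpu_time = 0   # time at which the CPU submits the current IB
    gpu_free = 0   # time at which the GPU finishes everything submitted
    gpu_idle = 0   # total time the GPU spent waiting for the CPU
    for size in ib_sizes:
        # The CPU builds the IB and then submits it.
        cpu_time += size * cpu_cost + submit_cost
        if cpu_time > gpu_free:
            # The GPU ran out of work before this IB arrived.
            gpu_idle += cpu_time - gpu_free
            gpu_free = cpu_time
        # The GPU executes the IB once it is both submitted and reached.
        gpu_free += size * gpu_cost
    return gpu_free, gpu_idle
```

With these made-up costs, simulate([8], 1, 2) gives (24, 8), while the
same amount of work split into four IBs, simulate([2, 2, 2, 2], 1, 2),
gives (18, 2). Raising submit_cost far enough reverses the result,
which is the per-IB overhead trade-off.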

If this explanation is correct and all there is to it, then it suggests
the logic for when IBs should be shorter. Basically, we should use short
IBs when the GPU is idle.[2]

There are a bunch of different options. A simple one that comes closest
to what your patch does - without actually querying for GPU idle - is to
just make the first IB after each buffer wait a small one. The length of
the buffer wait doesn't seem important because what we need to address
is the fact that the GPU is idle. That's a boolean matter.
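
The "first IB after a buffer wait is small" idea could look roughly
like the following. This is a minimal sketch in Python rather than
driver C, and the class name, method names and both size constants are
hypothetical, not taken from radeonsi. The point it shows is that only
a flag is kept, not the length of the wait.

```python
SMALL_IB_DW = 4 * 1024    # hypothetical small IB size limit, in dwords
LARGE_IB_DW = 16 * 1024   # hypothetical default IB size limit, in dwords


class IbSizer:
    """Picks the size limit for the next IB from a single boolean."""

    def __init__(self):
        self.waited = False

    def on_buffer_wait(self):
        """Call whenever the CPU had to block on a busy buffer."""
        self.waited = True

    def next_ib_size(self):
        """Return the size limit for the IB being started.

        Only the first IB after a wait is small; the flag is cleared
        so that the following IBs go back to the large size.
        """
        size = SMALL_IB_DW if self.waited else LARGE_IB_DW
        self.waited = False
        return size
```

Several buffer waits before the next IB starts still produce exactly
one small IB, since the flag carries no count or duration.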

Because of [1] it would probably be a better approach to use fences to
determine whether or how many previous IBs are still in flight.
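
A fence-based variant might be sketched as below, again in Python and
again with invented names. The is_signaled callback stands in for
whatever fence query the winsys provides, and the min_queued threshold
and size constants are hypothetical. The sketch also assumes that
fences on one ring signal in submission order, so completed ones can be
dropped from the front of the queue.

```python
from collections import deque


class FenceTracker:
    """Counts how many previously submitted IBs are still in flight."""

    def __init__(self, is_signaled):
        # is_signaled(fence) -> bool is a stand-in for a real fence query.
        self.is_signaled = is_signaled
        self.pending = deque()

    def submitted(self, fence):
        """Record the fence of an IB that was just submitted."""
        self.pending.append(fence)

    def in_flight(self):
        """Drop fences that have signaled and count the remaining ones."""
        while self.pending and self.is_signaled(self.pending[0]):
            self.pending.popleft()
        return len(self.pending)


def pick_ib_size(in_flight, small=4 * 1024, large=16 * 1024, min_queued=2):
    """Use a large IB only while enough work is queued to keep the GPU busy."""
    return large if in_flight >= min_queued else small
```

Unlike the buffer-wait flag, this also covers the case from [1], where
a wait on an old buffer does not mean the GPU has run out of work.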


Cheers,
Nicolai

[1] Although not necessarily. We may be trying to map a buffer that is
still in flight, but only referenced by an older IB.

[2] Or about to become idle, since we want to keep the pipeline full.
Neither applies, though, in the rare case where the CPU driver
overhead for constructing the IBs is consistently higher than the GPU
work that needs to be done for those IBs. In that case, we should still
use large IBs to reduce the driver overhead.

> Note: this patch applies on top of Bas' constant engine patchset.
>
> Grigori

_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
