Chad Versace <chad.vers...@linux.intel.com> writes:

> On 05/10/2013 10:16 AM, Eric Anholt wrote:
>> Chad Versace <chad.vers...@linux.intel.com> writes:
>>
>>> The driver was setting MOCS (Memory Object Control State) to 0 for all
>>> objects. This patch sets it as follows:
>>>      renderbuffer, depthbuffer => LLC uncacheable, L3 cacheable
>>>      texture, stencil, hiz => LLC cacheable, L3 cacheable
>>>
>>> The goal here is to avoid blowing out the LLC with too-large buffers.
>>>
>>> Performance gains:
>>>      Haswell Harris Beach GT3
>>>      Android 4.2.2
>>>      kernel based on 3.8-4fc7c97
>>>
>>>      GLBenchmark 2.5.1 Egypt HD C24Z16 Offscreen DXT1
>>>     +32.0309% +/- 0.775397%,  n = 5, 95% confidence
>>>
>>>      GLBenchmark 2.7 T-Rex HD C24Z16 Offscreen Fixed timestep ETC1
>>>     +20.2435% +/- 0.821163%,  n = 5, 95% confidence
>>>
>>> Tested-by: Matt Turner <matts...@gmail.com>
>>> Signed-off-by: Chad Versace <chad.vers...@linux.intel.com>
>>
>> There are two separate changes here:
>>
>> - Make textures L3 cacheable.
>> - Change the RB caching to something new.
>>
>> The L3 for textures seems obviously good.  That cache gets used for
>> almost nothing else currently (VS pull constants, which are small, and
>> the instruction cache, which is a bit larger but whose working set is
>> still very small at any time within a frame).
>>
>> The render cache change needs more data.  It seems obvious to me, and
>> the spec spells it out, that a change like this is trying to tune the
>> workload so that things that get cache hits less frequently (render
>> targets) don't get put in LLC such that their less-likely-to-hit
>> accesses push out something that would have been likely to have a cache
>> hit (texturing).
>>
>> So, what if your render targets and your textures *both* fit in LLC?
>> This change needs testing across multiple apps and resolutions.
>
> Can Mesa query for the size of the LLC? If so, then I'm considering assigning
> each surface's LLC cacheability per draw call by taking into account the LLC
> size and each surface's size and usage. What do you think? Of course, such a
> scheme still needs wide testing before committing.

Not without using CPUID or parsing /proc/cpuinfo, but given that it's
all heuristics, guessing "about 3MB on this CPU" or whatever sounds fine
to me.
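
To make the heuristic concrete, here is a rough sketch, not taken from
the patch. All names in it (surf_usage, surf_info, parse_cache_size,
choose_llc_cacheability) are hypothetical. It assumes the LLC size is
either hard-coded or taken from a string such as the value of the
"cache size" field in /proc/cpuinfo, and that a surface is made LLC
cacheable when everything bound for the draw fits in LLC, or otherwise
only when its usage is one of the likely-to-hit kinds.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical usage classes, following the split in the patch description. */
enum surf_usage {
    SURF_TEXTURE,
    SURF_STENCIL,
    SURF_HIZ,
    SURF_RENDERBUFFER,
    SURF_DEPTHBUFFER,
};

/* Hypothetical per-surface record; not a real Mesa structure. */
struct surf_info {
    enum surf_usage usage;
    uint64_t size_bytes;
    bool llc_cacheable;   /* output of choose_llc_cacheability() */
};

/* Parse a size such as "6144 KB", "3072K" or "6M" into bytes.
 * Returns 0 if the string does not start with a number. */
uint64_t parse_cache_size(const char *s)
{
    char *end;
    unsigned long long value = strtoull(s, &end, 10);

    if (end == s)
        return 0;
    while (*end == ' ')
        end++;
    if (*end == 'K' || *end == 'k')
        return (uint64_t)value << 10;
    if (*end == 'M' || *end == 'm')
        return (uint64_t)value << 20;
    return (uint64_t)value;
}

/* Assumed policy: if all surfaces for the draw fit in LLC together, cache
 * all of them; otherwise keep only textures, stencil and HiZ in LLC. */
void choose_llc_cacheability(struct surf_info *surfs, size_t count,
                             uint64_t llc_bytes)
{
    uint64_t total = 0;

    assert(surfs != NULL || count == 0);

    for (size_t i = 0; i < count; i++)
        total += surfs[i].size_bytes;

    bool all_fit = total <= llc_bytes;

    for (size_t i = 0; i < count; i++) {
        bool likely_to_hit = surfs[i].usage == SURF_TEXTURE ||
                             surfs[i].usage == SURF_STENCIL ||
                             surfs[i].usage == SURF_HIZ;
        surfs[i].llc_cacheable = all_fit || likely_to_hit;
    }
}
```

With a guessed 3MB LLC, a 1MB texture plus an 8MB render target would
leave only the texture in LLC, while a 1MB texture plus a 1MB render
target would cache both. The threshold and the usage split are guesses
and would need the same cross-app testing as the original patch.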

_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev
