On Thu, Sep 6, 2018 at 5:54 PM, Roland Scheidegger <srol...@vmware.com> wrote:
> On 06.09.2018 at 22:56, Axel Davy wrote:
>> Yeah, by pinning to cores, I meant pinning to groups of cores.
>>
>> I think a reasonable policy would be for the kernel to put all threads
>> of a given process on the same L3
>> as long as the number of threads is lower than the L3 group size.
>> When there are more threads, I guess it'd need heuristics to pick which
>> threads to put together.
>>
>> I fear that if we begin to do the work manually, there won't be interest
>> in doing it in the kernel,
>> and thus all applications will need to include such core-pinning code to
>> get good performance when
>> multithreaded.
>
> I think the problem here is also that not all cores are equal. Depending
> what your threads do, it might be preferable to keep your 8 threads on 4
> cores (as there are 8 logical cores) sharing the same L3 - but if they are
> just independent threads doing heavy math it might well be preferable to
> spread them out to a different CCX (at least as long as there aren't any
> more threads which also need to run simultaneously).
> And then you also have things like CPUs with physical NUMA topologies
> where only some cores have access to local memory and so on, which also
> should influence placement of threads.
> I have no idea if the kernel does something reasonable here, but I don't
> think it will be able to find a (near) optimal solution without at least
> some help from userspace.
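
(Just to illustrate Axel's point about every application having to carry its
own pinning code: a minimal sketch of what that looks like on Linux, assuming
CPUs 0-7 share one L3 / one CCX. The names are made up, and real code would
query the cache topology instead of hardcoding the CPU range.)

#define _GNU_SOURCE
#include <pthread.h>
#include <sched.h>
#include <stdio.h>

/* Pin the calling thread to a range of CPUs assumed to share one L3. */
static int pin_self_to_l3_group(int first_cpu, int num_cpus)
{
   cpu_set_t set;

   CPU_ZERO(&set);
   for (int cpu = first_cpu; cpu < first_cpu + num_cpus; cpu++)
      CPU_SET(cpu, &set);

   return pthread_setaffinity_np(pthread_self(), sizeof(set), &set);
}

int main(void)
{
   /* Hardcoded assumption: CPUs 0-7 form one CCX.  Real code would read
    * the topology, e.g. /sys/devices/system/cpu/cpuN/cache/index3. */
   if (pin_self_to_l3_group(0, 8) != 0)
      fprintf(stderr, "failed to set thread affinity\n");
   return 0;
}
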

The kernel puts Mesa threads on different CCXs 95% of the time. It kinda
makes sense for independent workloads, but not when reference counting
is involved, because atomics are really really REALLY slow between
CCXs. Like that patch for pipe_reference where removing p_atomic_read
in the asserts increased performance by 40% for radeonsi. You can't
make this up.
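
To make that concrete (this is not the actual pipe_reference code, just the
pattern, with made-up names): the refcount's cache line already ping-pongs
between CCXs on every atomic add/sub, so a debug-only atomic read in an
assert adds yet another cross-CCX transfer, while checking the value returned
by the add/sub keeps the sanity check without the extra read:

#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

struct refcounted {
   atomic_int count;
};

/* Returns true when *dst dropped to zero and must be destroyed. */
static bool reference_set(struct refcounted *dst, struct refcounted *src)
{
   if (dst == src)
      return false;

   if (src) {
      /* assert(atomic_load(&src->count) > 0) would touch the shared
       * line one more time; the fetch_add already returns the old value. */
      int old = atomic_fetch_add(&src->count, 1);
      assert(old > 0);
   }

   if (dst) {
      int old = atomic_fetch_sub(&dst->count, 1);
      assert(old > 0);
      if (old == 1)
         return true;
   }

   return false;
}
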

Marek