Hi all,
On 09.08.21 20:53, Gerald Pfeifer wrote:
> (Is "CU" a sufficiently established term, or might it make sense
> to spell it out?)
I don't know – but we could use "per compute unit (CU)".
On 09.08.21 16:27, Thomas Schwinge wrote:
> On 2021-08-09T15:55:07+0200, Tobias Burnus<tob...@codesourcery.com> wrote:
>> +++ b/htdocs/gcc-12/changes.html
>> + <li>When used as OpenACC device: the limitation of 1 worker per gang, 2 gangs
>> + per CU has been lifted; now up to 16 workers per gang and 40 gangs per CU
>> + are supported. (Except that the hardware limit of 40 workers total may
>> + not be exceeded.)</li>
> I haven't changed anything related to a "limitation of [...] 2 gangs per
> CU has been lifted". Maybe that has already been done earlier, maybe
> that still has to be done? I don't know -- Julian?
Looking at the current code, it has:
if (dims[0] == 0) dims[0] = get_cu_count (kernel->agent); /* Gangs. */
Thus, at least when nothing else has been specified, it launches as many
gangs as there are CUs, i.e. 1 gang per CU.
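To make that default concrete, here is a minimal stand-alone sketch of the
fallback. The helper name default_gangs is made up for illustration; only
the "dims[0] == 0 means use the CU count" rule comes from the line quoted
above.

```c
#include <assert.h>

/* Hypothetical stand-in for the plugin's default: if no gang count was
   requested (0), use one gang per compute unit; otherwise keep the
   requested value.  */
static int
default_gangs (int requested_gangs, int cu_count)
{
  assert (cu_count > 0 && requested_gangs >= 0);
  return requested_gangs == 0 ? cu_count : requested_gangs;
}
```

For example, with an assumed 60 CUs, default_gangs (0, 60) yields 60 gangs,
i.e. 60 / 60 = 1 gang per CU.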
[OG11 – but not mainline] What's needed is something like:
dims[0] = get_cu_count (kernel->agent) * (32 / dims[1]);
which I see in OG11 – oddly, I also see code there like:
def->gdims[0] = get_cu_count (agent); // * (40 / gcn_threads);
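As a rough illustration of what the expression
get_cu_count (kernel->agent) * (32 / dims[1]) quoted above computes, the
sketch below isolates the arithmetic. The helper name scaled_gangs is
hypothetical; the factor 32 is taken from that expression (the commented-out
variant uses 40 instead), and the division is integer division, so it
rounds down.

```c
#include <assert.h>

/* Hypothetical helper mirroring the OG11 expression
   cu_count * (32 / workers).  The division is done on integers, so
   worker counts that do not divide 32 round the per-CU factor down.  */
static int
scaled_gangs (int cu_count, int workers)
{
  assert (cu_count > 0 && workers > 0);
  return cu_count * (32 / workers);
}
```

With an assumed 60 CUs, 16 workers per gang gives 60 * 2 = 120 gangs
(2 per CU), while 1 worker per gang gives 60 * 32 = 1920 gangs (32 per CU).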
In other words: for more gangs than CUs, or more than 1 gang per CU, the
following patch is needed:
[OG11] https://gcc.gnu.org/g:4dcd1e1f4e6b451aac44f919b8eb3ac49292b308
[email] https://gcc.gnu.org/pipermail/gcc-patches/2020-July/550102.html
"not suitable for mainline until the multiple-worker support is merged
there"
@Andrew + @Julian: Do you intend to commit it relatively soon?
Regarding the wwwdocs patch, I can hold off until that commit or reword
it to only cover the workers part.
Thanks Thomas & Gerald for the comments!
Tobias