Re: [PATCH] [amdgcn] Scale number of threads/workers with VGPR usage

Andrew Stubbs Fri, 31 Jan 2020 06:13:27 -0800

On 31/01/2020 13:56, Kwok Cheung Yeung wrote:

The GCN architecture has 4 SIMD units per compute unit, with 256 VGPRs per SIMD unit. OpenMP threads or OpenACC workers must be distributed across the SIMD units, with each thread/worker fitting entirely within a single SIMD unit. VGPRs are shared by the kernels running in a SIMD unit, so we can have 4 workers that use up to 256 VGPRs, 8 workers that use up to 128 VGPRs, 16 workers that use up to 64 VGPRs and so on.
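
The arithmetic above can be sketched in C as follows. This is only an illustration of the 4 x (256 / VGPRs) relationship described in the message, not the code from the patch; the function name and macros are hypothetical, and any register allocation granularity or other hardware limits are ignored.

```c
#include <assert.h>

/* Figures taken from the description above.  */
#define SIMD_UNITS_PER_CU 4
#define VGPRS_PER_SIMD 256

/* Hypothetical helper: largest number of threads/workers that fit on one
   compute unit when each of them needs VGPR_COUNT registers.  */
int
max_workers_for_vgprs (int vgpr_count)
{
  assert (vgpr_count > 0 && vgpr_count <= VGPRS_PER_SIMD);
  /* Each worker must fit entirely within one SIMD unit, so round down
     per SIMD unit before multiplying by the number of units.  */
  return (VGPRS_PER_SIMD / vgpr_count) * SIMD_UNITS_PER_CU;
}
```

With this sketch, a kernel using 100 VGPRs gets two workers per SIMD unit, so the limit is 8 rather than 10.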
If more threads/workers are requested than can be supported, then the runtime fails with the message:
libgomp: GCN fatal error: Asynchronous queue error
Runtime message: HSA_STATUS_ERROR_INVALID_ISA: The instruction set architecture is invalid.
This patch adds code to mkoffload such that the number of VGPRs (and SGPRs for good measure) requested by a kernel is reported to libgomp at runtime. When launching a kernel, if libgomp detects that the number of threads/workers exceeds what can be supported by the hardware, it automatically scales down the number to the maximum supported value.
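
A minimal sketch of the scaling step at launch time, assuming the limit is simply 4 SIMD units times however many workers fit in 256 VGPRs. The struct and function names are hypothetical stand-ins; only the sgpr_count and vgpr_count field names come from the ChangeLog, and the real run_kernel logic may differ.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-kernel record; only the two counts
   named in the ChangeLog are modelled here.  */
struct kernel_regs_sketch
{
  int sgpr_count;
  int vgpr_count;
};

/* Return the number of threads/workers to launch: the requested number,
   reduced to the hardware maximum if it would need too many VGPRs.  */
int
scale_workers (const struct kernel_regs_sketch *k, int requested)
{
  assert (k != NULL && requested > 0);
  assert (k->vgpr_count > 0 && k->vgpr_count <= 256);
  /* Assumption: 4 SIMD units, 256 VGPRs each, no allocation granularity.  */
  int limit = (256 / k->vgpr_count) * 4;
  return requested > limit ? limit : requested;
}
```

For a kernel reporting 128 VGPRs, a request for 16 workers would be reduced to 8, while a request for 4 would be left alone.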
This behaviour can be overridden using environment variables to set an explicit number of threads/workers (GCN_NUM_THREADS/GCN_NUM_WORKERS), but there is not much point IMO as the kernel will just fail to run.
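
The override could look roughly like the sketch below: an explicit value, when present and valid, is used as-is and bypasses the scaling. The function name and the parsing rules are assumptions made for illustration; how the plugin actually reads GCN_NUM_THREADS/GCN_NUM_WORKERS is not shown in this message.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper.  OVERRIDE is the raw environment string (or NULL
   if the variable is unset), REQUESTED is what the program asked for and
   HW_MAX is the limit derived from the kernel's VGPR usage.  */
int
choose_worker_count (const char *override, int requested, int hw_max)
{
  assert (requested > 0 && hw_max > 0);
  if (override != NULL)
    {
      char *end;
      long val = strtol (override, &end, 10);
      /* Accept only a complete, positive decimal number.  An explicit
         value is not clamped, so the kernel may still fail to run.  */
      if (end != override && *end == '\0' && val > 0 && val <= INT_MAX)
        return (int) val;
    }
  /* No usable override: scale down to the supported maximum.  */
  return requested > hw_max ? hw_max : requested;
}
```

A caller would pass the environment value directly, for example choose_worker_count (getenv ("GCN_NUM_WORKERS"), requested, hw_max).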
Tested on a GCN3 accelerator with 6 new passes and no regressions noted in libgomp. Okay for trunk?
Kwok

     gcc/
     * config/gcn/mkoffload.c (process_asm): Add sgpr_count and vgpr_count to
     definition of hsa_kernel_description.  Parse assembly to find SGPR and
     VGPR count of kernel and store in hsa_kernel_description.

     libgomp/
     * plugin/plugin-gcn.c (struct hsa_kernel_description): Add sgpr_count
     and vgpr_count fields.
     (struct kernel_info): Add a field for a hsa_kernel_description.
     (run_kernel): Reduce the number of threads/workers if the requested
     number would require too many VGPRs.
     (init_basic_kernel_info): Initialize description field with
     the hsa_kernel_description entry for the kernel.


OK.

Andrew
