https://llvm.org/bugs/show_bug.cgi?id=31321

            Bug ID: 31321
           Summary: Halide would like to set the .maxnreg directive per
                    PTX kernel.
           Product: libraries
           Version: trunk
          Hardware: Other
                OS: All
            Status: NEW
          Severity: enhancement
          Priority: P
         Component: Backend: PTX
          Assignee: unassignedb...@nondot.org
          Reporter: zal...@google.com
                CC: llvm-bugs@lists.llvm.org
    Classification: Unclassified

The PTX backend can already generate certain per-kernel (.entry) PTX
directives from metadata annotations. The test for this is
test/CodeGen/NVPTX/annotations.ll. According to this NVIDIA PTX document:
   
http://docs.nvidia.com/cuda/parallel-thread-execution/#performance-tuning-directives

the .maxnreg value can be set on a per-entry basis as well. Halide would like
to use this to provide a scheduling directive that controls the value. (See:
    https://github.com/halide/Halide/pull/1667
where there is a hack to set the maximum number of registers on a per-module
basis at load time.)

Plumbing this through involves adding support to
NVPTXAsmPrinter::emitKernelFunctionDirectives.
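
To make the request concrete, here is a minimal standalone C++ sketch of the
kind of logic that would be needed. It is not LLVM code: KernelAnnotations,
lookup, and emitKernelDirectives are hypothetical names, and the "maxnreg"
annotation key is an assumed spelling for the proposed annotation. The
"maxntidx"/"maxntidy"/"maxntidz" and "minctasm" keys are the ones exercised by
annotations.ll, and the directive spellings follow the PTX document linked
above.

```cpp
#include <map>
#include <optional>
#include <sstream>
#include <string>

// Hypothetical stand-in for the per-kernel values the backend reads from
// metadata annotations. Not an LLVM type.
using KernelAnnotations = std::map<std::string, unsigned>;

// Returns the annotation value for Key, or nullopt if it is absent.
static std::optional<unsigned> lookup(const KernelAnnotations &A,
                                      const std::string &Key) {
  auto It = A.find(Key);
  if (It == A.end())
    return std::nullopt;
  return It->second;
}

// Builds the directive lines that would follow a kernel's .entry signature.
std::string emitKernelDirectives(const KernelAnnotations &A) {
  std::ostringstream O;

  // .maxntid nx, ny, nz: in this sketch, if any dimension is annotated,
  // the missing dimensions default to 1.
  auto X = lookup(A, "maxntidx");
  auto Y = lookup(A, "maxntidy");
  auto Z = lookup(A, "maxntidz");
  if (X || Y || Z)
    O << ".maxntid " << X.value_or(1u) << ", " << Y.value_or(1u) << ", "
      << Z.value_or(1u) << "\n";

  // .minnctapersm ncta
  if (auto M = lookup(A, "minctasm"))
    O << ".minnctapersm " << *M << "\n";

  // Proposed addition: .maxnreg n, driven by an assumed "maxnreg" key.
  if (auto R = lookup(A, "maxnreg"))
    O << ".maxnreg " << *R << "\n";

  return O.str();
}
```

Under these assumptions, a kernel annotated with maxntidx = 256 and
maxnreg = 32 would get the lines ".maxntid 256, 1, 1" and ".maxnreg 32", while
a kernel with no annotations would get no directives at all.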

