https://llvm.org/bugs/show_bug.cgi?id=31321
Bug ID: 31321
Summary: Halide would like to set the .maxnreg directive per PTX kernel.
Product: libraries
Version: trunk
Hardware: Other
OS: All
Status: NEW
Severity: enhancement
Priority: P
Component: Backend: PTX
Assignee: unassignedb...@nondot.org
Reporter: zal...@google.com
CC: llvm-bugs@lists.llvm.org
Classification: Unclassified

The PTX backend has the ability to generate certain per-kernel (.entry) PTX directives via metadata annotations. The test for this is in test/CodeGen/NVPTX/annotations.ll.

According to this NVIDIA PTX document:
http://docs.nvidia.com/cuda/parallel-thread-execution/#performance-tuning-directives
the .maxnreg value can be set on a per-entry basis as well. Halide would like to exploit this to be able to provide a scheduling directive to control this value. (See https://github.com/halide/Halide/pull/1667, where there is a hack to set the maximum number of registers on a per-module basis at load time.)

Plumbing this through involves adding support to NVPTXAsmPrinter::emitKernelFunctionDirectives.
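To make the request concrete, here is a minimal standalone sketch of the kind of per-kernel directive emission being asked for. It does not use any LLVM API. The KernelAnnotations struct, its field names, and emitKernelDirectives are hypothetical stand-ins for the values the backend would read from metadata annotations, and treating 0 as "annotation absent" is an assumption made only for this illustration. The directive spellings (.maxntid, .minnctapersm, .maxnreg) are the ones listed in the PTX performance-tuning directives section linked above.

```cpp
#include <cassert>   // assert, used in the usage comment at the bottom
#include <sstream>
#include <string>

// Hypothetical stand-in for per-kernel annotation values; not an LLVM type.
// A value of 0 means "annotation absent" in this sketch.
struct KernelAnnotations {
  unsigned maxntidx = 0;  // maximum threads per CTA, x dimension
  unsigned maxntidy = 0;  // maximum threads per CTA, y dimension
  unsigned maxntidz = 0;  // maximum threads per CTA, z dimension
  unsigned minctasm = 0;  // minimum CTAs per SM
  unsigned maxnreg = 0;   // proposed: maximum registers per thread
};

// Returns the PTX directive lines for one .entry, one directive per line.
std::string emitKernelDirectives(const KernelAnnotations &A) {
  std::ostringstream O;

  // Emit .maxntid if any dimension is set; unset dimensions default to 1.
  if (A.maxntidx || A.maxntidy || A.maxntidz) {
    O << ".maxntid " << (A.maxntidx ? A.maxntidx : 1u) << ", "
      << (A.maxntidy ? A.maxntidy : 1u) << ", "
      << (A.maxntidz ? A.maxntidz : 1u) << "\n";
  }

  if (A.minctasm)
    O << ".minnctapersm " << A.minctasm << "\n";

  // The addition requested in this bug: a per-kernel register cap.
  if (A.maxnreg)
    O << ".maxnreg " << A.maxnreg << "\n";

  return O.str();
}

// Usage (e.g. inside main):
//   KernelAnnotations K;
//   K.maxnreg = 64;
//   assert(emitKernelDirectives(K) == ".maxnreg 64\n");
```

In the real backend the values would presumably come from the same annotation mechanism exercised by test/CodeGen/NVPTX/annotations.ll, so a front end such as Halide could attach a different register cap to each kernel instead of one per module.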