[PATCH] D116583: Change the default optimisation level of PTXAS from -O0 to -O3. This makes the optimisation levels of PTXAS and the ptxjitcompiler equal (ptxjitcompiler defaults to -O3).

Hugh Delaney via Phabricator via cfe-commits Wed, 05 Jan 2022 02:34:36 -0800

hdelan added inline comments.


================
Comment at: clang/lib/Driver/ToolChains/Cuda.cpp:433
   } else {
-    // If no -O was passed, pass -O0 to ptxas -- no opt flag should correspond
-    // to no optimizations, but ptxas's default is -O3.
-    CmdArgs.push_back("-O0");
+    // If no -O was passed, pass -O3 to ptxas -- this makes ptxas's
+    // optimization level the same as the ptxjitcompiler.
----------------
tra wrote:
> I think this would be contrary to the expectation that lack of `-O` in clang 
> means - `do not optimize` and it generally implies the whole compilation 
> chain, including assembler. Matching whatever nvidia tools do is an 
> insufficient reason for breaking this assumption, IMO. 
> 
> If you do want to run optimized ptxas on unoptimized PTX, you can use 
> `-Xcuda-ptxas -O3`.
I think for the average user, consistency between the `ptxjitcompiler` and 
`ptxas` is far more important than preserving the assumption that no `-O` 
means no optimization. I think most users will assume that, when no `-O` is 
passed, each tool in the chain uses its own default optimization level, which 
in the case of clang is `-O0` and in the case of `ptxas` is `-O3`. 

We have had a few bugs with `ptxas`/`ptxjitcompiler` at higher optimization 
levels, which were quite hard to pin down because offline `ptxas` and 
`ptxjitcompiler` were using different optimization levels, so the bugs 
appeared in one and not the other. We are aware of this now, but the 
inconsistency can still result in bugs that are difficult to diagnose. Having 
consistency between the `ptxjitcompiler` and `ptxas` is therefore of practical 
benefit, whereas if we leave it as is, with clang passing `-O0` to `ptxas` by 
default, the benefit is purely semantic.
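
To make the two positions concrete, here is a minimal standalone sketch of the 
choice being discussed. It is not the code in `Cuda.cpp`: the names 
`NoOptFlagPolicy` and `ptxasOptFlag` are hypothetical, and the handling of an 
explicit `-O<N>` is simplified to clamping the level to 0..3, which is an 
assumption rather than the driver's actual mapping.

```cpp
#include <optional>
#include <string>

// Hypothetical policy switch (not part of clang): which level ptxas gets
// when the user passes no -O flag at all.
enum class NoOptFlagPolicy {
  FollowClang, // behaviour before the patch: no -O means -O0 for ptxas too
  FollowPtxas  // behaviour proposed by the patch: use ptxas's default, -O3
};

// Returns the flag that would be forwarded to ptxas. `userLevel` holds the
// numeric level of an explicit -O<N>, or std::nullopt when no -O was passed.
std::string ptxasOptFlag(std::optional<int> userLevel,
                         NoOptFlagPolicy policy) {
  if (userLevel) {
    // Simplifying assumption: clamp an explicit level to the 0..3 range.
    int level = *userLevel < 0 ? 0 : (*userLevel > 3 ? 3 : *userLevel);
    return "-O" + std::to_string(level);
  }
  // No -O on the command line: this branch is the only one the patch changes.
  return policy == NoOptFlagPolicy::FollowClang ? "-O0" : "-O3";
}
```

Under `FollowClang`, a build with no `-O` hands `-O0` to offline `ptxas` 
while the `ptxjitcompiler` runs at its own default of `-O3`; under 
`FollowPtxas` both run at `-O3`. An explicit `-O<N>` gives the same result 
under either policy.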


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D116583/new/

https://reviews.llvm.org/D116583

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
