tra added a comment.

TL; DR; 
+1 to formalizing how we want -M*/-E/-S/-emit-llvm/-fsyntax-only to behave. 
OK with -M/-E/-S defaulting to host, and erroring out if applied to multiple 
sub-compilations.
I'm still convinced that the tooling issue with multiple subcompilations is 
orthogonal to this change and should be handled in libclang and that 
-fsyntax-only should not default to one sub-compilation.

See the details below.

In D68587#1698777 <https://reviews.llvm.org/D68587#1698777>, @hliao wrote:

> In D68587#1698247 <https://reviews.llvm.org/D68587#1698247>, @tra wrote:
>
> > In D68587#1698102 <https://reviews.llvm.org/D68587#1698102>, @hliao wrote:
> >
> > > for most compilation tools, single input and single output are expected. 
> > > Without assuming `-fsyntax-only` alone is host-compilation only, that at 
> > > least run syntax checking twice.
> >
> >
> > I believe the driver will not run subsequent jobs if one of the device 
> > compilations fails. You may see duplicate warnings from multiple stages, 
> > but overall the error handling works in a fairly predictable way now.
>
>
> It still runs and gives the same error (if that error is applicable to both 
> sides) at least twice if you just specify `-fsyntax-only` or `-E`. That won't 
> happen for regular compilation option (`-c`) due to the additional device 
> dependencies added.


Are you talking about the behavior with or without  this patch?

Without the patch I do see the driver stopping as soon as one of the 
subcompilations errors out and it needs *all* subcompilations to succeed in 
order for the whole invocation to succeed.

E.g.: err.cu:

  common error;
  
  #if __CUDA_ARCH__
  __device__ void foo() {
    device error;
  }
  #else
  __host__ void bar() {
    host error;
  }
  #endif



  $ bin/clang --cuda-path=$HOME/local/cuda-10.0 --cuda-gpu-arch=sm_30 
-fsyntax-only err.cu
  err.cu:1:1: error: unknown type name 'common'
  common error;
  ^
  err.cu:9:3: error: unknown type name 'host'
    host error;
    ^
  2 errors generated when compiling for host.

If I comment out the host and common errors, then clang finishes host 
compilation, moves on to device-side compilation and reports the error there:

  err.cu:5:3: error: unknown type name 'device'; did you mean 'CUdevice'?
    device error;
    ^~~~~~
    CUdevice
  /usr/local/google/home/tra/local/cuda-10.0/include/cuda.h:252:13: note: 
'CUdevice' declared here
  typedef int CUdevice;                                     /**< CUDA device */
              ^
  1 error generated when compiling for sm_30.

Again, it reports that there's an error on device side. My understanding is 
that with this patch clang would've succeeded, which is, IMO, incorrect.

> The error itself is, in fact, should be clear enough, the most confusing part 
> is the diagnostic message and suggestions from clang as host- and device-side 
> compilations are quite different, especially the error message may be mixed 
> with other-side the normal output.

Mixing errors and output will be confusing no matter what you do. That's why 
diags go to stderr by default, so you can separate them from the regular output.
As far as errors go, clang clearly labels which side of the compilation 
produced them:

`1 error generated when compiling for sm_30.`

>> To think of it, I'm starting to doubt that this patch is an improvement for 
>> `-M` either. You will get the dependencies for the host, but they are not 
>> necessarily the same as the dependencies for the device-side compilation. 
>> Producing a partial list of dependencies will be potentially incorrect. IMO 
>> we do need dependencies info from all sub-compilations.
> 
> Even without this patch, `-M` or more specifically `-MD` already breaks now 
> as we just run the dependency generation action twice for each side. The 
> later will overwrite the former *.d file. We need special handling of `-M` to 
> match nvcc.

This sounds like a bug to me. We should've refused to write multiple files -- 
similar to how we refuse to preprocess if we have more than one sub-compilation.
OK. In this sense defaulting to host-only here would be a marginal improvement. 
Until we can produce complete dependency info for all sub-compilations letting 
user specifically select one (with host being default) is probably the best 
choice we can make at the moment.

> 
> 
>> Perhaps we should limit the scope of this patch to -E only for now?
> 
> Just found nvcc's `-E` returns the output of the device-side compilation for 
> the first GPU arch specified. Anyway, whether to match that behavior is just 
> another question.

NVCC is not a good guideline here. Considering that they do source code 
transformation, it's not quite clear what exactly -E means for NVCC at all.
Defaulting to host-side -E would probably be fine as long as user can override 
it with --cuda-device-only.

> but some tools, like clang-tidy, may be found difficult to insert that option 
> properly, says `clang-tidy -p` supposes to read the compilation command 
> databased generated by cmake or meta-build systems and performs additional 
> checks for sources. Adding that option may inevitably make them CUDA/HIP 
> aware.

I'm not sure I understand your argument here. Clang driver does create multiple 
sub-compilations for CUDA/HIP. That's the reality. The tools that use the 
library commonly assume that 1 compilation == 1 clang action, which is not 
valid for CUDA. Our options are:

- Let user specify which subcompilation they want. This requires passing extra 
flags and complicates their use, especially for tools that rely on using the 
exact flags used during normal build.
- Teach the tools to deal with multiple actions. This will need changing all 
tools. Probably the best way long-term, but does not help much now.
- Let libclang pick a sensible default, but allow the tool override it if they 
need it. This would be preferred way, IMO. This keeps current tools working and 
allows future tools to dig into all sub-compilations if they wish.
- Let clang driver guess that the user would only need one sub-compilation 
(e.g. assuming that the tools would commonly use -fsyntax-only). This, IMO, not 
the right place to deal with tooling issues and it does result in what I think 
is incorrect/unexpected output for normal use of -fsyntax-only in clang.



>> If that's still a problem, then we should change the tooling infrastructure 
>> to use host-only compilation for HIP and CUDA by default.
> 
> That's an option I was looking into as well. But, generally speaking, we need 
> a clear definition on expected output for options like `-M`, `-MD`, `-E`, 
> `-fsyntax-only`, `-S -emit-llvm`, even `-S` (for HIP only.)

I think we have a decent understanding of what we want (even if clang does not 
quite do the right thing ATM as you've pointed above regarding `-M`).

- `-M`, `-MD`, `-E`, `-S -emit-llvm`, even `-S`

  Require unambiguous single sub-compilation. It could either be explicitly 
specified by user (--cuda-host-only/cuda-device-only w/ single GPU arch), or 
have a reasonable default (host?)

- `-fsyntax-only`

  Should succeed only if all sub-compilations succeeded. Error reporting stops 
after one of sub-compilations fail with clear indication of which side produced 
the errors.

Does this make sense?

>> Also, this patch makes it impossible to run -fsyntax-only on *all* 
>> sub-compilations at once. I will have to run `clang -fsyntax-only` multiple 
>> times -- once per host and once per each device.
> 
> We do have another option `-cuda-compile-host-device` to explicit ask for 
> host- and device-side compilations.

Does it work with this patch? IIRC, `--cuda-compile-host-device` would only 
override `--cuda-host-only/--cuda-device-only` flags and with this patch we'd 
still end up with the host compilation only.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68587/new/

https://reviews.llvm.org/D68587



_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to