Hahnfeld added inline comments.

================
Comment at: include/clang/Basic/TargetInfo.h:944
+  /// \brief Whether target supports variable-length arrays.
+  bool isVLASupported() const { return VLASupported; }
+
----------------
rjmccall wrote:
> Hahnfeld wrote:
> > rjmccall wrote:
> > > ABataev wrote:
> > > > ABataev wrote:
> > > > > Hahnfeld wrote:
> > > > > > rjmccall wrote:
> > > > > > > Hahnfeld wrote:
> > > > > > > > rjmccall wrote:
> > > > > > > > > Hahnfeld wrote:
> > > > > > > > > > rjmccall wrote:
> > > > > > > > > > > The way you've written this makes it sound like "does the 
> > > > > > > > > > > target support VLAs?", but the actual semantic checks 
> > > > > > > > > > > treat it as "do OpenMP devices on this target support 
> > > > > > > > > > > VLAs?"  Maybe there should be a more specific way to 
> > > > > > > > > > > query things about OpenMP devices instead of setting a 
> > > > > > > > > > > global flag for the target?
> > > > > > > > > > Actually, the NVPTX and SPIR targets never support VLAs. So 
> > > > > > > > > > I felt like it would be more correct to make this a global 
> > > > > > > > > > property of the target.
> > > > > > > > > > 
> > > > > > > > > > The difference is that the other programming models (OpenCL 
> > > > > > > > > > and CUDA) error out immediately and regardless of the target 
> > > > > > > > > > because this limitation is reflected in the standards that 
> > > > > > > > > > disallow VLAs (see SemaType.cpp). For OpenMP we might have 
> > > > > > > > > > target devices that support VLAs, so we shouldn't error out 
> > > > > > > > > > for those.
> > > > > > > > > If you want to make it a global property of the target, 
> > > > > > > > > that's fine, but then I don't understand why your diagnostic 
> > > > > > > > > only fires when (S.isInOpenMPDeclareTargetContext() || 
> > > > > > > > > S.isInOpenMPTargetExecutionDirective()).
> > > > > > > > That is because of how OpenMP offloading works and how it is 
> > > > > > > > implemented in Clang. Consider the following snippet from the 
> > > > > > > > added test case:
> > > > > > > > ```lang=c
> > > > > > > > int vla[arg];
> > > > > > > > 
> > > > > > > > #pragma omp target map(vla[0:arg])
> > > > > > > > {
> > > > > > > >    // more code here...
> > > > > > > > }
> > > > > > > > ```
> > > > > > > > 
> > > > > > > > Clang will take the following steps to compile this into a 
> > > > > > > > working binary for a GPU:
> > > > > > > > 1. Parse and (semantically) analyze the code as-is for the host 
> > > > > > > > and produce LLVM Bitcode.
> > > > > > > > 2. Parse and analyze the code again as-is and generate code for 
> > > > > > > > the offloading target, the GPU in this case.
> > > > > > > > 3. Take the LLVM Bitcode from 1., generate the host binary, and 
> > > > > > > > embed the target binary from 2.
> > > > > > > > 
> > > > > > > > `OpenMPIsDevice` will be true for 2., but the complete source 
> > > > > > > > code is analyzed. So, to avoid throwing errors for the host 
> > > > > > > > code, we have to make sure that we are actually generating code 
> > > > > > > > for the target device. This is either in a `target` directive or 
> > > > > > > > in a `declare target` region.
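> > > > > > > > 
> > > > > > > > To make this concrete (the snippet below is not from the test 
> > > > > > > > case, just an illustration): in the device pass both functions 
> > > > > > > > are analyzed, but only the second one is ever emitted for the 
> > > > > > > > GPU, so only there should a missing-VLA-support error fire.
> > > > > > > > ```lang=c++
> > > > > > > > // Host-only function: the device pass parses and analyzes it
> > > > > > > > // too, but it is never emitted for the GPU, so the VLA must not
> > > > > > > > // be rejected here.
> > > > > > > > void host_only(int n) {
> > > > > > > >   int buf[n];
> > > > > > > >   (void)buf;
> > > > > > > > }
> > > > > > > > 
> > > > > > > > #pragma omp declare target
> > > > > > > > // Emitted for the device: a VLA here should be rejected on
> > > > > > > > // targets such as NVPTX that cannot support it.
> > > > > > > > void device_code(int n) {
> > > > > > > >   // int buf[n]; // would be diagnosed in the device pass
> > > > > > > > }
> > > > > > > > #pragma omp end declare target
> > > > > > > > ```
> > > > > > > > 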
> > > > > > > > Note that this is quite similar to what CUDA does, only they 
> > > > > > > > have `CUDADiagIfDeviceCode` for this logic. If you want me to 
> > > > > > > > add something of that kind for OpenMP target devices, I'm fine 
> > > > > > > > with that. However, for the given case, it's a bit different 
> > > > > > > > because this error should only be thrown for target devices 
> > > > > > > > that don't support VLAs...
> > > > > > > I see.  So the entire translation unit is re-parsed and 
> > > > > > > re-Sema'ed from scratch for the target?  Which means you need to 
> > > > > > > avoid generating errors about things in the outer translation 
> > > > > > > unit that aren't part of the target directive that you actually 
> > > > > > > want to compile.  I would've expected there to be some existing 
> > > > > > > mechanism for that, to be honest, as opposed to explicitly trying 
> > > > > > > to suppress target-specific diagnostics one by one.
> > > > > > Yes, that is my understanding. For errors, we don't need to do 
> > > > > > anything special because the first `cc1` invocation will exit with a 
> > > > > > non-zero status, so the driver stops the compilation. For warnings, 
> > > > > > there seems to be no mechanism in place: I see them duplicated, even 
> > > > > > in code that is not generated for the target device (verified with 
> > > > > > an unused variable).
> > > > > > 
> > > > > > @ABataev @gtbercea Am I missing something here?
> > > > > I'm not aware of any.
> > > > John, target-specific checks require some special flags (like 
> > > > LangOpts.Cuda) that are not set when we re-compile the code for OpenMP 
> > > > devices. That's why errors are not emitted for the non-target code. But 
> > > > also because of that, we need some special OpenMP checks for 
> > > > target-specific code inside the target regions. For example, the code in 
> > > > lib/Sema/SemaType.cpp, lines 2184-2185 (see this file in this patch), 
> > > > checks for CUDA compilation and prohibits the use of VLAs in CUDA mode. 
> > > > We should also prohibit the use of VLAs in target code for NVPTX devices 
> > > > or other devices that do not support VLAs in OpenMP mode.
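> > > > 
> > > > Roughly, that existing check has the following shape (paraphrased, not 
> > > > an exact copy of the code in SemaType.cpp), and the OpenMP device case 
> > > > needs an analogous, target-dependent check:
> > > > ```lang=c++
> > > > // Paraphrased sketch of the CUDA handling in SemaType.cpp: VLAs are
> > > > // rejected, but only for code actually emitted for the device.
> > > > if (getLangOpts().CUDA && T->isVariableArrayType())
> > > >   CUDADiagIfDeviceCode(Loc, diag::err_cuda_vla) << CurrentCUDATarget();
> > > > 
> > > > // For OpenMP devices, the analogous check would consult the offload
> > > > // target (e.g. the isVLASupported() accessor added by this patch) and
> > > > // fire only inside target regions / declare target contexts.
> > > > ```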
> > > I think it would be cleaner here, and better for our OpenMP support 
> > > overall, if we found a more general way to suppress unwanted diagnostics 
> > > in the second invocation for code outside of the target directive.  This 
> > > check (and several others) would then just implement a more general 
> > > target feature disabling VLA support instead of being awkwardly 
> > > OpenMP-specific.
> > I think to get this we would need to make `Diag` a no-op `if 
> > (Context.getLangOpts().OpenMPIsDevice && !(isInOpenMPDeclareTargetContext() 
> > || isInOpenMPTargetExecutionDirective()))`. This would ignore all 
> > diagnostics outside of the code that is really generated in the end...
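> > 
> > As a rough sketch of that gate, factored into a helper (the name 
> > `shouldSuppressDeviceDiag` is made up here purely for illustration; it is 
> > not existing Sema API):
> > ```lang=c++
> > // Illustrative only: true when we are compiling for the OpenMP device
> > // but outside any code that will actually be emitted for it, i.e. when
> > // a diagnostic would only duplicate what the host pass already reported.
> > bool Sema::shouldSuppressDeviceDiag() {
> >   return Context.getLangOpts().OpenMPIsDevice &&
> >          !isInOpenMPDeclareTargetContext() &&
> >          !isInOpenMPTargetExecutionDirective();
> > }
> > ```
> > 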
> I mean, the danger of this approach is that you don't really want to suppress 
> diagnostics for top-level declarations: it can leave you with an invalid AST, 
> and it is not valid to generate IR from an invalid AST.
> 
> Sorry for the ignorant questions that follow, but I assume the OpenMP spec 
> must bless this double-translation somehow, and I'd like to understand more 
> about that in order to advise how to proceed.  How does OpenMP handle the 
> possibility that the code will be processed substantially differently for 
> different targets?  Is there some rule in the spec saying that the code has 
> to expand "the same" in both targets?  How does that work when e.g. size_t 
> might have a different size or use a completely different type?  More 
> generally, how do you expect this feature to work in the more complicated 
> language modes, like OpenMP + C++?
So if there is an error, the analysis will already fail on the host. I think 
that guarantees that we don't end up with an invalid AST and that we will at 
most suppress duplicate warnings.

Regarding the OpenMP spec: I think the unsatisfying answer is that the spec 
doesn't say what it expects for these questions. So I think the compiler has to 
do what seems reasonable...


https://reviews.llvm.org/D39505


