Hahnfeld marked an inline comment as done.
Hahnfeld added inline comments.

================
Comment at: include/clang/Basic/TargetInfo.h:944
+  /// \brief Whether target supports variable-length arrays.
+  bool isVLASupported() const { return VLASupported; }
+
----------------
rjmccall wrote:
> Hahnfeld wrote:
> > rjmccall wrote:
> > > ABataev wrote:
> > > > Hahnfeld wrote:
> > > > > rjmccall wrote:
> > > > > > Hahnfeld wrote:
> > > > > > > rjmccall wrote:
> > > > > > > > Hahnfeld wrote:
> > > > > > > > > rjmccall wrote:
> > > > > > > > > > ABataev wrote:
> > > > > > > > > > > ABataev wrote:
> > > > > > > > > > > > Hahnfeld wrote:
> > > > > > > > > > > > > rjmccall wrote:
> > > > > > > > > > > > > > Hahnfeld wrote:
> > > > > > > > > > > > > > > rjmccall wrote:
> > > > > > > > > > > > > > > > Hahnfeld wrote:
> > > > > > > > > > > > > > > > > rjmccall wrote:
> > > > > > > > > > > > > > > > > > The way you've written this makes it sound 
> > > > > > > > > > > > > > > > > > like "does the target support VLAs?", but 
> > > > > > > > > > > > > > > > > > the actual semantic checks treat it as "do 
> > > > > > > > > > > > > > > > > > OpenMP devices on this target support 
> > > > > > > > > > > > > > > > > > VLAs?"  Maybe there should be a more 
> > > > > > > > > > > > > > > > > > specific way to query things about OpenMP 
> > > > > > > > > > > > > > > > > > devices instead of setting a global flag 
> > > > > > > > > > > > > > > > > > for the target?
> > > > > > > > > > > > > > > > > Actually, the NVPTX and SPIR targets never 
> > > > > > > > > > > > > > > > > support VLAs. So I felt like it would be more 
> > > > > > > > > > > > > > > > > correct to make this a global property of the 
> > > > > > > > > > > > > > > > > target.
> > > > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > > > The difference is that the other programming 
> > > > > > > > > > > > > > > > > models (OpenCL and CUDA) error out 
> > > > > > > > > > > > > > > > > immediatelyand regardless of the target 
> > > > > > > > > > > > > > > > > because this limitation is reflected in the 
> > > > > > > > > > > > > > > > > standards that disallow VLAs (see 
> > > > > > > > > > > > > > > > > SemaType.cpp). For OpenMP we might have 
> > > > > > > > > > > > > > > > > target devices that support VLA so we 
> > > > > > > > > > > > > > > > > shouldn't error out for those.
> > > > > > > > > > > > > > > > If you want to make it a global property of the 
> > > > > > > > > > > > > > > > target, that's fine, but then I don't 
> > > > > > > > > > > > > > > > understand why your diagnostic only fires when 
> > > > > > > > > > > > > > > > (S.isInOpenMPDeclareTargetContext() || 
> > > > > > > > > > > > > > > > S.isInOpenMPTargetExecutionDirective()).
> > > > > > > > > > > > > > > That is because of how OpenMP offloading works 
> > > > > > > > > > > > > > > and how it is implemented in Clang. Consider the 
> > > > > > > > > > > > > > > following snippet from the added test case:
> > > > > > > > > > > > > > > ```lang=c
> > > > > > > > > > > > > > > int vla[arg];
> > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > #pragma omp target map(vla[0:arg])
> > > > > > > > > > > > > > > {
> > > > > > > > > > > > > > >    // more code here...
> > > > > > > > > > > > > > > }
> > > > > > > > > > > > > > > ```
> > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > Clang will take the following steps to compile 
> > > > > > > > > > > > > > > this into a working binary for a GPU:
> > > > > > > > > > > > > > > 1. Parse and (semantically) analyze the code 
> > > > > > > > > > > > > > > as-is for the host and produce LLVM Bitcode.
> > > > > > > > > > > > > > > 2. Parse and analyze again the code as-is and 
> > > > > > > > > > > > > > > generate code for the offloading target, the GPU 
> > > > > > > > > > > > > > > in this case.
> > > > > > > > > > > > > > > 3. Take LLVM Bitcode from 1., generate host 
> > > > > > > > > > > > > > > binary and embed target binary from 3.
> > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > `OpenMPIsDevice` will be true for 2., but the 
> > > > > > > > > > > > > > > complete source code is analyzed. So to not throw 
> > > > > > > > > > > > > > > errors for the host code, we have to make sure 
> > > > > > > > > > > > > > > that we are actually generating code for the 
> > > > > > > > > > > > > > > target device. This is either in a `target` 
> > > > > > > > > > > > > > > directive or in a `declare target` region.
> > > > > > > > > > > > > > > Note that this is quite similar to what CUDA 
> > > > > > > > > > > > > > > does, only they have `CUDADiagIfDeviceCode` for 
> > > > > > > > > > > > > > > this logic. If you want me to add something of 
> > > > > > > > > > > > > > > that kind for OpenMP target devices, I'm fine 
> > > > > > > > > > > > > > > with that. However for the given case, it's a bit 
> > > > > > > > > > > > > > > different because this error should only be 
> > > > > > > > > > > > > > > thrown for target devices that don't support 
> > > > > > > > > > > > > > > VLAs...
> > > > > > > > > > > > > > I see.  So the entire translation unit is re-parsed 
> > > > > > > > > > > > > > and re-Sema'ed from scratch for the target?  Which 
> > > > > > > > > > > > > > means you need to avoid generating errors about 
> > > > > > > > > > > > > > things in the outer translation unit that aren't 
> > > > > > > > > > > > > > part of the target directive that you actually want 
> > > > > > > > > > > > > > to compile.  I would've expected there to be some 
> > > > > > > > > > > > > > existing mechanism for that, to be honest, as 
> > > > > > > > > > > > > > opposed to explicitly trying to suppress 
> > > > > > > > > > > > > > target-specific diagnostics one by one.
> > > > > > > > > > > > > Yes, that is my understanding. For errors, we don't 
> > > > > > > > > > > > > need to take anything special as the first `cc1` 
> > > > > > > > > > > > > invocation will exit with a non-zero status so that 
> > > > > > > > > > > > > the driver stops the compilation. For warnings, there 
> > > > > > > > > > > > > seems to be no mechanism in place as I see them 
> > > > > > > > > > > > > duplicated, even in code that is not generate for the 
> > > > > > > > > > > > > target device (verified with an unused variable).
> > > > > > > > > > > > > 
> > > > > > > > > > > > > @ABataev @gtbercea Do I miss something here?
> > > > > > > > > > > > I'm not aware of any.
> > > > > > > > > > > John, target-specific checks require some special flags 
> > > > > > > > > > > (like LangOpts.Cuda) that are not set when we re-compile 
> > > > > > > > > > > the code for OpenMP devices. That's why errors are not 
> > > > > > > > > > > emitted for the non-target code. But also because of 
> > > > > > > > > > > that, we need some special OpenMP checks for 
> > > > > > > > > > > target-specific code inside the target regions. For 
> > > > > > > > > > > example, code in lib/Sema/SemaType.cpp, lines 2184, 2185 
> > > > > > > > > > > (see this file in this patch) checks for Cuda compilation 
> > > > > > > > > > > and prohibits using of VLAs in Cuda mode. We also should 
> > > > > > > > > > > prohibit using of VLAs in target code for NVPTX devices 
> > > > > > > > > > > or other devices that do not support VLAs in OpenMP mode.
> > > > > > > > > > I think it would be cleaner here, and better for our OpenMP 
> > > > > > > > > > support overall, if we found a more general way to suppress 
> > > > > > > > > > unwanted diagnostics in the second invocation for code 
> > > > > > > > > > outside of the target directive.  This check (and several 
> > > > > > > > > > others) would then just implement a more general target 
> > > > > > > > > > feature disabling VLA support instead of being awkwardly 
> > > > > > > > > > OpenMP-specific.
> > > > > > > > > I think to get this we would need to make `Diag` a no-op `if 
> > > > > > > > > (Context.getLangOpts().OpenMPIsDevice && 
> > > > > > > > > !(isInOpenMPDeclareTargetContext() || 
> > > > > > > > > isInOpenMPTargetExecutionDirective()))`. This would ignore 
> > > > > > > > > all diagnostics outside of the code is really generated in 
> > > > > > > > > the end...
> > > > > > > > I mean, the danger of this approach is that you don't really 
> > > > > > > > want to suppress diagnostics for top-level declarations: it can 
> > > > > > > > leave you with an invalid AST, and it is not valid to generate 
> > > > > > > > IR from an invalid AST.
> > > > > > > > 
> > > > > > > > Sorry for the ignorant questions that follow, but I assume the 
> > > > > > > > OpenMP spec must bless this double-translation somehow, and I'd 
> > > > > > > > like to understand more about that in order to advise how to 
> > > > > > > > proceed.  How does OpenMP handle the possibility that the code 
> > > > > > > > will be processed substantially differently for different 
> > > > > > > > targets?  Is there some rule in the spec saying that the code 
> > > > > > > > has to expand "the same" in both targets?  How does that work 
> > > > > > > > when e.g. size_t might have a different size or use a 
> > > > > > > > completely different type?  More generally, how do expect that 
> > > > > > > > this feature will work in the more complicated language modes, 
> > > > > > > > like OpenMP + C++?
> > > > > > > So if there is an error, the analysis will already fail on the 
> > > > > > > host. I think that guarantees that we don't end up with an 
> > > > > > > invalid AST and will at most suppress duplicate warnings.
> > > > > > > 
> > > > > > > Regarding the OpenMP spec: I think the unsatisfying answer is 
> > > > > > > that the spec doesn't say what it expects on that questions. So I 
> > > > > > > think the compiler has to do what seems reasonable...
> > > > > > Well, what I'm worried about is the possibility that something 
> > > > > > changes about the translation unit when it's reprocessed for the 
> > > > > > target — e.g. there's a target-dependent #if that causes an error 
> > > > > > in the target TU, but not in the original TU, so that the 
> > > > > > suppressed error is the only reason that the build fails.
> > > > > > 
> > > > > > If the spec is unclear about this, then we just have to muddle 
> > > > > > through.  Is this "reparse the whole translation unit for the 
> > > > > > target" the prevailing implementation technique for target 
> > > > > > directives?
> > > > > If we are worried about that scenario we have the preserve the 
> > > > > current state: Do nothing, diagnose everything and let the user 
> > > > > figure out if there is an error in the code.
> > > > > 
> > > > > I can't really comment on what other compilers (GCC, Intel) do, but 
> > > > > at least for GCC you compile a complete compiler for the target, so I 
> > > > > suppose they kind of do the same...
> > > > John, reparsing is required to compile the target-specific code for the 
> > > > particular device. This is an intended behavior. Otherwise we may use 
> > > > some host-specific code on the device. Reparsing allows to reinclude 
> > > > all includes so that the code uses all the definitions for the target 
> > > > instead of those used for the host.
> > > > Intel and gcc are doing the same
> > > Okay, thanks, that's all I was asking.
> > > 
> > > I am still concerned that this is going to gradually grow to encompass 
> > > basically every "XX feature is disabled on this target" diagnostic.  We 
> > > could solve this more elegantly by tagging such diagnostics in the .td 
> > > file and then introducing a diagnostic filter that dropped such 
> > > diagnostics except when we're inside a target directive.
> > > 
> > > Even if you don't do that, I'm going to insist that you rewrite the test 
> > > in SemaType so that it actually fires in normal code if the target 
> > > disables VLAs.  The OpenMP-specific logic should be to suppress the 
> > > diagnostic if we happen to be in target mode and not within a target 
> > > directive.
> > > 
> > > Also, it seems to me that the "are we inside a target directive" check is 
> > > inadequate to deal with lazy-code-emission features like static functions 
> > > in C and templates in C++.  The diagnostic must be emitted for a 
> > > violation in a function body if and only if the function is used from 
> > > within the target directive.  That is, unless there's some very coarse 
> > > language restriction like forbidding external uses in target directives?
> > > 
> > > Also, shouldn't the "are we in a target directive" be checking that we're 
> > > in a *matching* target directive?  Surely a single OpenMP file can 
> > > contain target directives for multiple external targets?
> > Ok, so going forward you suggest something like the following?
> > ```lang=C++
> >   // Some targets don't support VLAs.
> >   if (!Context.getTargetInfo().isVLASupported() && T->isVariableArrayType() 
> > &&
> >       (!getLangOpts().OpenMPIsDevice || isInOpenMPDeclareTargetContext() ||
> >       isInOpenMPTargetExecutionDirective())) {
> >     Diag(Loc, diag::err_target_no_vla_support);
> >     return QualType();
> >   }
> > ```
> > 
> > Doesn't lazy-code emission check the code semantically? Yes, the OpenMP 
> > standard (currently) mandates that all external functions that are called 
> > must be in a `declare target` region. And all regions in a file are 
> > compiled for all targets, so there shouldn't be a "non-matching" target 
> > directive.
> I would check for a VAT before checking whether VLAs are supported, but yes, 
> that's the right idea.  Please also consider just adding this check to where 
> we build the VAT in the first place.
> 
> We don't (typically) do any extra semantic checks on a function body when 
> it's first used, but if the standard requires all functions that are called 
> to be in a target region, that should be good enough.  And ok, I didn't 
> understand that about targets; sounds good.
I think I've implemented what we discussed here, please let me know if you 
disagree. I've also tried to add tests with templates, but `declare target` is 
currently broken for that use case, see 
https://bugs.llvm.org/show_bug.cgi?id=35348.


https://reviews.llvm.org/D39505



_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to