[PATCH] D39505: [OpenMP] Show error if VLAs are not supported

Jonas Hahnfeld via Phabricator via cfe-commits Wed, 15 Nov 2017 07:15:57 -0800

Hahnfeld added inline comments.


================
Comment at: include/clang/Basic/TargetInfo.h:944
+  /// \brief Whether target supports variable-length arrays.
+  bool isVLASupported() const { return VLASupported; }
+
----------------
rjmccall wrote:
> ABataev wrote:
> > Hahnfeld wrote:
> > > rjmccall wrote:
> > > > Hahnfeld wrote:
> > > > > rjmccall wrote:
> > > > > > Hahnfeld wrote:
> > > > > > > rjmccall wrote:
> > > > > > > > ABataev wrote:
> > > > > > > > > ABataev wrote:
> > > > > > > > > > Hahnfeld wrote:
> > > > > > > > > > > rjmccall wrote:
> > > > > > > > > > > > Hahnfeld wrote:
> > > > > > > > > > > > > rjmccall wrote:
> > > > > > > > > > > > > > Hahnfeld wrote:
> > > > > > > > > > > > > > > rjmccall wrote:
> > > > > > > > > > > > > > > > The way you've written this makes it sound like 
> > > > > > > > > > > > > > > > "does the target support VLAs?", but the actual 
> > > > > > > > > > > > > > > > semantic checks treat it as "do OpenMP devices 
> > > > > > > > > > > > > > > > on this target support VLAs?"  Maybe there 
> > > > > > > > > > > > > > > > should be a more specific way to query things 
> > > > > > > > > > > > > > > > about OpenMP devices instead of setting a 
> > > > > > > > > > > > > > > > global flag for the target?
> > > > > > > > > > > > > > > Actually, the NVPTX and SPIR targets never 
> > > > > > > > > > > > > > > support VLAs. So I felt like it would be more 
> > > > > > > > > > > > > > > correct to make this a global property of the 
> > > > > > > > > > > > > > > target.
> > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > The difference is that the other programming 
> > > > > > > > > > > > > > > models (OpenCL and CUDA) error out immediatelyand 
> > > > > > > > > > > > > > > regardless of the target because this limitation 
> > > > > > > > > > > > > > > is reflected in the standards that disallow VLAs 
> > > > > > > > > > > > > > > (see SemaType.cpp). For OpenMP we might have 
> > > > > > > > > > > > > > > target devices that support VLA so we shouldn't 
> > > > > > > > > > > > > > > error out for those.
> > > > > > > > > > > > > > If you want to make it a global property of the 
> > > > > > > > > > > > > > target, that's fine, but then I don't understand 
> > > > > > > > > > > > > > why your diagnostic only fires when 
> > > > > > > > > > > > > > (S.isInOpenMPDeclareTargetContext() || 
> > > > > > > > > > > > > > S.isInOpenMPTargetExecutionDirective()).
> > > > > > > > > > > > > That is because of how OpenMP offloading works and 
> > > > > > > > > > > > > how it is implemented in Clang. Consider the 
> > > > > > > > > > > > > following snippet from the added test case:
> > > > > > > > > > > > > ```lang=c
> > > > > > > > > > > > > int vla[arg];
> > > > > > > > > > > > > 
> > > > > > > > > > > > > #pragma omp target map(vla[0:arg])
> > > > > > > > > > > > > {
> > > > > > > > > > > > >    // more code here...
> > > > > > > > > > > > > }
> > > > > > > > > > > > > ```
> > > > > > > > > > > > > 
> > > > > > > > > > > > > Clang will take the following steps to compile this 
> > > > > > > > > > > > > into a working binary for a GPU:
> > > > > > > > > > > > > 1. Parse and (semantically) analyze the code as-is 
> > > > > > > > > > > > > for the host and produce LLVM Bitcode.
> > > > > > > > > > > > > 2. Parse and analyze again the code as-is and 
> > > > > > > > > > > > > generate code for the offloading target, the GPU in 
> > > > > > > > > > > > > this case.
> > > > > > > > > > > > > 3. Take LLVM Bitcode from 1., generate host binary 
> > > > > > > > > > > > > and embed target binary from 3.
> > > > > > > > > > > > > 
> > > > > > > > > > > > > `OpenMPIsDevice` will be true for 2., but the 
> > > > > > > > > > > > > complete source code is analyzed. So to not throw 
> > > > > > > > > > > > > errors for the host code, we have to make sure that 
> > > > > > > > > > > > > we are actually generating code for the target 
> > > > > > > > > > > > > device. This is either in a `target` directive or in 
> > > > > > > > > > > > > a `declare target` region.
> > > > > > > > > > > > > Note that this is quite similar to what CUDA does, 
> > > > > > > > > > > > > only they have `CUDADiagIfDeviceCode` for this logic. 
> > > > > > > > > > > > > If you want me to add something of that kind for 
> > > > > > > > > > > > > OpenMP target devices, I'm fine with that. However 
> > > > > > > > > > > > > for the given case, it's a bit different because this 
> > > > > > > > > > > > > error should only be thrown for target devices that 
> > > > > > > > > > > > > don't support VLAs...
> > > > > > > > > > > > I see.  So the entire translation unit is re-parsed and 
> > > > > > > > > > > > re-Sema'ed from scratch for the target?  Which means 
> > > > > > > > > > > > you need to avoid generating errors about things in the 
> > > > > > > > > > > > outer translation unit that aren't part of the target 
> > > > > > > > > > > > directive that you actually want to compile.  I 
> > > > > > > > > > > > would've expected there to be some existing mechanism 
> > > > > > > > > > > > for that, to be honest, as opposed to explicitly trying 
> > > > > > > > > > > > to suppress target-specific diagnostics one by one.
> > > > > > > > > > > Yes, that is my understanding. For errors, we don't need 
> > > > > > > > > > > to take anything special as the first `cc1` invocation 
> > > > > > > > > > > will exit with a non-zero status so that the driver stops 
> > > > > > > > > > > the compilation. For warnings, there seems to be no 
> > > > > > > > > > > mechanism in place as I see them duplicated, even in code 
> > > > > > > > > > > that is not generate for the target device (verified with 
> > > > > > > > > > > an unused variable).
> > > > > > > > > > > 
> > > > > > > > > > > @ABataev @gtbercea Do I miss something here?
> > > > > > > > > > I'm not aware of any.
> > > > > > > > > John, target-specific checks require some special flags (like 
> > > > > > > > > LangOpts.Cuda) that are not set when we re-compile the code 
> > > > > > > > > for OpenMP devices. That's why errors are not emitted for the 
> > > > > > > > > non-target code. But also because of that, we need some 
> > > > > > > > > special OpenMP checks for target-specific code inside the 
> > > > > > > > > target regions. For example, code in lib/Sema/SemaType.cpp, 
> > > > > > > > > lines 2184, 2185 (see this file in this patch) checks for 
> > > > > > > > > Cuda compilation and prohibits using of VLAs in Cuda mode. We 
> > > > > > > > > also should prohibit using of VLAs in target code for NVPTX 
> > > > > > > > > devices or other devices that do not support VLAs in OpenMP 
> > > > > > > > > mode.
> > > > > > > > I think it would be cleaner here, and better for our OpenMP 
> > > > > > > > support overall, if we found a more general way to suppress 
> > > > > > > > unwanted diagnostics in the second invocation for code outside 
> > > > > > > > of the target directive.  This check (and several others) would 
> > > > > > > > then just implement a more general target feature disabling VLA 
> > > > > > > > support instead of being awkwardly OpenMP-specific.
> > > > > > > I think to get this we would need to make `Diag` a no-op `if 
> > > > > > > (Context.getLangOpts().OpenMPIsDevice && 
> > > > > > > !(isInOpenMPDeclareTargetContext() || 
> > > > > > > isInOpenMPTargetExecutionDirective()))`. This would ignore all 
> > > > > > > diagnostics outside of the code is really generated in the end...
> > > > > > I mean, the danger of this approach is that you don't really want 
> > > > > > to suppress diagnostics for top-level declarations: it can leave 
> > > > > > you with an invalid AST, and it is not valid to generate IR from an 
> > > > > > invalid AST.
> > > > > > 
> > > > > > Sorry for the ignorant questions that follow, but I assume the 
> > > > > > OpenMP spec must bless this double-translation somehow, and I'd 
> > > > > > like to understand more about that in order to advise how to 
> > > > > > proceed.  How does OpenMP handle the possibility that the code will 
> > > > > > be processed substantially differently for different targets?  Is 
> > > > > > there some rule in the spec saying that the code has to expand "the 
> > > > > > same" in both targets?  How does that work when e.g. size_t might 
> > > > > > have a different size or use a completely different type?  More 
> > > > > > generally, how do expect that this feature will work in the more 
> > > > > > complicated language modes, like OpenMP + C++?
> > > > > So if there is an error, the analysis will already fail on the host. 
> > > > > I think that guarantees that we don't end up with an invalid AST and 
> > > > > will at most suppress duplicate warnings.
> > > > > 
> > > > > Regarding the OpenMP spec: I think the unsatisfying answer is that 
> > > > > the spec doesn't say what it expects on that questions. So I think 
> > > > > the compiler has to do what seems reasonable...
> > > > Well, what I'm worried about is the possibility that something changes 
> > > > about the translation unit when it's reprocessed for the target — e.g. 
> > > > there's a target-dependent #if that causes an error in the target TU, 
> > > > but not in the original TU, so that the suppressed error is the only 
> > > > reason that the build fails.
> > > > 
> > > > If the spec is unclear about this, then we just have to muddle through. 
> > > >  Is this "reparse the whole translation unit for the target" the 
> > > > prevailing implementation technique for target directives?
> > > If we are worried about that scenario we have the preserve the current 
> > > state: Do nothing, diagnose everything and let the user figure out if 
> > > there is an error in the code.
> > > 
> > > I can't really comment on what other compilers (GCC, Intel) do, but at 
> > > least for GCC you compile a complete compiler for the target, so I 
> > > suppose they kind of do the same...
> > John, reparsing is required to compile the target-specific code for the 
> > particular device. This is an intended behavior. Otherwise we may use some 
> > host-specific code on the device. Reparsing allows to reinclude all 
> > includes so that the code uses all the definitions for the target instead 
> > of those used for the host.
> > Intel and gcc are doing the same
> Okay, thanks, that's all I was asking.
> 
> I am still concerned that this is going to gradually grow to encompass 
> basically every "XX feature is disabled on this target" diagnostic.  We could 
> solve this more elegantly by tagging such diagnostics in the .td file and 
> then introducing a diagnostic filter that dropped such diagnostics except 
> when we're inside a target directive.
> 
> Even if you don't do that, I'm going to insist that you rewrite the test in 
> SemaType so that it actually fires in normal code if the target disables 
> VLAs.  The OpenMP-specific logic should be to suppress the diagnostic if we 
> happen to be in target mode and not within a target directive.
> 
> Also, it seems to me that the "are we inside a target directive" check is 
> inadequate to deal with lazy-code-emission features like static functions in 
> C and templates in C++.  The diagnostic must be emitted for a violation in a 
> function body if and only if the function is used from within the target 
> directive.  That is, unless there's some very coarse language restriction 
> like forbidding external uses in target directives?
> 
> Also, shouldn't the "are we in a target directive" be checking that we're in 
> a *matching* target directive?  Surely a single OpenMP file can contain 
> target directives for multiple external targets?
Ok, so going forward you suggest something like the following?
```lang=C++
  // Some targets don't support VLAs.
  if (!Context.getTargetInfo().isVLASupported() && T->isVariableArrayType() &&
      (!getLangOpts().OpenMPIsDevice || isInOpenMPDeclareTargetContext() ||
      isInOpenMPTargetExecutionDirective())) {
    Diag(Loc, diag::err_target_no_vla_support);
    return QualType();
  }
```

Doesn't lazy-code emission check the code semantically? Yes, the OpenMP 
standard (currently) mandates that all external functions that are called must 
be in a `declare target` region. And all regions in a file are compiled for all 
targets, so there shouldn't be a "non-matching" target directive.


https://reviews.llvm.org/D39505



_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D39505: [OpenMP] Show error if VLAs are not supported

Reply via email to