Hahnfeld added inline comments.
================ Comment at: include/clang/Basic/TargetInfo.h:944 + /// \brief Whether target supports variable-length arrays. + bool isVLASupported() const { return VLASupported; } + ---------------- rjmccall wrote: > ABataev wrote: > > Hahnfeld wrote: > > > rjmccall wrote: > > > > Hahnfeld wrote: > > > > > rjmccall wrote: > > > > > > Hahnfeld wrote: > > > > > > > rjmccall wrote: > > > > > > > > ABataev wrote: > > > > > > > > > ABataev wrote: > > > > > > > > > > Hahnfeld wrote: > > > > > > > > > > > rjmccall wrote: > > > > > > > > > > > > Hahnfeld wrote: > > > > > > > > > > > > > rjmccall wrote: > > > > > > > > > > > > > > Hahnfeld wrote: > > > > > > > > > > > > > > > rjmccall wrote: > > > > > > > > > > > > > > > > The way you've written this makes it sound like > > > > > > > > > > > > > > > > "does the target support VLAs?", but the actual > > > > > > > > > > > > > > > > semantic checks treat it as "do OpenMP devices > > > > > > > > > > > > > > > > on this target support VLAs?" Maybe there > > > > > > > > > > > > > > > > should be a more specific way to query things > > > > > > > > > > > > > > > > about OpenMP devices instead of setting a > > > > > > > > > > > > > > > > global flag for the target? > > > > > > > > > > > > > > > Actually, the NVPTX and SPIR targets never > > > > > > > > > > > > > > > support VLAs. So I felt like it would be more > > > > > > > > > > > > > > > correct to make this a global property of the > > > > > > > > > > > > > > > target. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > The difference is that the other programming > > > > > > > > > > > > > > > models (OpenCL and CUDA) error out immediatelyand > > > > > > > > > > > > > > > regardless of the target because this limitation > > > > > > > > > > > > > > > is reflected in the standards that disallow VLAs > > > > > > > > > > > > > > > (see SemaType.cpp). For OpenMP we might have > > > > > > > > > > > > > > > target devices that support VLA so we shouldn't > > > > > > > > > > > > > > > error out for those. > > > > > > > > > > > > > > If you want to make it a global property of the > > > > > > > > > > > > > > target, that's fine, but then I don't understand > > > > > > > > > > > > > > why your diagnostic only fires when > > > > > > > > > > > > > > (S.isInOpenMPDeclareTargetContext() || > > > > > > > > > > > > > > S.isInOpenMPTargetExecutionDirective()). > > > > > > > > > > > > > That is because of how OpenMP offloading works and > > > > > > > > > > > > > how it is implemented in Clang. Consider the > > > > > > > > > > > > > following snippet from the added test case: > > > > > > > > > > > > > ```lang=c > > > > > > > > > > > > > int vla[arg]; > > > > > > > > > > > > > > > > > > > > > > > > > > #pragma omp target map(vla[0:arg]) > > > > > > > > > > > > > { > > > > > > > > > > > > > // more code here... > > > > > > > > > > > > > } > > > > > > > > > > > > > ``` > > > > > > > > > > > > > > > > > > > > > > > > > > Clang will take the following steps to compile this > > > > > > > > > > > > > into a working binary for a GPU: > > > > > > > > > > > > > 1. Parse and (semantically) analyze the code as-is > > > > > > > > > > > > > for the host and produce LLVM Bitcode. > > > > > > > > > > > > > 2. Parse and analyze again the code as-is and > > > > > > > > > > > > > generate code for the offloading target, the GPU in > > > > > > > > > > > > > this case. > > > > > > > > > > > > > 3. Take LLVM Bitcode from 1., generate host binary > > > > > > > > > > > > > and embed target binary from 3. > > > > > > > > > > > > > > > > > > > > > > > > > > `OpenMPIsDevice` will be true for 2., but the > > > > > > > > > > > > > complete source code is analyzed. So to not throw > > > > > > > > > > > > > errors for the host code, we have to make sure that > > > > > > > > > > > > > we are actually generating code for the target > > > > > > > > > > > > > device. This is either in a `target` directive or in > > > > > > > > > > > > > a `declare target` region. > > > > > > > > > > > > > Note that this is quite similar to what CUDA does, > > > > > > > > > > > > > only they have `CUDADiagIfDeviceCode` for this logic. > > > > > > > > > > > > > If you want me to add something of that kind for > > > > > > > > > > > > > OpenMP target devices, I'm fine with that. However > > > > > > > > > > > > > for the given case, it's a bit different because this > > > > > > > > > > > > > error should only be thrown for target devices that > > > > > > > > > > > > > don't support VLAs... > > > > > > > > > > > > I see. So the entire translation unit is re-parsed and > > > > > > > > > > > > re-Sema'ed from scratch for the target? Which means > > > > > > > > > > > > you need to avoid generating errors about things in the > > > > > > > > > > > > outer translation unit that aren't part of the target > > > > > > > > > > > > directive that you actually want to compile. I > > > > > > > > > > > > would've expected there to be some existing mechanism > > > > > > > > > > > > for that, to be honest, as opposed to explicitly trying > > > > > > > > > > > > to suppress target-specific diagnostics one by one. > > > > > > > > > > > Yes, that is my understanding. For errors, we don't need > > > > > > > > > > > to take anything special as the first `cc1` invocation > > > > > > > > > > > will exit with a non-zero status so that the driver stops > > > > > > > > > > > the compilation. For warnings, there seems to be no > > > > > > > > > > > mechanism in place as I see them duplicated, even in code > > > > > > > > > > > that is not generate for the target device (verified with > > > > > > > > > > > an unused variable). > > > > > > > > > > > > > > > > > > > > > > @ABataev @gtbercea Do I miss something here? > > > > > > > > > > I'm not aware of any. > > > > > > > > > John, target-specific checks require some special flags (like > > > > > > > > > LangOpts.Cuda) that are not set when we re-compile the code > > > > > > > > > for OpenMP devices. That's why errors are not emitted for the > > > > > > > > > non-target code. But also because of that, we need some > > > > > > > > > special OpenMP checks for target-specific code inside the > > > > > > > > > target regions. For example, code in lib/Sema/SemaType.cpp, > > > > > > > > > lines 2184, 2185 (see this file in this patch) checks for > > > > > > > > > Cuda compilation and prohibits using of VLAs in Cuda mode. We > > > > > > > > > also should prohibit using of VLAs in target code for NVPTX > > > > > > > > > devices or other devices that do not support VLAs in OpenMP > > > > > > > > > mode. > > > > > > > > I think it would be cleaner here, and better for our OpenMP > > > > > > > > support overall, if we found a more general way to suppress > > > > > > > > unwanted diagnostics in the second invocation for code outside > > > > > > > > of the target directive. This check (and several others) would > > > > > > > > then just implement a more general target feature disabling VLA > > > > > > > > support instead of being awkwardly OpenMP-specific. > > > > > > > I think to get this we would need to make `Diag` a no-op `if > > > > > > > (Context.getLangOpts().OpenMPIsDevice && > > > > > > > !(isInOpenMPDeclareTargetContext() || > > > > > > > isInOpenMPTargetExecutionDirective()))`. This would ignore all > > > > > > > diagnostics outside of the code is really generated in the end... > > > > > > I mean, the danger of this approach is that you don't really want > > > > > > to suppress diagnostics for top-level declarations: it can leave > > > > > > you with an invalid AST, and it is not valid to generate IR from an > > > > > > invalid AST. > > > > > > > > > > > > Sorry for the ignorant questions that follow, but I assume the > > > > > > OpenMP spec must bless this double-translation somehow, and I'd > > > > > > like to understand more about that in order to advise how to > > > > > > proceed. How does OpenMP handle the possibility that the code will > > > > > > be processed substantially differently for different targets? Is > > > > > > there some rule in the spec saying that the code has to expand "the > > > > > > same" in both targets? How does that work when e.g. size_t might > > > > > > have a different size or use a completely different type? More > > > > > > generally, how do expect that this feature will work in the more > > > > > > complicated language modes, like OpenMP + C++? > > > > > So if there is an error, the analysis will already fail on the host. > > > > > I think that guarantees that we don't end up with an invalid AST and > > > > > will at most suppress duplicate warnings. > > > > > > > > > > Regarding the OpenMP spec: I think the unsatisfying answer is that > > > > > the spec doesn't say what it expects on that questions. So I think > > > > > the compiler has to do what seems reasonable... > > > > Well, what I'm worried about is the possibility that something changes > > > > about the translation unit when it's reprocessed for the target — e.g. > > > > there's a target-dependent #if that causes an error in the target TU, > > > > but not in the original TU, so that the suppressed error is the only > > > > reason that the build fails. > > > > > > > > If the spec is unclear about this, then we just have to muddle through. > > > > Is this "reparse the whole translation unit for the target" the > > > > prevailing implementation technique for target directives? > > > If we are worried about that scenario we have the preserve the current > > > state: Do nothing, diagnose everything and let the user figure out if > > > there is an error in the code. > > > > > > I can't really comment on what other compilers (GCC, Intel) do, but at > > > least for GCC you compile a complete compiler for the target, so I > > > suppose they kind of do the same... > > John, reparsing is required to compile the target-specific code for the > > particular device. This is an intended behavior. Otherwise we may use some > > host-specific code on the device. Reparsing allows to reinclude all > > includes so that the code uses all the definitions for the target instead > > of those used for the host. > > Intel and gcc are doing the same > Okay, thanks, that's all I was asking. > > I am still concerned that this is going to gradually grow to encompass > basically every "XX feature is disabled on this target" diagnostic. We could > solve this more elegantly by tagging such diagnostics in the .td file and > then introducing a diagnostic filter that dropped such diagnostics except > when we're inside a target directive. > > Even if you don't do that, I'm going to insist that you rewrite the test in > SemaType so that it actually fires in normal code if the target disables > VLAs. The OpenMP-specific logic should be to suppress the diagnostic if we > happen to be in target mode and not within a target directive. > > Also, it seems to me that the "are we inside a target directive" check is > inadequate to deal with lazy-code-emission features like static functions in > C and templates in C++. The diagnostic must be emitted for a violation in a > function body if and only if the function is used from within the target > directive. That is, unless there's some very coarse language restriction > like forbidding external uses in target directives? > > Also, shouldn't the "are we in a target directive" be checking that we're in > a *matching* target directive? Surely a single OpenMP file can contain > target directives for multiple external targets? Ok, so going forward you suggest something like the following? ```lang=C++ // Some targets don't support VLAs. if (!Context.getTargetInfo().isVLASupported() && T->isVariableArrayType() && (!getLangOpts().OpenMPIsDevice || isInOpenMPDeclareTargetContext() || isInOpenMPTargetExecutionDirective())) { Diag(Loc, diag::err_target_no_vla_support); return QualType(); } ``` Doesn't lazy-code emission check the code semantically? Yes, the OpenMP standard (currently) mandates that all external functions that are called must be in a `declare target` region. And all regions in a file are compiled for all targets, so there shouldn't be a "non-matching" target directive. https://reviews.llvm.org/D39505 _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits