tianshilei1992 created this revision.
Herald added subscribers: dexonsmith, dang, guansong, yaxunl.
tianshilei1992 requested review of this revision.
Herald added a reviewer: jdoerfert.
Herald added subscribers: cfe-commits, sstefan1.
Herald added a project: clang.

Currently an OpenMP thread is mapped to a hardware thread. To support SIMD, an
OpenMP thread has to be mapped to a warp (wavefront) instead. This mapping has
to be determined when the kernel is launched, and the execution mode is encoded
in the `int8_t Mode` argument passed to `__kmpc_target_init`, which is
introduced in D110279 <https://reviews.llvm.org/D110279>. However, current Clang
CodeGen cannot determine whether `simd` is used and adjust `Mode` accordingly,
because the call to `__kmpc_target_init` is emitted before the body of the
target region.
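
To make the difference between the two mappings concrete, here is a minimal
sketch of the index arithmetic involved. It is not part of this patch and is
not how the device runtime is implemented; the `Mapping` struct, both function
names, and the `WarpSize` parameter are hypothetical and exist only for
illustration.

```cpp
#include <cstdint>

// Hypothetical result type: which OpenMP thread a hardware thread belongs to,
// and which SIMD lane it occupies inside that OpenMP thread.
struct Mapping {
  uint32_t OmpThreadId;
  uint32_t SimdLane;
};

// Existing mapping (assumed shape): one OpenMP thread per hardware thread,
// so there is only a single lane per OpenMP thread.
Mapping mapPerHardwareThread(uint32_t HwThreadId) {
  return {HwThreadId, 0};
}

// SIMD mapping (assumed shape): one OpenMP thread per warp/wavefront, with
// the hardware threads of that warp acting as its SIMD lanes.
// WarpSize must be non-zero.
Mapping mapPerWarp(uint32_t HwThreadId, uint32_t WarpSize) {
  return {HwThreadId / WarpSize, HwThreadId % WarpSize};
}
```

With a warp size of 32, hardware thread 70 would be lane 6 of OpenMP thread 2
under the per-warp sketch, whereas under the existing mapping it is simply
OpenMP thread 70.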

This patch adds a new Clang argument `-fopenmp-target-simd` to emit code that
supports the SIMD mapping. When this argument is set, the new mapping is always
used, regardless of whether the target region contains a `simd` directive. If it
is not set, or if `-fno-openmp-target-simd` is set, the existing mapping is used
and `simd` directives are ignored.
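
The resulting on/off behavior can be sketched outside of Clang as follows. The
`hasFlag` helper below is a hypothetical stand-in for
`llvm::opt::ArgList::hasFlag` (last occurrence wins, default when neither is
present), and `resolveOpenMPTargetSimd` is an illustrative name that mirrors
the expression assigned to `Opts.OpenMPTargetSimd` in
`CompilerInvocation.cpp`; neither function exists in the patch.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for llvm::opt::ArgList::hasFlag: the last occurrence
// of either spelling wins; if neither appears, Default is returned.
bool hasFlag(const std::vector<std::string> &Args, const std::string &Pos,
             const std::string &Neg, bool Default) {
  bool Result = Default;
  for (const std::string &A : Args) {
    if (A == Pos)
      Result = true;
    else if (A == Neg)
      Result = false;
  }
  return Result;
}

// Mirrors the patch: the option only takes effect when an offloading target
// is specified, and it is off by default.
bool resolveOpenMPTargetSimd(const std::vector<std::string> &Args,
                             bool IsTargetSpecified) {
  return IsTargetSpecified &&
         hasFlag(Args, "-fopenmp-target-simd", "-fno-openmp-target-simd",
                 /*Default=*/false);
}
```

In this sketch, passing both spellings resolves to whichever comes last on the
command line, and the result is always false when no offloading target is
specified.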

The reason we don't reuse `-fopenmp-simd` is that the CodeGen of device code
shares some implementation with host code. `-fopenmp-simd` already changes
CodeGen, and reusing it here could introduce unexpected side effects.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D110286

Files:
  clang/docs/ClangCommandLineReference.rst
  clang/include/clang/Basic/LangOptions.def
  clang/include/clang/Driver/Options.td
  clang/lib/Frontend/CompilerInvocation.cpp

Index: clang/lib/Frontend/CompilerInvocation.cpp
===================================================================
--- clang/lib/Frontend/CompilerInvocation.cpp
+++ clang/lib/Frontend/CompilerInvocation.cpp
@@ -3782,8 +3782,8 @@
   // '-mignore-xcoff-visibility' is implied. The generated command line will
   // contain both '-fvisibility default' and '-mignore-xcoff-visibility' and
   // subsequent calls to `CreateFromArgs`/`generateCC1CommandLine` will always
-  // produce the same arguments. 
- 
+  // produce the same arguments.
+
   if (T.isOSAIX() && (Args.hasArg(OPT_mignore_xcoff_visibility) ||
                       !Args.hasArg(OPT_fvisibility)))
     Opts.IgnoreXCOFFVisibility = 1;
@@ -3863,6 +3863,10 @@
       Opts.OpenMP && Args.hasArg(options::OPT_fopenmp_enable_irbuilder);
   bool IsTargetSpecified =
       Opts.OpenMPIsDevice || Args.hasArg(options::OPT_fopenmp_targets_EQ);
+  Opts.OpenMPTargetSimd =
+      IsTargetSpecified &&
+      Args.hasFlag(options::OPT_fopenmp_target_simd,
+                   options::OPT_fno_openmp_target_simd, /*Default=*/false);
   Opts.OpenMPTargetNewRuntime =
       Opts.OpenMPIsDevice &&
       Args.hasArg(options::OPT_fopenmp_target_new_runtime);
Index: clang/include/clang/Driver/Options.td
===================================================================
--- clang/include/clang/Driver/Options.td
+++ clang/include/clang/Driver/Options.td
@@ -1805,7 +1805,7 @@
 
 defm protect_parens : BoolFOption<"protect-parens",
   LangOpts<"ProtectParens">, DefaultFalse,
-  PosFlag<SetTrue, [CoreOption, CC1Option], 
+  PosFlag<SetTrue, [CoreOption, CC1Option],
           "Determines whether the optimizer honors parentheses when "
           "floating-point expressions are evaluated">,
   NegFlag<SetFalse>>;
@@ -2406,9 +2406,12 @@
   Group<f_Group>, Flags<[CC1Option, NoArgumentUnused, HelpHidden]>;
 def fopenmp_simd : Flag<["-"], "fopenmp-simd">, Group<f_Group>, Flags<[CC1Option, NoArgumentUnused]>,
   HelpText<"Emit OpenMP code only for SIMD-based constructs.">;
+def fopenmp_target_simd : Flag<["-"], "fopenmp-target-simd">, Group<f_Group>, Flags<[CC1Option, NoArgumentUnused]>,
+  HelpText<"Emit OpenMP target offloading code that supports SIMD execution.">;
 def fopenmp_enable_irbuilder : Flag<["-"], "fopenmp-enable-irbuilder">, Group<f_Group>, Flags<[CC1Option, NoArgumentUnused, HelpHidden]>,
   HelpText<"Use the experimental OpenMP-IR-Builder codegen path.">;
 def fno_openmp_simd : Flag<["-"], "fno-openmp-simd">, Group<f_Group>, Flags<[CC1Option, NoArgumentUnused]>;
+def fno_openmp_target_simd : Flag<["-"], "fno-openmp-target-simd">, Group<f_Group>, Flags<[CC1Option, NoArgumentUnused]>;
 def fopenmp_cuda_mode : Flag<["-"], "fopenmp-cuda-mode">, Group<f_Group>,
   Flags<[CC1Option, NoArgumentUnused, HelpHidden]>;
 def fno_openmp_cuda_mode : Flag<["-"], "fno-openmp-cuda-mode">, Group<f_Group>,
@@ -2548,7 +2551,7 @@
   ShouldParseIf<!strconcat("!", open_cl.KeyPath)>;
 defm split_stack : BoolFOption<"split-stack",
   CodeGenOpts<"EnableSegmentedStacks">, DefaultFalse,
-  NegFlag<SetFalse, [], "Wouldn't use segmented stack">, 
+  NegFlag<SetFalse, [], "Wouldn't use segmented stack">,
   PosFlag<SetTrue, [CC1Option], "Use segmented stack">>;
 def fstack_protector_all : Flag<["-"], "fstack-protector-all">, Group<f_Group>,
   HelpText<"Enable stack protectors for all functions">;
@@ -4575,7 +4578,7 @@
   HelpText<"Enable the old style PARAMETER statement">;
 def fintrinsic_modules_path : Separate<["-"], "fintrinsic-modules-path">,  Group<f_Group>, MetaVarName<"<dir>">,
   HelpText<"Specify where to find the compiled intrinsic modules">,
-  DocBrief<[{This option specifies the location of pre-compiled intrinsic modules, 
+  DocBrief<[{This option specifies the location of pre-compiled intrinsic modules,
   if they are not in the default location expected by the compiler.}]>;
 
 defm backslash : OptInFC1FFlag<"backslash", "Specify that backslash in string introduces an escape character">;
Index: clang/include/clang/Basic/LangOptions.def
===================================================================
--- clang/include/clang/Basic/LangOptions.def
+++ clang/include/clang/Basic/LangOptions.def
@@ -233,6 +233,7 @@
 LANGOPT(OpenMP            , 32, 0, "OpenMP support and version of OpenMP (31, 40 or 45)")
 LANGOPT(OpenMPExtensions  , 1, 1, "Enable all Clang extensions for OpenMP directives and clauses")
 LANGOPT(OpenMPSimd        , 1, 0, "Use SIMD only OpenMP support.")
+LANGOPT(OpenMPTargetSimd  , 1, 0, "Use OpenMP target offloading SIMD support.")
 LANGOPT(OpenMPUseTLS      , 1, 0, "Use TLS for threadprivates or runtime calls")
 LANGOPT(OpenMPIsDevice    , 1, 0, "Generate code only for OpenMP target device")
 LANGOPT(OpenMPCUDAMode    , 1, 0, "Generate code for OpenMP pragmas in SIMT/SPMD mode")
@@ -425,7 +426,7 @@
 
 LANGOPT(ArmSveVectorBits, 32, 0, "SVE vector size in bits")
 
-ENUM_LANGOPT(ExtendIntArgs, ExtendArgsKind, 1, ExtendArgsKind::ExtendTo32, 
+ENUM_LANGOPT(ExtendIntArgs, ExtendArgsKind, 1, ExtendArgsKind::ExtendTo32,
              "Controls how scalar integer arguments are extended in calls "
              "to unprototyped and varargs functions")
 
Index: clang/docs/ClangCommandLineReference.rst
===================================================================
--- clang/docs/ClangCommandLineReference.rst
+++ clang/docs/ClangCommandLineReference.rst
@@ -2037,6 +2037,10 @@
 
 Emit OpenMP code only for SIMD-based constructs.
 
+.. option:: -fopenmp-target-simd, -fno-openmp-target-simd
+
+Emit OpenMP target offloading code that supports SIMD execution.
+
 .. option:: -fopenmp-version=<arg>
 
 .. option:: -fopenmp-extensions, -fno-openmp-extensions
_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
