shiltian wrote:
> the target regions are just outlined, so it shouldn't affect anything on a
> codegen level.
No, they are not. The standard defines the execution behavior and codegen has
to conform with it. The current GPU CodeGen in this discussion assumes it is
generating for constructs _i
jhuber6 wrote:
> > All that said, there are two cases to consider wrt. the standard:
> >
> > 1. The initial device is the CPU and the code compiled here is just part of
> > a GPU library, or
> > 2. the initial device is the GPU and the code compiled here is just part of
> > the "host code".
>
kparzysz wrote:
> All that said, there are two cases to consider wrt. the standard:
>
> 1. The initial device is the CPU and the code compiled here is just part of a
> GPU library, or
> 2. the initial device is the GPU and the code compiled here is just part of
> the "host code".
>
> For 1),
https://github.com/jhuber6 updated
https://github.com/llvm/llvm-project/pull/122149
From 3329b7ae7dc6044f6563f218c65f6af7498290f0 Mon Sep 17 00:00:00 2001
From: Joseph Huber
Date: Wed, 8 Jan 2025 12:19:53 -0600
Subject: [PATCH 1/3] [OpenMP] Allow GPUs to be targeted directly via
`-fopenmp`.
jhuber6 wrote:
> Do we have a clear idea on if a construct can behave in a different manner if
> it is nested in a target region?
Unsure exactly, the target regions are just outlined, so it shouldn't affect
anything on a codegen level.
https://github.com/llvm/llvm-project/pull/122149
https://github.com/shiltian commented:
Do we have a clear idea on if a construct can behave in a different manner if
it is nested in a target region?
https://github.com/llvm/llvm-project/pull/122149
___
cfe-commits mailing list
cfe-commits@lists.llvm.
@@ -1312,6 +1309,19 @@ void CGOpenMPRuntimeGPU::emitBarrierCall(CodeGenFunction &CGF,
Args);
}
+void CGOpenMPRuntimeGPU::emitTargetCall(
+CodeGenFunction &CGF, const OMPExecutableDirective &D,
+llvm::Function *OutlinedFn, llvm::Value *OutlinedFnI
https://github.com/jhuber6 updated
https://github.com/llvm/llvm-project/pull/122149
From 3329b7ae7dc6044f6563f218c65f6af7498290f0 Mon Sep 17 00:00:00 2001
From: Joseph Huber
Date: Wed, 8 Jan 2025 12:19:53 -0600
Subject: [PATCH 1/2] [OpenMP] Allow GPUs to be targeted directly via
`-fopenmp`.
jdoerfert wrote:
> I get it, but that doesn't look like the case. If you look at the test case,
> the `target` region in `bar` is simply ignored. To me this looks like
> treating the entire TU being wrapped into a giant target region instead of
> compiling for host.
That is a good point. I th
shiltian wrote:
> To me this looks like compilation for a host, except the GPU is the host. The
> only functions that could be called from such a CU would be the top-level
> ones, not any of the auto-generated ones.
>
> Additionally, the host wouldn't support offload, so we'd need to do somethi
jdoerfert wrote:
> I see it in a different way. `#pragma omp target parallel` (let's just assume
> this is valid code) is different from `#pragma omp parallel`, no matter what
> target is. However, this patch is to say, when targeting a GPU, `#pragma omp
> parallel` **is** `#pragma omp target
kparzysz wrote:
To me this looks like compilation for a host, except the GPU is the host. The
only functions that could be called from such a CU would be the top-level ones,
not any of the auto-generated ones.
Additionally, the host wouldn't support offload, so we'd need to do something
about
jhuber6 wrote:
I guess I'll see how much I favor this approach depending on how much more
difficult it is to build the DeviceRTL without OpenMP. I think the only thing
we'd miss is the `#pragma omp assumes(...)` business, which might have another
way to be emitted in LLVM/Clang?
https://githu
https://github.com/jdoerfert approved this pull request.
> We can't expect to have regular OpenMP code working in the same way as OpenMP
> offloading code when targeting a GPU meanwhile the code is not wrapped into
> target region or declare target
The way I see this is:
If the target is a GPU
shiltian wrote:
> It should maintain the normal semantics you'd get with -fopenmp except it
> codegens certain things differently.
That is the key difference.
> Alternatively I could just remove OpenMP entirely from the DeviceRTL so I
> might just do that instead.
+1
https://github.com/llvm
jhuber6 wrote:
> Maybe just turn on OpenMPIsTargetDevice if `gpu target + -fopenmp` is
> specified?
Doesn't work, it causes all definitions to be stripped as they are not declared
on the device, which is not what we want.
https://github.com/llvm/llvm-project/pull/122149
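A minimal sketch of the distinction this reply relies on, assuming the behavior is as described there: in device-only mode, a definition that is not declared for the device would be stripped, while one inside a `declare target` block would be kept. The function names are hypothetical and do not come from the patch.

```c
/* Hypothetical example, not taken from the patch. */

/* No device annotation: per the reply above, forcing device-only mode
 * would strip this definition from the output. */
int scale_plain(int x) { return 2 * x; }

/* Explicitly declared for the device, so it would be kept. */
#pragma omp declare target
int scale_device(int x) { return 2 * x; }
#pragma omp end declare target
```

Built for the host without `-fopenmp`, the pragmas are ignored and both functions behave identically, which is why the annotation only matters for what gets emitted, not for what the code computes.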
shiltian wrote:
I think that is a misuse of OpenMP semantics. We can't expect to have regular
OpenMP code working in the same way as OpenMP offloading code when targeting a
GPU while the code is not wrapped into a `target` region. I understand
to have variants and declare target is not
jhuber6 wrote:
> I think that is a misuse of OpenMP semantics. We can't expect to have regular
> OpenMP code working in the same way as OpenMP offloading code when targeting
> a GPU meanwhile the code is not wrapped into `target` region or declare
> target. I understand to have variants and de
jhuber6 wrote:
> ~I don't think it should be GPU code generation path as there is no explicit
> `target` region used.~ Probably I missed something here. Do you expect
> regular OpenMP stuff such as `parallel` region to be emitted in the same way
> as offloading code?
Yes, the example in the d
https://github.com/jhuber6 updated
https://github.com/llvm/llvm-project/pull/122149
From 3329b7ae7dc6044f6563f218c65f6af7498290f0 Mon Sep 17 00:00:00 2001
From: Joseph Huber
Date: Wed, 8 Jan 2025 12:19:53 -0600
Subject: [PATCH] [OpenMP] Allow GPUs to be targeted directly via `-fopenmp`.
Summa
jhuber6 wrote:
> I don't think it should be GPU code generation path as there is no explicit
> `target` region used.
It needs to be; otherwise the code generation for things like `#pragma omp
parallel` will be wrong. The way I see it, the DeviceRTL is `libomp.a` for the
GPU target, so we need
shiltian wrote:
I don't think it should be GPU code generation path as there is no explicit
`target` region used.
https://github.com/llvm/llvm-project/pull/122149
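For reference, the kind of source file under discussion might look like the sketch below: ordinary OpenMP code with a `parallel` construct and no explicit `target` region. The function name and the reduction are illustrative assumptions; the thread does not say which constructs the patch handles.

```c
/* Hypothetical helper, not from the patch: sums an array using a plain
 * `parallel for` with no enclosing `target` region. */
int sum_array(const int *data, int n) {
  int sum = 0;
#pragma omp parallel for reduction(+ : sum)
  for (int i = 0; i < n; ++i)
    sum += data[i];
  return sum;
}
```

The question in the thread is which code generation path such a file should take when compiled with the command from the PR summary (`clang --target=amdgcn-amd-amdhsa -fopenmp`), given that nothing in the source asks for offloading.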
alexey-bataev wrote:
> What code generation path would be used in this case? The GPU code generation
> or regular host OpenMP?
GPU device code.
https://github.com/llvm/llvm-project/pull/122149
jhuber6 wrote:
> What code generation path would be used in this case? The GPU code generation
> or regular host OpenMP?
The GPU path, I'm treating that as the code generation path that created
correct runtime code for the GPU. I.e. you can link it with your OpenMP
offloading program and it'l
shiltian wrote:
What code generation path would be used in this case? The GPU code generation
or regular host OpenMP?
https://github.com/llvm/llvm-project/pull/122149
jhuber6 wrote:
> Maybe just turn on OpenMPIsTargetDevice if `gpu target + -fopenmp` is
> specified?
I'll give it a try.
https://github.com/llvm/llvm-project/pull/122149
alexey-bataev wrote:
Maybe just turn on OpenMPIsTargetDevice if `gpu target + -fopenmp` is specified?
https://github.com/llvm/llvm-project/pull/122149
llvmbot wrote:
@llvm/pr-subscribers-clang
Author: Joseph Huber (jhuber6)
Changes
Summary:
Currently we prevent the following from working. However, it is
completely reasonable to be able to target the files individually.
```
$ clang --target=amdgcn-amd-amdhsa -fopenmp
```
This patch lifts
https://github.com/jhuber6 created
https://github.com/llvm/llvm-project/pull/122149
Summary:
Currently we prevent the following from working. However, it is
completely reasonable to be able to target the files individually.
```
$ clang --target=amdgcn-amd-amdhsa -fopenmp
```
This patch lifts th