jhuber6 added inline comments.
================ Comment at: clang/lib/CodeGen/TargetInfo.cpp:9436 + CGM.getModule(), Type, true, + llvm::GlobalValue::LinkageTypes::LinkOnceODRLinkage, + llvm::ConstantInt::get(Type, Value), Name, nullptr, ---------------- yaxunl wrote: > jhuber6 wrote: > > yaxunl wrote: > > > jhuber6 wrote: > > > > yaxunl wrote: > > > > > yaxunl wrote: > > > > > > jhuber6 wrote: > > > > > > > yaxunl wrote: > > > > > > > > This does not support per-TU control variables. Probably should > > > > > > > > use internal linkage. > > > > > > > The AMDGPU device libraries use `linkone_odr` so I figured it was > > > > > > > the most appropriate here. It should mean that we can have > > > > > > > multiple identical definitions and they don't clash. There's also > > > > > > > no requirement for these to be emitted as symbols AFAIK. > > > > > > > The AMDGPU device libraries use `linkone_odr` so I figured it was > > > > > > > the most appropriate here. It should mean that we can have > > > > > > > multiple identical definitions and they don't clash. There's also > > > > > > > no requirement for these to be emitted as symbols AFAIK. > > > > > > > > > > > > clang uses -mlink-builtin-bitcode to link these device libraries > > > > > > for HIP and OpenCL. Then the linkage of these variables becomes > > > > > > internal linkage. That's why it works for per-TU control. > > > > > > > The AMDGPU device libraries use `linkone_odr` so I figured it was > > > > > > > the most appropriate here. It should mean that we can have > > > > > > > multiple identical definitions and they don't clash. There's also > > > > > > > no requirement for these to be emitted as symbols AFAIK. > > > > > > > > > > > > clang uses -mlink-builtin-bitcode to link these device libraries > > > > > > for HIP and OpenCL. Then the linkage of these variables becomes > > > > > > internal linkage. That's why it works for per-TU control. > > > > > > > > > > You may let HIP and OpenCL use internal linkage and C/C++/OpenMP use > > > > > linkonce_odr since only HIP and OpenCL toolchain use > > > > > -mlink-builtin-bitcode to link these device libraries > > > > I see, `linkonce_odr` implies that these should all have the same value > > > > which isn't necessarily true after linking. I'll change it to use > > > > private linkage. > > > > > > > > OpenMP right now links everything late which means that we don't allow > > > > these to be defined differently per-TU. This may be incorrect given > > > > this new method as each TU will have different things set. I can change > > > > OpenMP to use the `mlink` method after this patch which may be more > > > > strictly correct. > > > > I see, `linkonce_odr` implies that these should all have the same value > > > > which isn't necessarily true after linking. I'll change it to use > > > > private linkage. > > > > > > > > OpenMP right now links everything late which means that we don't allow > > > > these to be defined differently per-TU. This may be incorrect given > > > > this new method as each TU will have different things set. I can change > > > > OpenMP to use the `mlink` method after this patch which may be more > > > > strictly correct. > > > > > > On second thoughts, the idea for letting clang to emit these control > > > variables might not work for HIP and OpenCL. The reason is that to > > > support per-TU control variables, these variables need to be internal or > > > private linkage, however, that means they cannot be used by other device > > > library functions which are expecting non-internal linkage for them. > > > Those device library functions will end up using control variables from > > > device library bitcode any way. > > > > > > For OpenMP, it may be necessary to emit them as linkonce_odr, otherwise > > > device library functions may not find them. > > > On second thoughts, the idea for letting clang to emit these control > > > variables might not work for HIP and OpenCL. The reason is that to > > > support per-TU control variables, these variables need to be internal or > > > private linkage, however, that means they cannot be used by other device > > > library functions which are expecting non-internal linkage for them. > > > Those device library functions will end up using control variables from > > > device library bitcode any way. > > > > Right now we include each file per-TU using `-mlink-builtin-bitcode` which > > converts `linkonce_odr` to `private` linkage. Shouldn't this be equivalent? > > It may be possible to make some test showing a user of these constants to > > verify they get picked up correctly. If you're worried about these getting > > removed we may be able to stash them in `compiler.used`, that shouldn't > > impede the necessary constant propagation. > > > > Side note, OpenCL seems to optimize these out without > > `-disable-llvm-optzns` while HIP will not. Does OpenCL use some mandatory > > passes to ensure that these control variables get handled? This method of > > using control constants in general is somewhat problematic as it hides > > invalid code behind some mandatory CP and DCE passes. For OpenMP right now > > we just generate one version for each architecture, which is wasteful but > > somewhat easier to work with. > > > > > On second thoughts, the idea for letting clang to emit these control > > > variables might not work for HIP and OpenCL. The reason is that to > > > support per-TU control variables, these variables need to be internal or > > > private linkage, however, that means they cannot be used by other device > > > library functions which are expecting non-internal linkage for them. > > > Those device library functions will end up using control variables from > > > device library bitcode any way. > > > > Right now we include each file per-TU using `-mlink-builtin-bitcode` which > > converts `linkonce_odr` to `private` linkage. Shouldn't this be equivalent? > > It may be possible to make some test showing a user of these constants to > > verify they get picked up correctly. If you're worried about these getting > > removed we may be able to stash them in `compiler.used`, that shouldn't > > impede the necessary constant propagation. > > > > Let's assume the main program calls `foo()` and `foo()` uses a control > variable `bar`. `foo()` is in a bitcode linked in by -mlink-builtin-bitcode. > clang emits the control variable `bar` with private linkage in the main > module. When clang tries to link `foo()`, it needs to resolve `bar`, but it > cannot use the `bar` in the main module because `bar` has private linkage. > Then `bar` becomes unresolved. > > > Side note, OpenCL seems to optimize these out without > > `-disable-llvm-optzns` while HIP will not. Does OpenCL use some mandatory > > passes to ensure that these control variables get handled? This method of > > using control constants in general is somewhat problematic as it hides > > invalid code behind some mandatory CP and DCE passes. For OpenMP right now > > we just generate one version for each architecture, which is wasteful but > > somewhat easier to work with. > > > > Are you using clang -cc1 without other options? There are LLVM passes by > default, but they should not depend on language. You can see which pass is > removing them by -mllvm -print-after-all. > >Let's assume the main program calls foo() and foo() uses a control variable >bar. foo() is in a bitcode linked in by -mlink-builtin-bitcode. clang emits >the control variable bar with private linkage in the main module. When clang >tries to link foo(), it needs to resolve bar, but it cannot use the bar in the >main module because bar has private linkage. Then bar becomes unresolved. This is a good point, we link only used definitions when using `-mlink-builtin-bitcode`. I think we link via `-mlink-builtin-bitcode` prior to running the backend, so this will be after we create these definitions. In this case we will import a definition like the following: ``` @__oclc_wavefrontsize64 = external local_unnamed_addr addrspace(4) constant i8, align 1 ``` which has the same name, but cannot bind to the `private` variable. I think this is what `linkonce` linkage is supposed to provide, but I'm not overly familiar with the semantics. > Are you using clang -cc1 without other options? There are LLVM passes by > default, but they should not depend on language. You can see which pass is > removing them by -mllvm -print-after-all. The sample tests in this patch show the `-x hip` version does not require `-disable-llvm-optzns` while the `-x cl` version does. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D130096/new/ https://reviews.llvm.org/D130096 _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits