Issue 90761
Summary [Offload][OpenMP] Record-Replay not functioning - failure to allocate memory
Labels new issue
Assignees
Reporter nmustakin
OpenMP offload recording fails to allocate memory. The plugin keeps requesting 0 bytes instead of the size preset via `LIBOMPTARGET_RR_DEVMEM_SIZE`.

For example, when running
`LIBOMPTARGET_DEBUG=1 LIBOMPTARGET_RR_DEVMEM_SIZE=4 LIBOMPTARGET_RR_SAVE_OUTPUT=1 OMP_TARGET_OFFLOAD=mandatory LIBOMPTARGET_NEXTGEN_PLUGINS=1 LIBOMPTARGET_RECORD=1 nvprof ./lulesh`, the output shows:

```
TARGET CUDA RTL --> The primary context is inactive, set its flags to CU_CTX_SCHED_BLOCKING_SYNC
PluginInterface --> Request 0 bytes allocated at (nil)
PluginInterface --> WARNING VA mapping failed, fallback to heuristic: (Error: Memory Map Size must be larger than 0)
TARGET CUDA RTL --> Failure to alloc memory: Error in cuMemAlloc[Host|Managed]: out of memory
PluginInterface --> Allocated 14581039104 bytes at 0x7f33ce000000 for replay.
PluginInterface --> Record Replay Initialized (0x7f33ce000000) as starting address, 14581039104 Memory Size and set on status Recording
TARGET CUDA RTL --> The primary context is inactive, set its flags to CU_CTX_SCHED_BLOCKING_SYNC
PluginInterface --> Request 0 bytes allocated at (nil)
PluginInterface --> WARNING VA mapping failed, fallback to heuristic: (Error: Memory Map Size must be larger than 0)
TARGET CUDA RTL --> Failure to alloc memory: Error in cuMemAlloc[Host|Managed]: out of memory
PluginInterface --> Allocated 14581039104 bytes at 0x7f305c000000 for replay.
PluginInterface --> Record Replay Initialized (0x7f305c000000) as starting address, 14581039104 Memory Size and set on status Recording
TARGET CUDA RTL --> The primary context is inactive, set its flags to CU_CTX_SCHED_BLOCKING_SYNC
PluginInterface --> Request 0 bytes allocated at (nil)
PluginInterface --> WARNING VA mapping failed, fallback to heuristic: (Error: Memory Map Size must be larger than 0)
TARGET CUDA RTL --> Failure to alloc memory: Error in cuMemAlloc[Host|Managed]: out of memory
PluginInterface --> Allocated 14581039104 bytes at 0x7f2cea000000 for replay.
PluginInterface --> Record Replay Initialized (0x7f2cea000000) as starting address, 14581039104 Memory Size and set on status Recording
TARGET CUDA RTL --> The primary context is inactive, set its flags to CU_CTX_SCHED_BLOCKING_SYNC
PluginInterface --> Request 0 bytes allocated at (nil)
PluginInterface --> WARNING VA mapping failed, fallback to heuristic: (Error: Memory Map Size must be larger than 0)
TARGET CUDA RTL --> Failure to alloc memory: Error in cuMemAlloc[Host|Managed]: out of memory
PluginInterface --> Allocated 14581039104 bytes at 0x7f2978000000 for replay.
PluginInterface --> Record Replay Initialized (0x7f2978000000) as starting address, 14581039104 Memory Size and set on status Recording
TARGET CUDA RTL --> The primary context is inactive, set its flags to CU_CTX_SCHED_BLOCKING_SYNC
PluginInterface --> Request 0 bytes allocated at (nil)
PluginInterface --> WARNING VA mapping failed, fallback to heuristic: (Error: Memory Map Size must be larger than 0)
TARGET CUDA RTL --> Failure to alloc memory: Error in cuMemAlloc[Host|Managed]: out of memory
PluginInterface --> Allocated 14581039104 bytes at 0x7f2606000000 for replay.
PluginInterface --> Record Replay Initialized (0x7f2606000000) as starting address, 14581039104 Memory Size and set on status Recording

```

The same run also prints:

```
omptarget --> Launching target execution __omp_offloading_821_1dc1092__ZL17CalcForceForNodesR6Domain_l1235 with pointer 0x0000555c717629f0 (index=1).
PluginInterface --> Launching kernel __omp_offloading_821_1dc1092__ZL17CalcForceForNodesR6Domain_l1235 with 931 blocks and 32 threads in SPMD mode
LLVM ERROR: Error retrieving data for target pointer
```

The run ends with only 1 of 17 kernels recorded.
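
For triage, the pattern in the debug output (a 0-byte request followed by a fallback allocation) can be pulled out of a saved log with a few lines of Python. This is only a rough sketch: the function name `summarize_rr_log` and the `SAMPLE` constant are hypothetical, and the regular expressions are derived solely from the lines quoted in this report, so another libomptarget version may word its messages differently.

```python
import re

# Patterns derived from the debug lines quoted in this report (an assumption:
# other libomptarget versions may format these messages differently).
REQUEST_RE = re.compile(r"Request (\d+) bytes allocated at")
FALLBACK_RE = re.compile(r"Allocated (\d+) bytes at (0x[0-9a-fA-F]+) for replay")


def summarize_rr_log(text):
    """Return (requested_sizes, fallback_allocations) found in a debug log."""
    requested = [int(m.group(1)) for m in REQUEST_RE.finditer(text)]
    fallbacks = [(int(m.group(1)), m.group(2)) for m in FALLBACK_RE.finditer(text)]
    return requested, fallbacks


# One block of the output above, copied verbatim.
SAMPLE = """\
PluginInterface --> Request 0 bytes allocated at (nil)
PluginInterface --> WARNING VA mapping failed, fallback to heuristic: (Error: Memory Map Size must be larger than 0)
TARGET CUDA RTL --> Failure to alloc memory: Error in cuMemAlloc[Host|Managed]: out of memory
PluginInterface --> Allocated 14581039104 bytes at 0x7f33ce000000 for replay.
"""

if __name__ == "__main__":
    requested, fallbacks = summarize_rr_log(SAMPLE)
    print("zero-byte requests:", sum(1 for size in requested if size == 0))
    for size, addr in fallbacks:
        print(f"fallback allocation: {size} bytes ({size / 2**30:.2f} GiB) at {addr}")
```

Feeding the full `LIBOMPTARGET_DEBUG=1` output to `summarize_rr_log` instead of `SAMPLE` would show whether every initialization requests 0 bytes or only some of them.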
