Currently, the VM `PooledAllocator` releases its memory only when the underlying 
device fails to allocate more memory: 
https://github.com/apache/tvm/blob/553778885388a9eff4d611e1022baecd75c69088/src/runtime/vm/pooled_allocator.h#L60-L65.
This causes a program crash when doing repeated inference with dynamic batch 
sizes. See https://github.com/apache/tvm/issues/8233#issuecomment-862664330 for 
a minimal repro.
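
For readers who have not looked at the linked lines, the release-on-failure strategy can be sketched roughly as follows. This is a simplified model, not the actual TVM code: `FakeDevice` and `ToyPooledAllocator` are hypothetical names, buffers are represented only by their sizes, and `std::runtime_error` stands in for the error thrown by the real device API.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <stdexcept>

// Hypothetical stand-in for a device with a fixed memory budget.
struct FakeDevice {
  explicit FakeDevice(std::size_t cap) : capacity(cap) {}

  void Alloc(std::size_t nbytes) {
    if (in_use + nbytes > capacity) throw std::runtime_error("device out of memory");
    in_use += nbytes;
  }

  void Free(std::size_t nbytes) {
    assert(nbytes <= in_use);
    in_use -= nbytes;
  }

  std::size_t capacity;
  std::size_t in_use = 0;
};

// Simplified pool: freed buffers are cached by size and are handed back to
// the device only after a device allocation has failed.
class ToyPooledAllocator {
 public:
  explicit ToyPooledAllocator(FakeDevice* dev) : dev_(dev) {}

  void Alloc(std::size_t nbytes) {
    auto it = cached_.find(nbytes);
    if (it != cached_.end() && it->second > 0) {
      --it->second;  // Reuse a cached buffer of exactly this size.
      return;
    }
    try {
      dev_->Alloc(nbytes);
    } catch (const std::runtime_error&) {
      ReleaseAll();         // Give every cached buffer back to the device,
      dev_->Alloc(nbytes);  // then retry once; a second failure propagates.
    }
  }

  // Freed buffers stay on the device and are only marked as reusable.
  void Free(std::size_t nbytes) { ++cached_[nbytes]; }

  void ReleaseAll() {
    for (const auto& entry : cached_) {
      dev_->Free(entry.first * entry.second);  // size * number of cached buffers
    }
    cached_.clear();
  }

 private:
  FakeDevice* dev_;
  std::map<std::size_t, std::size_t> cached_;  // buffer size -> cached count
};
```

With dynamic batch sizes, most requests have a size that is not in the cache yet, so cached buffers of other sizes pile up until the device is nearly full.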

There seem to be two issues with this strategy:

1.  `AllocDataSpace` can be called outside of `PooledAllocator`, by 
`NDArray::Empty(...)`: 
https://github.com/apache/tvm/blob/4d9bc9b4a3e9e8d3420efe60a52964fcd4c29c8d/src/runtime/ndarray.cc#L196-L197.
That call is not protected by try/catch, so if almost all memory is held by 
`PooledAllocator` and `NDArray::Empty` is called, the program crashes with the 
following error:
```
terminate called after throwing an instance of 'tvm::runtime::InternalError'
  what():  [19:12:54] 
/home/masa/projects/dev/tvm/src/runtime/vulkan/vulkan_stream.cc:123: 
---------------------------------------------------------------
An error occurred during the execution of TVM.
For more information, please see: https://tvm.apache.org/docs/errors.html
---------------------------------------------------------------
  Check failed: (__e == VK_SUCCESS) is false: Vulkan Error, code=-13: Unknown 
Vulkan error code
Stack trace:
  0: tvm::runtime::vulkan::VulkanStream::Synchronize()
  1: _ZN3tvm7runtime6vulkan15VulkanDeviceAPI13FreeDataSpac
  2: tvm::runtime::NDArray::Internal::DefaultDeleter(tvm::runtime::Object*)
  3: tvm::runtime::NDArray::CopyTo(DLDevice const&) const
  4: tvm::runtime::vm::CopyTo(tvm::runtime::ObjectRef, DLDevice const&)
  5: std::_Function_handler<void (tvm::runtime::TVMArgs, 
tvm::runtime::TVMRetValue*), 
tvm::runtime::vm::VirtualMachine::GetFunction(std::__cxx11::basic_string<char, 
std::char_traits<char>, std::allocator<char> > const&, 
tvm::runtime::ObjectPtr<tvm::runtime::Object> 
const&)::$_6>::_M_invoke(std::_Any_data const&, tvm::runtime::TVMArgs&&, 
tvm::runtime::TVMRetValue*&&)
  6: TVMFuncCall
```
2. Even if I fix the above problem by making sure that all allocations go 
through `PooledAllocator`, my program still crashes due to excessive allocation 
of host memory (I haven't looked into why so much host memory is allocated when 
I'm running on a GPU target). Also, if I use the CPU target, the program is 
simply killed upon reaching the memory limit, before `try/catch` gets a chance 
to catch the memory allocation failure.
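
One way to picture a fix for the first issue is to wrap any allocation that bypasses the pool in the same "release and retry" guard. The sketch below is only an illustration under assumptions: `AllocWithPoolFallback` and `DemoFallback` are hypothetical names, the callables stand in for the raw device allocation and for `ReleaseAll()`, and `std::runtime_error` stands in for the real error type.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>

// Hypothetical helper: run alloc(); if it throws, release the pool once and
// retry. A failure on the second attempt propagates to the caller.
template <typename AllocFn, typename ReleaseFn>
auto AllocWithPoolFallback(AllocFn alloc, ReleaseFn release_pool) -> decltype(alloc()) {
  try {
    return alloc();
  } catch (const std::runtime_error&) {
    release_pool();
    return alloc();
  }
}

// Toy scenario: 90 of 100 bytes are cached by the pool when an allocation of
// 20 bytes arrives from outside the pool. Returns the bytes in use afterwards.
inline std::size_t DemoFallback() {
  const std::size_t capacity = 100;
  std::size_t cached = 90;
  std::size_t in_use = cached;

  auto raw_alloc = [&]() -> std::size_t {
    if (in_use + 20 > capacity) throw std::runtime_error("device out of memory");
    in_use += 20;
    return in_use;
  };
  auto release_pool = [&]() {
    assert(cached <= in_use);
    in_use -= cached;  // Cached buffers go back to the device.
    cached = 0;
  };

  return AllocWithPoolFallback(raw_alloc, release_pool);
}
```

As the second issue points out, a guard like this does not help when the process is killed by the OS before any exception is thrown.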
 
So I think we need a better way to decide when to call `ReleaseAll()` early if 
necessary. Should we add a device API to query the maximum available memory and 
call `ReleaseAll()` when usage reaches, say, 90%? This wouldn't work if other 
memory-hungry processes are running, though...
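
To make the proposal concrete, here is a minimal sketch of the bookkeeping such a threshold check would need. Everything in it is an assumption for illustration: `ThresholdPool` is a hypothetical class, `total_bytes` would come from the proposed (not yet existing) device query, and every `Alloc` is treated as a cache miss, as happens when each request has a new size.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical bookkeeping for a pool that releases cached memory before the
// device is exhausted, instead of waiting for an allocation failure.
class ThresholdPool {
 public:
  // total_bytes: device memory as reported by the assumed query API.
  // release_percent: footprint (in percent of total) that triggers a release.
  ThresholdPool(std::size_t total_bytes, std::size_t release_percent)
      : limit_(total_bytes / 100 * release_percent) {}

  // Record an allocation that missed the cache. Returns true if the cached
  // buffers were released first to stay under the limit.
  bool Alloc(std::size_t nbytes) {
    bool released = false;
    if (used_ + cached_ + nbytes > limit_ && cached_ > 0) {
      ReleaseAll();
      released = true;
    }
    used_ += nbytes;
    return released;
  }

  // A freed buffer stays on the device as cached memory.
  void Free(std::size_t nbytes) {
    assert(nbytes <= used_);
    used_ -= nbytes;
    cached_ += nbytes;
  }

  void ReleaseAll() { cached_ = 0; }

  // Bytes currently held on the device by this pool (live + cached).
  std::size_t footprint() const { return used_ + cached_; }

 private:
  std::size_t limit_;
  std::size_t used_ = 0;
  std::size_t cached_ = 0;
};
```

Because the limit is derived from the total device memory, the check only sees this pool's own footprint. Handling other processes would need a query for the memory that is free right now rather than a static total.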

cc @ganler @comaniac @yuchenj @trevor-m for thoughts.




