@masahi Yeah, I also ran into this issue a few months ago. When an OOM happens, the exception just escapes without being caught. So I added another try/catch block and tried to fix it by calling `ReleaseAll` on OOM. The exception behavior is very weird and I was not able to debug it: the exception simply escaped and I could not catch it in GDB.
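To illustrate the kind of workaround I mean, here is a rough, self-contained sketch. It is not the actual TVM code: `ToyPool`, `OutOfMemory`, and `AllocWithRetry` are hypothetical names, and the "device" is just a byte counter with a fixed capacity.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>

// Hypothetical stand-in for a device out-of-memory error.
struct OutOfMemory : std::runtime_error {
  using std::runtime_error::runtime_error;
};

// Toy pool over a fake device with a fixed capacity (all names are made up).
// Reuse of cached buffers is left out to keep the OOM path easy to see.
class ToyPool {
 public:
  explicit ToyPool(std::size_t capacity) : capacity_(capacity) {}

  // Take `size` bytes from the device; throws when the device is full.
  void Alloc(std::size_t size) {
    assert(size > 0);
    if (in_use_ + cached_ + size > capacity_) throw OutOfMemory("device OOM");
    in_use_ += size;
  }

  // Return a buffer to the pool: it stays cached, so the device
  // does not get the memory back yet.
  void Free(std::size_t size) {
    assert(size <= in_use_);
    in_use_ -= size;
    cached_ += size;
  }

  // Give every cached byte back to the device.
  void ReleaseAll() { cached_ = 0; }

  std::size_t cached() const { return cached_; }
  std::size_t in_use() const { return in_use_; }

 private:
  std::size_t capacity_;
  std::size_t in_use_ = 0;
  std::size_t cached_ = 0;
};

// On OOM, drop the cache and retry once; a second failure propagates.
inline void AllocWithRetry(ToyPool& pool, std::size_t size) {
  try {
    pool.Alloc(size);
  } catch (const OutOfMemory&) {
    pool.ReleaseAll();
    pool.Alloc(size);
  }
}
```

With a capacity of 100, allocating and freeing 80 bytes leaves 80 bytes cached, so a following 60-byte request only succeeds after the retry has called `ReleaseAll`. This only helps if the exception actually reaches the `catch`, which is exactly the part that misbehaves.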
I am not sure whether calling `ReleaseAll` in advance would help. What about keeping a global memory state per device (although that would be a big change)? Or simply unifying all memory allocation into a single "PoolAllocator" (similar to what TensorFlow did), which would also let users control the memory limit. Or, at the very least, the memory pool should not hold on to a very large memory chunk (e.g., 1 GB).

See also:
https://github.com/apache/tvm/pull/8285
https://discuss.tvm.apache.org/t/logfatal-may-skip-some-important-errors-exceptions/10281
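To make the memory-limit idea a bit more concrete, here is a minimal sketch of a pool that refuses to cache oversized chunks and caps the total number of cached bytes. Everything in it is an assumption for illustration: `CappedPool`, `PoolLimits`, and both limit fields are invented names, and host `malloc`/`free` stand in for the device allocator.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <unordered_map>
#include <vector>

// Hypothetical limits; the field names are invented for this sketch.
struct PoolLimits {
  std::size_t max_cached_bytes;  // total bytes the pool may keep cached
  std::size_t max_chunk_bytes;   // chunks larger than this are never cached
};

class CappedPool {
 public:
  explicit CappedPool(PoolLimits limits) : limits_(limits) {}
  ~CappedPool() { ReleaseAll(); }
  CappedPool(const CappedPool&) = delete;
  CappedPool& operator=(const CappedPool&) = delete;

  // Reuse a cached chunk of exactly this size if there is one, else malloc.
  void* Alloc(std::size_t size) {
    auto it = cache_.find(size);
    if (it != cache_.end() && !it->second.empty()) {
      void* p = it->second.back();
      it->second.pop_back();
      cached_bytes_ -= size;
      return p;
    }
    return std::malloc(size);
  }

  // Cache the chunk only if it fits both limits; otherwise free it right away.
  void Free(void* p, std::size_t size) {
    assert(p != nullptr);
    if (size > limits_.max_chunk_bytes ||
        cached_bytes_ + size > limits_.max_cached_bytes) {
      std::free(p);
      return;
    }
    cache_[size].push_back(p);
    cached_bytes_ += size;
  }

  // Hand every cached chunk back to the underlying allocator.
  void ReleaseAll() {
    for (auto& entry : cache_) {
      for (void* p : entry.second) std::free(p);
    }
    cache_.clear();
    cached_bytes_ = 0;
  }

  std::size_t cached_bytes() const { return cached_bytes_; }

 private:
  PoolLimits limits_;
  std::size_t cached_bytes_ = 0;
  std::unordered_map<std::size_t, std::vector<void*>> cache_;
};
```

With `PoolLimits{1024, 512}`, a freed 600-byte chunk goes straight back to the allocator, and a chunk that would push the cache past 1024 bytes is freed instead of cached. Whether limits like these belong in the pool itself or in a per-device global state is the open design question.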