Hi all,

While experimenting with different TensorRT versions, I noticed that
compatibility is closely tied to CUDA releases (e.g., TensorRT 8.x → CUDA
11.x, TensorRT 10.0.1 → CUDA 12.x, TensorRT 10.13 → CUDA 13.x).

I’m looking for feedback on design direction:

   - Should we maintain separate handlers for different TensorRT
   versions, or evolve the current handler to target only the latest
   TensorRT (10.x)?

   - In the existing code, load_onnx only parses the ONNX model into an
   engine, but the result isn't used downstream. In my prototype, I added
   _load_onnx_build_engine, which builds an engine directly from ONNX and
   then runs inference (a rough sketch follows below). Should this live
   in the same handler, or be split into an ONNX-specific handler
   separate from the TensorRT one?
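
For context, here is a minimal sketch of what my _load_onnx_build_engine
prototype does, assuming the TensorRT 10.x Python API. The function name
comes from my branch; everything else (paths, workspace size) is
illustrative, not the final implementation:

import tensorrt as trt

def _load_onnx_build_engine(onnx_path: str) -> trt.ICudaEngine:
    # Parse the ONNX model and build a TensorRT engine in one step.
    logger = trt.Logger(trt.Logger.WARNING)
    builder = trt.Builder(logger)
    # Flags=0: explicit batch is the only mode in TensorRT 10.x.
    network = builder.create_network(0)
    parser = trt.OnnxParser(network, logger)

    with open(onnx_path, 'rb') as f:
        if not parser.parse(f.read()):
            errors = [str(parser.get_error(i))
                      for i in range(parser.num_errors)]
            raise RuntimeError('ONNX parse failed: %s' % errors)

    config = builder.create_builder_config()
    # Cap the builder workspace at 1 GiB; tune for the target GPU.
    config.set_memory_pool_limit(trt.MemoryPoolType.WORKSPACE, 1 << 30)

    # TensorRT 10.x builds a serialized plan; deserialize it into an
    # engine that the handler can run inference with.
    plan = builder.build_serialized_network(network, config)
    return trt.Runtime(logger).deserialize_cuda_engine(plan)

The handler would then create an execution context from the returned
engine and run inference the same way the existing engine-file path does.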

This is my first open source contribution, so I’d greatly appreciate any
guidance on what would make sense long term for Beam.
