mitiskuma opened a new pull request, #18871: URL: https://github.com/apache/tvm/pull/18871
## Summary

- Batch compute dispatches into a single GPUCommandEncoder, flushing on sync/readback instead of submitting per dispatch, to reduce JS↔GPU transition overhead during LLM decode
- Cache uniform buffers (FIFO, 512 entries), bind groups (FIFO, 256 entries), and shape tuples, and pool MAP_READ staging buffers to eliminate redundant GPU object creation
- Fix a padding self-assignment bug in `deviceCopyToGPU`
