Re: [PR] Batched GPU dispatch and object caching for WebGPU runtime [tvm]

via GitHub Thu, 05 Mar 2026 07:25:06 -0800


gemini-code-assist[bot] commented on code in PR #18871:
URL: https://github.com/apache/tvm/pull/18871#discussion_r2890682589



##########
web/src/runtime.ts:
##########
@@ -1674,8 +1679,17 @@ export class Instance implements Disposable {
    * @returns The created shape tuple.
    */
   makeShapeTuple(shape: Array<number>): TVMObject {
+    const key = shape.toString();
+    const cached = this.shapeTupleCache.get(key);
+    if (cached !== undefined) {
+      return cached;
+    }
     const shapeArray = shape.map((value) => new Scalar(value, "int"));
-    return this.ctx.makeShapeTuple(...shapeArray);
+    const tuple = this.ctx.makeShapeTuple(...shapeArray);
+    // Detach from scope so the cached object survives across scopes.
+    this.detachFromCurrentScope(tuple);
+    this.shapeTupleCache.set(key, tuple);
+    return tuple;
   }

Review Comment:
   
![security-medium](https://www.gstatic.com/codereviewagent/security-medium-priority.svg)
 ![medium](https://www.gstatic.com/codereviewagent/medium-priority.svg)
   
   The `shapeTupleCache` is unbounded and lacks an eviction policy, which is a 
significant security and stability concern. In long-running applications, this 
could lead to unbounded memory consumption and a potential Denial of Service 
(DoS) due to memory exhaustion. Consider adding a maximum size to the cache and 
an eviction policy (like LRU or FIFO) to manage its size, making the runtime 
more robust.



##########
web/src/runtime.ts:
##########
@@ -1674,8 +1679,17 @@ export class Instance implements Disposable {
    * @returns The created shape tuple.
    */
   makeShapeTuple(shape: Array<number>): TVMObject {
+    const key = shape.toString();
+    const cached = this.shapeTupleCache.get(key);
+    if (cached !== undefined) {
+      return cached;
+    }
     const shapeArray = shape.map((value) => new Scalar(value, "int"));
-    return this.ctx.makeShapeTuple(...shapeArray);
+    const tuple = this.ctx.makeShapeTuple(...shapeArray);
+    // Detach from scope so the cached object survives across scopes.
+    this.detachFromCurrentScope(tuple);
+    this.shapeTupleCache.set(key, tuple);
+    return tuple;
   }

Review Comment:
   
![security-medium](https://www.gstatic.com/codereviewagent/security-medium-priority.svg)
 ![medium](https://www.gstatic.com/codereviewagent/medium-priority.svg)
   
   The `shapeTupleCache` can store and return disposed `TVMObject` instances. 
If a user manually disposes a shape tuple returned by `makeShapeTuple`, the 
cache continues to hold the object with a null handle. Subsequent calls for the 
same shape will return this disposed object, causing any further operations on 
it to fail with an 'Object has already been disposed' error. This breaks the 
runtime's functionality for that specific shape.



##########
web/src/webgpu.ts:
##########
@@ -903,11 +973,15 @@ export class WebGPUContext {
     const copyCommands = copyEncoder.finish();
     this.device.queue.submit([copyCommands]);
 
-    gpuTemp.mapAsync(GPUMapMode.READ).then(() => {
-      const data = gpuTemp.getMappedRange();
+    const readPromise = gpuTemp.mapAsync(GPUMapMode.READ).then(() => {
+      const data = gpuTemp.getMappedRange(0, nbytes);
       this.memory.storeRawBytes(to, new Uint8Array(data));
-      gpuTemp.destroy();
+      this.returnReadStagingBuffer(gpuTemp);
     });
+    // Chain with any existing pending read so sync() awaits all of them.
+    this.pendingRead = this.pendingRead
+      ? this.pendingRead.then(() => readPromise)
+      : readPromise;
   }

Review Comment:
   
![security-medium](https://www.gstatic.com/codereviewagent/security-medium-priority.svg)
 ![medium](https://www.gstatic.com/codereviewagent/medium-priority.svg)
   
   The implementation of `deviceCopyFromGPU` introduces a potential data race 
and improper error handling in the `pendingRead` promise chain. 
   
   1. **Data Race**: Multiple concurrent calls to `deviceCopyFromGPU` targeting 
the same Wasm memory address may result in out-of-order writes. While 
`pendingRead` ensures `sync()` awaits all copies, the `mapAsync` callbacks 
themselves are not chained for execution order, meaning a later copy could be 
overwritten by an earlier one if its `mapAsync` resolves later.
   2. **Error Handling**: If a `readPromise` in the chain is rejected (e.g., 
due to device loss), the `pendingRead` chain will become rejected. Subsequent 
calls to `deviceCopyFromGPU` will append to this rejected promise using 
`.then(() => readPromise)`, which will skip the execution of the new 
`readPromise` in the chain. Consequently, `sync()` will throw the initial error 
but may return before subsequent `mapAsync` operations have finished, leading 
to potential use-after-free or data corruption if the Wasm memory is accessed 
immediately after.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Re: [PR] Batched GPU dispatch and object caching for WebGPU runtime [tvm]

Reply via email to