This is an automated email from the ASF dual-hosted git repository.

tqchen pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/tvm-site.git


The following commit(s) were added to refs/heads/asf-site by this push:
     new a04418d308b deploying docs 
(apache/tvm@11ac2ad97c84fdc6ca14690a6221e9f3d1631955)
a04418d308b is described below

commit a04418d308b0f40e35103c564ebad51a8a77758e
Author: tvm-bot <[email protected]>
AuthorDate: Tue Apr 28 01:35:34 2026 +0000

    deploying docs (apache/tvm@11ac2ad97c84fdc6ca14690a6221e9f3d1631955)
---
 .../import_model.zip                               | Bin 35138 -> 35138 bytes
 .../dlight_gpu_scheduling.zip                      | Bin 26311 -> 26311 bytes
 .../11c11e53c7dace51a8be968ee169ed0d/ir_module.zip | Bin 23790 -> 23790 bytes
 .../tir_transformation.zip                         | Bin 15899 -> 15899 bytes
 .../meta_schedule.zip                              | Bin 24239 -> 24239 bytes
 .../mix_python_and_tvm_with_pymodule.zip           | Bin 39005 -> 39005 bytes
 .../relax_creation.zip                             | Bin 22420 -> 22420 bytes
 .../relax_transformation.zip                       | Bin 11480 -> 11480 bytes
 .../optimize_llm.zip                               | Bin 54007 -> 54007 bytes
 .../bring_your_own_codegen.zip                     | Bin 18423 -> 18423 bytes
 .../e2e_opt_model.zip                              | Bin 14501 -> 14501 bytes
 .../quick_start.zip                                | Bin 16252 -> 16252 bytes
 .../export_and_load_executable.zip                 | Bin 31526 -> 31526 bytes
 .../tir_creation.zip                               | Bin 24415 -> 24415 bytes
 .../cross_compilation_and_rpc.zip                  | Bin 49406 -> 49406 bytes
 .../customize_opt.zip                              | Bin 20544 -> 20544 bytes
 .../relax/tutorials/sg_execution_times.rst.txt     |   6 ++--
 .../tensor_ir/tutorials/sg_execution_times.rst.txt |   8 ++---
 .../tensor_ir/tutorials/tir_creation.rst.txt       |  20 ++++++------
 .../tensor_ir/tutorials/tir_transformation.rst.txt |   6 ++--
 .../get_started/tutorials/ir_module.rst.txt        |   8 ++---
 .../get_started/tutorials/quick_start.rst.txt      |   4 +--
 .../tutorials/sg_execution_times.rst.txt           |   6 ++--
 .../tutorials/cross_compilation_and_rpc.rst.txt    |   6 ++--
 .../how_to/tutorials/customize_opt.rst.txt         |   4 +--
 .../how_to/tutorials/e2e_opt_model.rst.txt         |   2 +-
 .../how_to/tutorials/sg_execution_times.rst.txt    |  16 +++++-----
 docs/_sources/sg_execution_times.rst.txt           |  34 ++++++++++-----------
 .../relax/tutorials/sg_execution_times.html        |   6 ++--
 .../tensor_ir/tutorials/sg_execution_times.html    |   8 ++---
 .../tensor_ir/tutorials/tir_creation.html          |  20 ++++++------
 .../tensor_ir/tutorials/tir_transformation.html    |   6 ++--
 docs/get_started/tutorials/ir_module.html          |  16 +++++-----
 docs/get_started/tutorials/quick_start.html        |  24 +++++++--------
 docs/get_started/tutorials/sg_execution_times.html |   6 ++--
 docs/how_to/tutorials/bring_your_own_codegen.html  |   4 +--
 .../tutorials/cross_compilation_and_rpc.html       |   6 ++--
 docs/how_to/tutorials/customize_opt.html           |   8 ++---
 docs/how_to/tutorials/e2e_opt_model.html           |   6 ++--
 .../tutorials/export_and_load_executable.html      |   8 ++---
 docs/how_to/tutorials/import_model.html            |   4 +--
 docs/how_to/tutorials/optimize_llm.html            |  10 +++---
 docs/how_to/tutorials/sg_execution_times.html      |  16 +++++-----
 docs/objects.inv                                   | Bin 22797 -> 22791 bytes
 docs/reference/api/python/relax/training.html      |   2 +-
 docs/reference/api/python/runtime/vm.html          |   2 +-
 docs/searchindex.js                                |   2 +-
 docs/sg_execution_times.html                       |  34 ++++++++++-----------
 48 files changed, 154 insertions(+), 154 deletions(-)

diff --git a/docs/_downloads/050dda7aeddba33e083ae36606d3270f/import_model.zip 
b/docs/_downloads/050dda7aeddba33e083ae36606d3270f/import_model.zip
index 1fff4358494..8688f223fd1 100644
Binary files 
a/docs/_downloads/050dda7aeddba33e083ae36606d3270f/import_model.zip and 
b/docs/_downloads/050dda7aeddba33e083ae36606d3270f/import_model.zip differ
diff --git 
a/docs/_downloads/0eec6b3ea6ba863d0ed14b037c4b83ea/dlight_gpu_scheduling.zip 
b/docs/_downloads/0eec6b3ea6ba863d0ed14b037c4b83ea/dlight_gpu_scheduling.zip
index 37226cc1d80..1f5eaee439a 100644
Binary files 
a/docs/_downloads/0eec6b3ea6ba863d0ed14b037c4b83ea/dlight_gpu_scheduling.zip 
and 
b/docs/_downloads/0eec6b3ea6ba863d0ed14b037c4b83ea/dlight_gpu_scheduling.zip 
differ
diff --git a/docs/_downloads/11c11e53c7dace51a8be968ee169ed0d/ir_module.zip 
b/docs/_downloads/11c11e53c7dace51a8be968ee169ed0d/ir_module.zip
index e45b4f57b8d..b14f5948f3f 100644
Binary files a/docs/_downloads/11c11e53c7dace51a8be968ee169ed0d/ir_module.zip 
and b/docs/_downloads/11c11e53c7dace51a8be968ee169ed0d/ir_module.zip differ
diff --git 
a/docs/_downloads/18ba0d2ee8120824175aaef66bc9c9bf/tir_transformation.zip 
b/docs/_downloads/18ba0d2ee8120824175aaef66bc9c9bf/tir_transformation.zip
index 46f4c6cea75..b1656f70387 100644
Binary files 
a/docs/_downloads/18ba0d2ee8120824175aaef66bc9c9bf/tir_transformation.zip and 
b/docs/_downloads/18ba0d2ee8120824175aaef66bc9c9bf/tir_transformation.zip differ
diff --git a/docs/_downloads/249a4acaa182fd00b9c381b53025777d/meta_schedule.zip 
b/docs/_downloads/249a4acaa182fd00b9c381b53025777d/meta_schedule.zip
index e9f455c2afc..a5e7340c725 100644
Binary files 
a/docs/_downloads/249a4acaa182fd00b9c381b53025777d/meta_schedule.zip and 
b/docs/_downloads/249a4acaa182fd00b9c381b53025777d/meta_schedule.zip differ
diff --git 
a/docs/_downloads/373278b9f6fc686adeaa0219598fe78b/mix_python_and_tvm_with_pymodule.zip
 
b/docs/_downloads/373278b9f6fc686adeaa0219598fe78b/mix_python_and_tvm_with_pymodule.zip
index fbdad4289a2..076ae8e14ad 100644
Binary files 
a/docs/_downloads/373278b9f6fc686adeaa0219598fe78b/mix_python_and_tvm_with_pymodule.zip
 and 
b/docs/_downloads/373278b9f6fc686adeaa0219598fe78b/mix_python_and_tvm_with_pymodule.zip
 differ
diff --git 
a/docs/_downloads/4753776bbe68e7c9ee4d19117973fc8b/relax_creation.zip 
b/docs/_downloads/4753776bbe68e7c9ee4d19117973fc8b/relax_creation.zip
index fc9573d28e6..11abde1fb0f 100644
Binary files 
a/docs/_downloads/4753776bbe68e7c9ee4d19117973fc8b/relax_creation.zip and 
b/docs/_downloads/4753776bbe68e7c9ee4d19117973fc8b/relax_creation.zip differ
diff --git 
a/docs/_downloads/7d201684dfa095a5ea48d98e9a2ef7ad/relax_transformation.zip 
b/docs/_downloads/7d201684dfa095a5ea48d98e9a2ef7ad/relax_transformation.zip
index 3da10618f2f..a87f426ca1e 100644
Binary files 
a/docs/_downloads/7d201684dfa095a5ea48d98e9a2ef7ad/relax_transformation.zip and 
b/docs/_downloads/7d201684dfa095a5ea48d98e9a2ef7ad/relax_transformation.zip 
differ
diff --git a/docs/_downloads/83e85f38cf16f1d926d06615fd54095c/optimize_llm.zip 
b/docs/_downloads/83e85f38cf16f1d926d06615fd54095c/optimize_llm.zip
index 470e86034c0..5f92d4cc7a0 100644
Binary files 
a/docs/_downloads/83e85f38cf16f1d926d06615fd54095c/optimize_llm.zip and 
b/docs/_downloads/83e85f38cf16f1d926d06615fd54095c/optimize_llm.zip differ
diff --git 
a/docs/_downloads/8c05f1d580979a2c26428a509c07ed72/bring_your_own_codegen.zip 
b/docs/_downloads/8c05f1d580979a2c26428a509c07ed72/bring_your_own_codegen.zip
index c586e80af09..40efca60c2e 100644
Binary files 
a/docs/_downloads/8c05f1d580979a2c26428a509c07ed72/bring_your_own_codegen.zip 
and 
b/docs/_downloads/8c05f1d580979a2c26428a509c07ed72/bring_your_own_codegen.zip 
differ
diff --git a/docs/_downloads/a7dd7652b2ad50f82d7b739ce3645799/e2e_opt_model.zip 
b/docs/_downloads/a7dd7652b2ad50f82d7b739ce3645799/e2e_opt_model.zip
index 896dec546f7..30a0ac429a1 100644
Binary files 
a/docs/_downloads/a7dd7652b2ad50f82d7b739ce3645799/e2e_opt_model.zip and 
b/docs/_downloads/a7dd7652b2ad50f82d7b739ce3645799/e2e_opt_model.zip differ
diff --git a/docs/_downloads/bb7db6678496193ed0c55d3b95fa6778/quick_start.zip 
b/docs/_downloads/bb7db6678496193ed0c55d3b95fa6778/quick_start.zip
index 2ca587f6c22..b19d353e09c 100644
Binary files a/docs/_downloads/bb7db6678496193ed0c55d3b95fa6778/quick_start.zip 
and b/docs/_downloads/bb7db6678496193ed0c55d3b95fa6778/quick_start.zip differ
diff --git 
a/docs/_downloads/bc875d02d5382abc9ea5fb9eb2c1de2c/export_and_load_executable.zip
 
b/docs/_downloads/bc875d02d5382abc9ea5fb9eb2c1de2c/export_and_load_executable.zip
index 84c077b5989..9a97e718f49 100644
Binary files 
a/docs/_downloads/bc875d02d5382abc9ea5fb9eb2c1de2c/export_and_load_executable.zip
 and 
b/docs/_downloads/bc875d02d5382abc9ea5fb9eb2c1de2c/export_and_load_executable.zip
 differ
diff --git a/docs/_downloads/be26483bb70b8468499a01c55e8e866c/tir_creation.zip 
b/docs/_downloads/be26483bb70b8468499a01c55e8e866c/tir_creation.zip
index 2f55b43449f..90b0a04ed13 100644
Binary files 
a/docs/_downloads/be26483bb70b8468499a01c55e8e866c/tir_creation.zip and 
b/docs/_downloads/be26483bb70b8468499a01c55e8e866c/tir_creation.zip differ
diff --git 
a/docs/_downloads/f69380821f417ef2210f45503d81bded/cross_compilation_and_rpc.zip
 
b/docs/_downloads/f69380821f417ef2210f45503d81bded/cross_compilation_and_rpc.zip
index 744173ba84f..cf2ed4feaed 100644
Binary files 
a/docs/_downloads/f69380821f417ef2210f45503d81bded/cross_compilation_and_rpc.zip
 and 
b/docs/_downloads/f69380821f417ef2210f45503d81bded/cross_compilation_and_rpc.zip
 differ
diff --git a/docs/_downloads/f69433a4a80715725df90d1386679956/customize_opt.zip 
b/docs/_downloads/f69433a4a80715725df90d1386679956/customize_opt.zip
index 84429823f41..505da3ecdad 100644
Binary files 
a/docs/_downloads/f69433a4a80715725df90d1386679956/customize_opt.zip and 
b/docs/_downloads/f69433a4a80715725df90d1386679956/customize_opt.zip differ
diff --git a/docs/_sources/deep_dive/relax/tutorials/sg_execution_times.rst.txt 
b/docs/_sources/deep_dive/relax/tutorials/sg_execution_times.rst.txt
index b7aef0c7833..28a81413be0 100644
--- a/docs/_sources/deep_dive/relax/tutorials/sg_execution_times.rst.txt
+++ b/docs/_sources/deep_dive/relax/tutorials/sg_execution_times.rst.txt
@@ -6,7 +6,7 @@
 
 Computation times
 =================
-**00:00.168** total execution time for 2 files **from 
deep_dive/relax/tutorials**:
+**00:00.173** total execution time for 2 files **from 
deep_dive/relax/tutorials**:
 
 .. container::
 
@@ -33,8 +33,8 @@ Computation times
      - Time
      - Mem (MB)
    * - :ref:`sphx_glr_deep_dive_relax_tutorials_relax_creation.py` 
(``relax_creation.py``)
-     - 00:00.115
+     - 00:00.118
      - 0.0
    * - :ref:`sphx_glr_deep_dive_relax_tutorials_relax_transformation.py` 
(``relax_transformation.py``)
-     - 00:00.053
+     - 00:00.055
      - 0.0
diff --git 
a/docs/_sources/deep_dive/tensor_ir/tutorials/sg_execution_times.rst.txt 
b/docs/_sources/deep_dive/tensor_ir/tutorials/sg_execution_times.rst.txt
index 261190f53c4..572e602ed74 100644
--- a/docs/_sources/deep_dive/tensor_ir/tutorials/sg_execution_times.rst.txt
+++ b/docs/_sources/deep_dive/tensor_ir/tutorials/sg_execution_times.rst.txt
@@ -6,7 +6,7 @@
 
 Computation times
 =================
-**00:00.574** total execution time for 4 files **from 
deep_dive/tensor_ir/tutorials**:
+**00:00.583** total execution time for 4 files **from 
deep_dive/tensor_ir/tutorials**:
 
 .. container::
 
@@ -33,13 +33,13 @@ Computation times
      - Time
      - Mem (MB)
    * - :ref:`sphx_glr_deep_dive_tensor_ir_tutorials_tir_transformation.py` 
(``tir_transformation.py``)
-     - 00:00.269
+     - 00:00.272
      - 0.0
    * - :ref:`sphx_glr_deep_dive_tensor_ir_tutorials_tir_creation.py` 
(``tir_creation.py``)
-     - 00:00.182
+     - 00:00.186
      - 0.0
    * - :ref:`sphx_glr_deep_dive_tensor_ir_tutorials_dlight_gpu_scheduling.py` 
(``dlight_gpu_scheduling.py``)
-     - 00:00.116
+     - 00:00.117
      - 0.0
    * - :ref:`sphx_glr_deep_dive_tensor_ir_tutorials_meta_schedule.py` 
(``meta_schedule.py``)
      - 00:00.007
diff --git a/docs/_sources/deep_dive/tensor_ir/tutorials/tir_creation.rst.txt 
b/docs/_sources/deep_dive/tensor_ir/tutorials/tir_creation.rst.txt
index 0ad66555bfd..372935028cb 100644
--- a/docs/_sources/deep_dive/tensor_ir/tutorials/tir_creation.rst.txt
+++ b/docs/_sources/deep_dive/tensor_ir/tutorials/tir_creation.rst.txt
@@ -319,17 +319,17 @@ Now let's check the runtime dynamic shape inference:
 
  .. code-block:: none
 
-    [[1.8622421  0.67286646 1.282362   1.0511451 ]
-     [2.0764782  0.5306615  1.3665953  1.4980947 ]
-     [1.3380854  0.2942735  0.87282985 0.9297526 ]
-     [1.9473305  0.71752125 1.2324657  1.3434483 ]]
-    [[30.725405 27.933485 33.42631  ... 28.76254  32.434967 25.542963]
-     [28.348495 25.882704 31.631708 ... 30.102102 32.523552 23.54295 ]
-     [31.881264 32.651657 37.273735 ... 31.474817 35.480404 28.237606]
+    [[0.8605255  0.5471485  0.8039893  1.0729661 ]
+     [1.1034956  1.2353773  0.8464414  1.6922977 ]
+     [0.4042995  1.1864915  0.8314758  1.1582114 ]
+     [1.5709348  0.53600913 1.079339   1.77458   ]]
+    [[31.28486  32.598347 34.754593 ... 34.229595 34.81411  33.876446]
+     [28.932312 32.171654 32.346653 ... 29.917728 31.682928 33.08734 ]
+     [28.484785 30.425142 32.392727 ... 31.919804 32.438267 30.258091]
      ...
-     [27.833437 28.467497 31.901064 ... 28.129526 31.220873 25.711681]
-     [32.942955 32.215992 36.025925 ... 32.050587 36.881718 29.219858]
-     [28.208681 25.811222 31.636063 ... 26.905252 29.956907 23.83386 ]]
+     [30.73488  31.195097 33.93179  ... 31.291265 32.389244 34.30623 ]
+     [26.5504   29.39458  29.45073  ... 28.97972  31.037825 27.498674]
+     [28.01241  28.874033 29.861454 ... 30.790987 30.821604 29.927006]]
 
 
 
diff --git 
a/docs/_sources/deep_dive/tensor_ir/tutorials/tir_transformation.rst.txt 
b/docs/_sources/deep_dive/tensor_ir/tutorials/tir_transformation.rst.txt
index dd472f5c2e2..60542244c58 100644
--- a/docs/_sources/deep_dive/tensor_ir/tutorials/tir_transformation.rst.txt
+++ b/docs/_sources/deep_dive/tensor_ir/tutorials/tir_transformation.rst.txt
@@ -120,7 +120,7 @@ original implementation.
 
     Execution time summary:
      mean (ms)   median (ms)    max (ms)     min (ms)     std (ms)  
-       2.5176       2.5176       2.5176       2.5176       0.0000              
    
+       2.5877       2.5877       2.5877       2.5877       0.0000              
    
 
 
 
@@ -292,7 +292,7 @@ action involves reordering these two loops.
 
     Execution time summary:
      mean (ms)   median (ms)    max (ms)     min (ms)     std (ms)  
-       0.8615       0.8615       0.8615       0.8615       0.0000              
    
+       0.8656       0.8656       0.8656       0.8656       0.0000              
    
 
 
 
@@ -420,7 +420,7 @@ from the reduction update via the **decompose_reduction** 
primitive.
 
     Execution time summary:
      mean (ms)   median (ms)    max (ms)     min (ms)     std (ms)  
-       0.3430       0.3430       0.3430       0.3430       0.0000              
    
+       0.3503       0.3503       0.3503       0.3503       0.0000              
    
 
 
 
diff --git a/docs/_sources/get_started/tutorials/ir_module.rst.txt 
b/docs/_sources/get_started/tutorials/ir_module.rst.txt
index 77f74a4cd2b..170c0738de3 100644
--- a/docs/_sources/get_started/tutorials/ir_module.rst.txt
+++ b/docs/_sources/get_started/tutorials/ir_module.rst.txt
@@ -694,8 +694,8 @@ We can deploy the IRModule on CPU by specifying the target 
as ``llvm``.
 
  .. code-block:: none
 
-    [[ 0.08104634  0.00242352  0.1049604  -0.08137349 -0.1480625  -0.08869022
-      -0.09602349  0.00385824 -0.15466967 -0.10502834]]
+    [[ 0.07247547  0.03547548 -0.10874876 -0.06546346  0.12311915 -0.06342966
+       0.09690153  0.11260182 -0.07364957  0.16807812]]
 
 
 
@@ -761,8 +761,8 @@ Now we can compile the IRModule on GPU, the similar way as 
we did on CPU.
 
  .. code-block:: none
 
-    [[ 0.08104636  0.00242357  0.10496042 -0.0813735  -0.14806242 -0.08869025
-      -0.09602351  0.00385825 -0.15466967 -0.1050284 ]]
+    [[ 0.07247549  0.03547547 -0.10874879 -0.06546341  0.12311919 -0.06342966
+       0.09690155  0.11260177 -0.07364966  0.16807815]]
 
 
 
diff --git a/docs/_sources/get_started/tutorials/quick_start.rst.txt 
b/docs/_sources/get_started/tutorials/quick_start.rst.txt
index 984e5a64c34..ecca53764bb 100644
--- a/docs/_sources/get_started/tutorials/quick_start.rst.txt
+++ b/docs/_sources/get_started/tutorials/quick_start.rst.txt
@@ -224,8 +224,8 @@ different devices.
 
  .. code-block:: none
 
-    [[23730.143 27612.406 26144.84  25774.523 26365.133 24124.285 26677.041
-      25740.312 24821.139 24888.467]]
+    [[25344.477 25751.967 25929.48  26905.932 26182.744 24975.156 25076.4
+      26959.822 25762.361 25306.086]]
 
 
 
diff --git a/docs/_sources/get_started/tutorials/sg_execution_times.rst.txt 
b/docs/_sources/get_started/tutorials/sg_execution_times.rst.txt
index 6df675b85a8..b9fddeb1459 100644
--- a/docs/_sources/get_started/tutorials/sg_execution_times.rst.txt
+++ b/docs/_sources/get_started/tutorials/sg_execution_times.rst.txt
@@ -6,7 +6,7 @@
 
 Computation times
 =================
-**00:06.920** total execution time for 2 files **from get_started/tutorials**:
+**00:07.593** total execution time for 2 files **from get_started/tutorials**:
 
 .. container::
 
@@ -33,8 +33,8 @@ Computation times
      - Time
      - Mem (MB)
    * - :ref:`sphx_glr_get_started_tutorials_ir_module.py` (``ir_module.py``)
-     - 00:06.762
+     - 00:07.439
      - 0.0
    * - :ref:`sphx_glr_get_started_tutorials_quick_start.py` 
(``quick_start.py``)
-     - 00:00.158
+     - 00:00.154
      - 0.0
diff --git a/docs/_sources/how_to/tutorials/cross_compilation_and_rpc.rst.txt 
b/docs/_sources/how_to/tutorials/cross_compilation_and_rpc.rst.txt
index 01921747715..64d8ef27da3 100644
--- a/docs/_sources/how_to/tutorials/cross_compilation_and_rpc.rst.txt
+++ b/docs/_sources/how_to/tutorials/cross_compilation_and_rpc.rst.txt
@@ -269,7 +269,7 @@ device and returns the measured cost. Network overhead is 
excluded.
 
  .. code-block:: none
 
-    9.1e-08 secs/op
+    1.08e-07 secs/op
 
 
 
@@ -671,8 +671,8 @@ This workflow is applicable to various deployment scenarios:
     Converted PyTorch model to Relax:
       - Number of parameters: 4
     Using local target for demonstration
-    Exported library to: /tmp/tmpms6e2cvm/model_deployed.so
-    Saved parameters to: /tmp/tmpms6e2cvm/model_params.npz
+    Exported library to: /tmp/tmpgixoiye1/model_deployed.so
+    Saved parameters to: /tmp/tmpgixoiye1/model_params.npz
 
     RPC workflow (works for any remote device):
     ==================================================
diff --git a/docs/_sources/how_to/tutorials/customize_opt.rst.txt 
b/docs/_sources/how_to/tutorials/customize_opt.rst.txt
index fa7fe029bc7..ffa45814c9c 100644
--- a/docs/_sources/how_to/tutorials/customize_opt.rst.txt
+++ b/docs/_sources/how_to/tutorials/customize_opt.rst.txt
@@ -425,8 +425,8 @@ We can build and deploy the optimized model to the TVM 
runtime.
 
  .. code-block:: none
 
-    [[25077.996 26146.457 26063.613 23740.695 24852.195 25097.496 23488.072
-      23641.996 26411.695 24865.14 ]]
+    [[26192.203 26975.654 25755.906 25443.984 27058.223 26124.715 26480.742
+      24814.836 26310.633 25108.781]]
 
 
 
diff --git a/docs/_sources/how_to/tutorials/e2e_opt_model.rst.txt 
b/docs/_sources/how_to/tutorials/e2e_opt_model.rst.txt
index da12d907035..7189b3bee2f 100644
--- a/docs/_sources/how_to/tutorials/e2e_opt_model.rst.txt
+++ b/docs/_sources/how_to/tutorials/e2e_opt_model.rst.txt
@@ -54,7 +54,7 @@ PyTorch.
  .. code-block:: none
 
    Downloading: "https://download.pytorch.org/models/resnet18-f37072fd.pth" 
to /workspace/.cache/torch/hub/checkpoints/resnet18-f37072fd.pth
-       0%|          | 0.00/44.7M [00:00<?, ?B/s]      69%|██████▉   | 
31.0M/44.7M [00:00<00:00, 324MB/s]     100%|██████████| 44.7M/44.7M 
[00:00<00:00, 323MB/s]
+       0%|          | 0.00/44.7M [00:00<?, ?B/s]      60%|█████▉    | 
26.6M/44.7M [00:00<00:00, 279MB/s]     100%|██████████| 44.7M/44.7M 
[00:00<00:00, 313MB/s]
 
 
 
diff --git a/docs/_sources/how_to/tutorials/sg_execution_times.rst.txt 
b/docs/_sources/how_to/tutorials/sg_execution_times.rst.txt
index 99502be1ff1..c41125c9a11 100644
--- a/docs/_sources/how_to/tutorials/sg_execution_times.rst.txt
+++ b/docs/_sources/how_to/tutorials/sg_execution_times.rst.txt
@@ -6,7 +6,7 @@
 
 Computation times
 =================
-**00:15.095** total execution time for 8 files **from how_to/tutorials**:
+**00:15.246** total execution time for 8 files **from how_to/tutorials**:
 
 .. container::
 
@@ -33,25 +33,25 @@ Computation times
      - Time
      - Mem (MB)
    * - :ref:`sphx_glr_how_to_tutorials_optimize_llm.py` (``optimize_llm.py``)
-     - 00:10.057
+     - 00:10.085
      - 0.0
    * - :ref:`sphx_glr_how_to_tutorials_import_model.py` (``import_model.py``)
-     - 00:03.313
+     - 00:03.352
      - 0.0
    * - :ref:`sphx_glr_how_to_tutorials_e2e_opt_model.py` (``e2e_opt_model.py``)
-     - 00:00.570
+     - 00:00.603
      - 0.0
    * - :ref:`sphx_glr_how_to_tutorials_customize_opt.py` (``customize_opt.py``)
-     - 00:00.536
+     - 00:00.585
      - 0.0
    * - :ref:`sphx_glr_how_to_tutorials_cross_compilation_and_rpc.py` 
(``cross_compilation_and_rpc.py``)
-     - 00:00.475
+     - 00:00.474
      - 0.0
    * - :ref:`sphx_glr_how_to_tutorials_bring_your_own_codegen.py` 
(``bring_your_own_codegen.py``)
-     - 00:00.138
+     - 00:00.139
      - 0.0
    * - :ref:`sphx_glr_how_to_tutorials_mix_python_and_tvm_with_pymodule.py` 
(``mix_python_and_tvm_with_pymodule.py``)
-     - 00:00.004
+     - 00:00.005
      - 0.0
    * - :ref:`sphx_glr_how_to_tutorials_export_and_load_executable.py` 
(``export_and_load_executable.py``)
      - 00:00.002
diff --git a/docs/_sources/sg_execution_times.rst.txt 
b/docs/_sources/sg_execution_times.rst.txt
index 662d7a9dc46..501c0c19254 100644
--- a/docs/_sources/sg_execution_times.rst.txt
+++ b/docs/_sources/sg_execution_times.rst.txt
@@ -6,7 +6,7 @@
 
 Computation times
 =================
-**00:22.757** total execution time for 16 files **from all galleries**:
+**00:23.594** total execution time for 16 files **from all galleries**:
 
 .. container::
 
@@ -33,49 +33,49 @@ Computation times
      - Time
      - Mem (MB)
    * - :ref:`sphx_glr_how_to_tutorials_optimize_llm.py` 
(``../how_to/tutorials/optimize_llm.py``)
-     - 00:10.057
+     - 00:10.085
      - 0.0
    * - :ref:`sphx_glr_get_started_tutorials_ir_module.py` 
(``../get_started/tutorials/ir_module.py``)
-     - 00:06.762
+     - 00:07.439
      - 0.0
    * - :ref:`sphx_glr_how_to_tutorials_import_model.py` 
(``../how_to/tutorials/import_model.py``)
-     - 00:03.313
+     - 00:03.352
      - 0.0
    * - :ref:`sphx_glr_how_to_tutorials_e2e_opt_model.py` 
(``../how_to/tutorials/e2e_opt_model.py``)
-     - 00:00.570
+     - 00:00.603
      - 0.0
    * - :ref:`sphx_glr_how_to_tutorials_customize_opt.py` 
(``../how_to/tutorials/customize_opt.py``)
-     - 00:00.536
+     - 00:00.585
      - 0.0
    * - :ref:`sphx_glr_how_to_tutorials_cross_compilation_and_rpc.py` 
(``../how_to/tutorials/cross_compilation_and_rpc.py``)
-     - 00:00.475
+     - 00:00.474
      - 0.0
    * - :ref:`sphx_glr_deep_dive_tensor_ir_tutorials_tir_transformation.py` 
(``../deep_dive/tensor_ir/tutorials/tir_transformation.py``)
-     - 00:00.269
+     - 00:00.272
      - 0.0
    * - :ref:`sphx_glr_deep_dive_tensor_ir_tutorials_tir_creation.py` 
(``../deep_dive/tensor_ir/tutorials/tir_creation.py``)
-     - 00:00.182
+     - 00:00.186
      - 0.0
    * - :ref:`sphx_glr_get_started_tutorials_quick_start.py` 
(``../get_started/tutorials/quick_start.py``)
-     - 00:00.158
+     - 00:00.154
      - 0.0
    * - :ref:`sphx_glr_how_to_tutorials_bring_your_own_codegen.py` 
(``../how_to/tutorials/bring_your_own_codegen.py``)
-     - 00:00.138
-     - 0.0
-   * - :ref:`sphx_glr_deep_dive_tensor_ir_tutorials_dlight_gpu_scheduling.py` 
(``../deep_dive/tensor_ir/tutorials/dlight_gpu_scheduling.py``)
-     - 00:00.116
+     - 00:00.139
      - 0.0
    * - :ref:`sphx_glr_deep_dive_relax_tutorials_relax_creation.py` 
(``../deep_dive/relax/tutorials/relax_creation.py``)
-     - 00:00.115
+     - 00:00.118
+     - 0.0
+   * - :ref:`sphx_glr_deep_dive_tensor_ir_tutorials_dlight_gpu_scheduling.py` 
(``../deep_dive/tensor_ir/tutorials/dlight_gpu_scheduling.py``)
+     - 00:00.117
      - 0.0
    * - :ref:`sphx_glr_deep_dive_relax_tutorials_relax_transformation.py` 
(``../deep_dive/relax/tutorials/relax_transformation.py``)
-     - 00:00.053
+     - 00:00.055
      - 0.0
    * - :ref:`sphx_glr_deep_dive_tensor_ir_tutorials_meta_schedule.py` 
(``../deep_dive/tensor_ir/tutorials/meta_schedule.py``)
      - 00:00.007
      - 0.0
    * - :ref:`sphx_glr_how_to_tutorials_mix_python_and_tvm_with_pymodule.py` 
(``../how_to/tutorials/mix_python_and_tvm_with_pymodule.py``)
-     - 00:00.004
+     - 00:00.005
      - 0.0
    * - :ref:`sphx_glr_how_to_tutorials_export_and_load_executable.py` 
(``../how_to/tutorials/export_and_load_executable.py``)
      - 00:00.002
diff --git a/docs/deep_dive/relax/tutorials/sg_execution_times.html 
b/docs/deep_dive/relax/tutorials/sg_execution_times.html
index 64a96694bfb..552a3538b7f 100644
--- a/docs/deep_dive/relax/tutorials/sg_execution_times.html
+++ b/docs/deep_dive/relax/tutorials/sg_execution_times.html
@@ -297,7 +297,7 @@
             
   <section id="computation-times">
 <span 
id="sphx-glr-deep-dive-relax-tutorials-sg-execution-times"></span><h1>Computation
 times<a class="headerlink" href="#computation-times" title="Link to this 
heading"></a></h1>
-<p><strong>00:00.168</strong> total execution time for 2 files <strong>from 
deep_dive/relax/tutorials</strong>:</p>
+<p><strong>00:00.173</strong> total execution time for 2 files <strong>from 
deep_dive/relax/tutorials</strong>:</p>
 <div class="docutils container">
 <style scoped>
 <link 
href="https://cdnjs.cloudflare.com/ajax/libs/twitter-bootstrap/5.3.0/css/bootstrap.min.css"
 rel="stylesheet" />
@@ -319,11 +319,11 @@ $(document).ready( function () {
 </thead>
 <tbody>
 <tr class="row-even"><td><p><a class="reference internal" 
href="relax_creation.html#sphx-glr-deep-dive-relax-tutorials-relax-creation-py"><span
 class="std std-ref">Relax Creation</span></a> (<code class="docutils literal 
notranslate"><span class="pre">relax_creation.py</span></code>)</p></td>
-<td><p>00:00.115</p></td>
+<td><p>00:00.118</p></td>
 <td><p>0.0</p></td>
 </tr>
 <tr class="row-odd"><td><p><a class="reference internal" 
href="relax_transformation.html#sphx-glr-deep-dive-relax-tutorials-relax-transformation-py"><span
 class="std std-ref">Transformation</span></a> (<code class="docutils literal 
notranslate"><span class="pre">relax_transformation.py</span></code>)</p></td>
-<td><p>00:00.053</p></td>
+<td><p>00:00.055</p></td>
 <td><p>0.0</p></td>
 </tr>
 </tbody>
diff --git a/docs/deep_dive/tensor_ir/tutorials/sg_execution_times.html 
b/docs/deep_dive/tensor_ir/tutorials/sg_execution_times.html
index 2b3d36a8d55..1d0b5728bb2 100644
--- a/docs/deep_dive/tensor_ir/tutorials/sg_execution_times.html
+++ b/docs/deep_dive/tensor_ir/tutorials/sg_execution_times.html
@@ -297,7 +297,7 @@
             
   <section id="computation-times">
 <span 
id="sphx-glr-deep-dive-tensor-ir-tutorials-sg-execution-times"></span><h1>Computation
 times<a class="headerlink" href="#computation-times" title="Link to this 
heading"></a></h1>
-<p><strong>00:00.574</strong> total execution time for 4 files <strong>from 
deep_dive/tensor_ir/tutorials</strong>:</p>
+<p><strong>00:00.583</strong> total execution time for 4 files <strong>from 
deep_dive/tensor_ir/tutorials</strong>:</p>
 <div class="docutils container">
 <style scoped>
 <link 
href="https://cdnjs.cloudflare.com/ajax/libs/twitter-bootstrap/5.3.0/css/bootstrap.min.css"
 rel="stylesheet" />
@@ -319,15 +319,15 @@ $(document).ready( function () {
 </thead>
 <tbody>
 <tr class="row-even"><td><p><a class="reference internal" 
href="tir_transformation.html#sphx-glr-deep-dive-tensor-ir-tutorials-tir-transformation-py"><span
 class="std std-ref">Transformation</span></a> (<code class="docutils literal 
notranslate"><span class="pre">tir_transformation.py</span></code>)</p></td>
-<td><p>00:00.269</p></td>
+<td><p>00:00.272</p></td>
 <td><p>0.0</p></td>
 </tr>
 <tr class="row-odd"><td><p><a class="reference internal" 
href="tir_creation.html#sphx-glr-deep-dive-tensor-ir-tutorials-tir-creation-py"><span
 class="std std-ref">TensorIR Creation</span></a> (<code class="docutils 
literal notranslate"><span class="pre">tir_creation.py</span></code>)</p></td>
-<td><p>00:00.182</p></td>
+<td><p>00:00.186</p></td>
 <td><p>0.0</p></td>
 </tr>
 <tr class="row-even"><td><p><a class="reference internal" 
href="dlight_gpu_scheduling.html#sphx-glr-deep-dive-tensor-ir-tutorials-dlight-gpu-scheduling-py"><span
 class="std std-ref">DLight: Rule-Based GPU Scheduling</span></a> (<code 
class="docutils literal notranslate"><span 
class="pre">dlight_gpu_scheduling.py</span></code>)</p></td>
-<td><p>00:00.116</p></td>
+<td><p>00:00.117</p></td>
 <td><p>0.0</p></td>
 </tr>
 <tr class="row-odd"><td><p><a class="reference internal" 
href="meta_schedule.html#sphx-glr-deep-dive-tensor-ir-tutorials-meta-schedule-py"><span
 class="std std-ref">MetaSchedule: Search-Based Auto-Tuning</span></a> (<code 
class="docutils literal notranslate"><span 
class="pre">meta_schedule.py</span></code>)</p></td>
diff --git a/docs/deep_dive/tensor_ir/tutorials/tir_creation.html 
b/docs/deep_dive/tensor_ir/tutorials/tir_creation.html
index 7a950e94bd5..68e6bdf405c 100644
--- a/docs/deep_dive/tensor_ir/tutorials/tir_creation.html
+++ b/docs/deep_dive/tensor_ir/tutorials/tir_creation.html
@@ -493,17 +493,17 @@ be used to ascertain the shape and data type of a 
TensorIR.</p>
 <span class="nb">print</span><span class="p">(</span><span 
class="n">evaluate_dynamic_shape</span><span class="p">(</span><span 
class="n">dyn_shape_lib</span><span class="p">,</span> <span 
class="n">m</span><span class="o">=</span><span class="mi">64</span><span 
class="p">,</span> <span class="n">n</span><span class="o">=</span><span 
class="mi">64</span><span class="p">,</span> <a 
href="../../../reference/api/python/tirx/tirx.html#tvm.tirx.IterVar" 
title="tvm.tirx.IterVar" class="sphx-gl [...]
 </pre></div>
 </div>
-<div class="sphx-glr-script-out highlight-none notranslate"><div 
class="highlight"><pre><span></span>[[1.8622421  0.67286646 1.282362   
1.0511451 ]
- [2.0764782  0.5306615  1.3665953  1.4980947 ]
- [1.3380854  0.2942735  0.87282985 0.9297526 ]
- [1.9473305  0.71752125 1.2324657  1.3434483 ]]
-[[30.725405 27.933485 33.42631  ... 28.76254  32.434967 25.542963]
- [28.348495 25.882704 31.631708 ... 30.102102 32.523552 23.54295 ]
- [31.881264 32.651657 37.273735 ... 31.474817 35.480404 28.237606]
+<div class="sphx-glr-script-out highlight-none notranslate"><div 
class="highlight"><pre><span></span>[[0.8605255  0.5471485  0.8039893  
1.0729661 ]
+ [1.1034956  1.2353773  0.8464414  1.6922977 ]
+ [0.4042995  1.1864915  0.8314758  1.1582114 ]
+ [1.5709348  0.53600913 1.079339   1.77458   ]]
+[[31.28486  32.598347 34.754593 ... 34.229595 34.81411  33.876446]
+ [28.932312 32.171654 32.346653 ... 29.917728 31.682928 33.08734 ]
+ [28.484785 30.425142 32.392727 ... 31.919804 32.438267 30.258091]
  ...
- [27.833437 28.467497 31.901064 ... 28.129526 31.220873 25.711681]
- [32.942955 32.215992 36.025925 ... 32.050587 36.881718 29.219858]
- [28.208681 25.811222 31.636063 ... 26.905252 29.956907 23.83386 ]]
+ [30.73488  31.195097 33.93179  ... 31.291265 32.389244 34.30623 ]
+ [26.5504   29.39458  29.45073  ... 28.97972  31.037825 27.498674]
+ [28.01241  28.874033 29.861454 ... 30.790987 30.821604 29.927006]]
 </pre></div>
 </div>
 </section>
diff --git a/docs/deep_dive/tensor_ir/tutorials/tir_transformation.html 
b/docs/deep_dive/tensor_ir/tutorials/tir_transformation.html
index b39e98f9a47..0541547ed28 100644
--- a/docs/deep_dive/tensor_ir/tutorials/tir_transformation.html
+++ b/docs/deep_dive/tensor_ir/tutorials/tir_transformation.html
@@ -374,7 +374,7 @@ original implementation.</p>
 </div>
 <div class="sphx-glr-script-out highlight-none notranslate"><div 
class="highlight"><pre><span></span>Execution time summary:
  mean (ms)   median (ms)    max (ms)     min (ms)     std (ms)
-   2.5176       2.5176       2.5176       2.5176       0.0000
+   2.5877       2.5877       2.5877       2.5877       0.0000
 </pre></div>
 </div>
 <section id="initialization-schedule">
@@ -470,7 +470,7 @@ class Module:
 
 Execution time summary:
  mean (ms)   median (ms)    max (ms)     min (ms)     std (ms)
-   0.8615       0.8615       0.8615       0.8615       0.0000
+   0.8656       0.8656       0.8656       0.8656       0.0000
 </pre></div>
 </div>
 </section>
@@ -564,7 +564,7 @@ class Module:
 
 Execution time summary:
  mean (ms)   median (ms)    max (ms)     min (ms)     std (ms)
-   0.3430       0.3430       0.3430       0.3430       0.0000
+   0.3503       0.3503       0.3503       0.3503       0.0000
 </pre></div>
 </div>
 </section>
diff --git a/docs/get_started/tutorials/ir_module.html 
b/docs/get_started/tutorials/ir_module.html
index 0379016847f..5aee1548e75 100644
--- a/docs/get_started/tutorials/ir_module.html
+++ b/docs/get_started/tutorials/ir_module.html
@@ -809,16 +809,16 @@ backends.</p>
 <p>We can deploy the IRModule on CPU by specifying the target as <code 
class="docutils literal notranslate"><span class="pre">llvm</span></code>.</p>
 <div class="highlight-Python notranslate"><div 
class="highlight"><pre><span></span><a 
href="../../reference/api/python/relax/relax.html#tvm.relax.VMExecutable" 
title="tvm.relax.VMExecutable" class="sphx-glr-backref-module-tvm-relax 
sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span 
class="n">exec</span></a> <span class="o">=</span> <a 
href="../../reference/api/python/driver.html#tvm.compile" title="tvm.compile" 
class="sphx-glr-backref-module-tvm sphx-glr-backref-type-py-func [...]
 <span class="n">dev</span> <span class="o">=</span> <span 
class="n">tvm</span><span class="o">.</span><span class="n">cpu</span><span 
class="p">()</span>
-<span class="n">vm</span> <span class="o">=</span> <a 
href="../../reference/api/python/relax/relax.html#tvm.relax.VirtualMachine" 
title="tvm.relax.VirtualMachine" class="sphx-glr-backref-module-tvm-relax 
sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span 
class="n">relax</span><span class="o">.</span><span 
class="n">VirtualMachine</span></a><span class="p">(</span><a 
href="../../reference/api/python/relax/relax.html#tvm.relax.VMExecutable" 
title="tvm.relax.VMExecutable" class [...]
+<a 
href="../../reference/api/python/runtime/vm.html#tvm.runtime.vm.VirtualMachine" 
title="tvm.runtime.vm.VirtualMachine" 
class="sphx-glr-backref-module-tvm-runtime-vm sphx-glr-backref-type-py-class 
sphx-glr-backref-instance"><span class="n">vm</span></a> <span 
class="o">=</span> <a 
href="../../reference/api/python/runtime/vm.html#tvm.runtime.vm.VirtualMachine" 
title="tvm.runtime.vm.VirtualMachine" 
class="sphx-glr-backref-module-tvm-runtime-vm 
sphx-glr-backref-type-py-class"><span class=" [...]
 
 <span class="n">raw_data</span> <span class="o">=</span> <span 
class="n">np</span><span class="o">.</span><span class="n">random</span><span 
class="o">.</span><span class="n">rand</span><span class="p">(</span><span 
class="mi">1</span><span class="p">,</span> <span class="mi">784</span><span 
class="p">)</span><span class="o">.</span><span class="n">astype</span><span 
class="p">(</span><span class="s2">&quot;float32&quot;</span><span 
class="p">)</span>
 <span class="n">data</span> <span class="o">=</span> <span 
class="n">tvm</span><span class="o">.</span><span class="n">runtime</span><span 
class="o">.</span><span class="n">tensor</span><span class="p">(</span><span 
class="n">raw_data</span><span class="p">,</span> <span 
class="n">dev</span><span class="p">)</span>
-<span class="n">cpu_out</span> <span class="o">=</span> <span 
class="n">vm</span><span class="p">[</span><span 
class="s2">&quot;main&quot;</span><span class="p">](</span><span 
class="n">data</span><span class="p">,</span> <span class="o">*</span><a 
href="https://docs.python.org/3/library/stdtypes.html#dict" 
title="builtins.dict" class="sphx-glr-backref-module-builtins 
sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span 
class="n">params_from_torch</span></a><span class="p">[</ [...]
+<span class="n">cpu_out</span> <span class="o">=</span> <a 
href="../../reference/api/python/runtime/vm.html#tvm.runtime.vm.VirtualMachine" 
title="tvm.runtime.vm.VirtualMachine" 
class="sphx-glr-backref-module-tvm-runtime-vm sphx-glr-backref-type-py-class 
sphx-glr-backref-instance"><span class="n">vm</span></a><span 
class="p">[</span><span class="s2">&quot;main&quot;</span><span 
class="p">](</span><span class="n">data</span><span class="p">,</span> <span 
class="o">*</span><a href="https:// [...]
 <span class="nb">print</span><span class="p">(</span><span 
class="n">cpu_out</span><span class="p">)</span>
 </pre></div>
 </div>
-<div class="sphx-glr-script-out highlight-none notranslate"><div 
class="highlight"><pre><span></span>[[ 0.08104634  0.00242352  0.1049604  
-0.08137349 -0.1480625  -0.08869022
-  -0.09602349  0.00385824 -0.15466967 -0.10502834]]
+<div class="sphx-glr-script-out highlight-none notranslate"><div 
class="highlight"><pre><span></span>[[ 0.07247547  0.03547548 -0.10874876 
-0.06546346  0.12311915 -0.06342966
+   0.09690153  0.11260182 -0.07364957  0.16807812]]
 </pre></div>
 </div>
 </section>
@@ -841,19 +841,19 @@ the details of <code class="docutils literal 
notranslate"><span class="pre">DLig
 <p>Now we can compile the IRModule on GPU, the similar way as we did on 
CPU.</p>
 <div class="highlight-Python notranslate"><div 
class="highlight"><pre><span></span><a 
href="../../reference/api/python/relax/relax.html#tvm.relax.VMExecutable" 
title="tvm.relax.VMExecutable" class="sphx-glr-backref-module-tvm-relax 
sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span 
class="n">exec</span></a> <span class="o">=</span> <a 
href="../../reference/api/python/driver.html#tvm.compile" title="tvm.compile" 
class="sphx-glr-backref-module-tvm sphx-glr-backref-type-py-func [...]
 <span class="n">dev</span> <span class="o">=</span> <span 
class="n">tvm</span><span class="o">.</span><span class="n">device</span><span 
class="p">(</span><span class="s2">&quot;cuda&quot;</span><span 
class="p">,</span> <span class="mi">0</span><span class="p">)</span>
-<span class="n">vm</span> <span class="o">=</span> <a 
href="../../reference/api/python/relax/relax.html#tvm.relax.VirtualMachine" 
title="tvm.relax.VirtualMachine" class="sphx-glr-backref-module-tvm-relax 
sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span 
class="n">relax</span><span class="o">.</span><span 
class="n">VirtualMachine</span></a><span class="p">(</span><a 
href="../../reference/api/python/relax/relax.html#tvm.relax.VMExecutable" 
title="tvm.relax.VMExecutable" class [...]
+<a 
href="../../reference/api/python/runtime/vm.html#tvm.runtime.vm.VirtualMachine" 
title="tvm.runtime.vm.VirtualMachine" 
class="sphx-glr-backref-module-tvm-runtime-vm sphx-glr-backref-type-py-class 
sphx-glr-backref-instance"><span class="n">vm</span></a> <span 
class="o">=</span> <a 
href="../../reference/api/python/runtime/vm.html#tvm.runtime.vm.VirtualMachine" 
title="tvm.runtime.vm.VirtualMachine" 
class="sphx-glr-backref-module-tvm-runtime-vm 
sphx-glr-backref-type-py-class"><span class=" [...]
 <span class="c1"># Need to allocate data and params on GPU device</span>
 <span class="n">data</span> <span class="o">=</span> <span 
class="n">tvm</span><span class="o">.</span><span class="n">runtime</span><span 
class="o">.</span><span class="n">tensor</span><span class="p">(</span><span 
class="n">raw_data</span><span class="p">,</span> <span 
class="n">dev</span><span class="p">)</span>
 <a href="https://docs.python.org/3/library/stdtypes.html#list" 
title="builtins.list" class="sphx-glr-backref-module-builtins 
sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span 
class="n">gpu_params</span></a> <span class="o">=</span> <span 
class="p">[</span><span class="n">tvm</span><span class="o">.</span><span 
class="n">runtime</span><span class="o">.</span><span 
class="n">tensor</span><span class="p">(</span><span class="n">p</span><span 
class="p">,</span> <span class="n"> [...]
-<span class="n">gpu_out</span> <span class="o">=</span> <span 
class="n">vm</span><span class="p">[</span><span 
class="s2">&quot;main&quot;</span><span class="p">](</span><span 
class="n">data</span><span class="p">,</span> <span class="o">*</span><a 
href="https://docs.python.org/3/library/stdtypes.html#list" 
title="builtins.list" class="sphx-glr-backref-module-builtins 
sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span 
class="n">gpu_params</span></a><span class="p">)</span><s [...]
+<span class="n">gpu_out</span> <span class="o">=</span> <a 
href="../../reference/api/python/runtime/vm.html#tvm.runtime.vm.VirtualMachine" 
title="tvm.runtime.vm.VirtualMachine" 
class="sphx-glr-backref-module-tvm-runtime-vm sphx-glr-backref-type-py-class 
sphx-glr-backref-instance"><span class="n">vm</span></a><span 
class="p">[</span><span class="s2">&quot;main&quot;</span><span 
class="p">](</span><span class="n">data</span><span class="p">,</span> <span 
class="o">*</span><a href="https:// [...]
 <span class="nb">print</span><span class="p">(</span><span 
class="n">gpu_out</span><span class="p">)</span>
 
 <span class="c1"># Check the correctness of the results</span>
 <span class="k">assert</span> <span class="n">np</span><span 
class="o">.</span><span class="n">allclose</span><span class="p">(</span><span 
class="n">cpu_out</span><span class="p">,</span> <span 
class="n">gpu_out</span><span class="p">,</span> <span 
class="n">atol</span><span class="o">=</span><span class="mf">1e-3</span><span 
class="p">)</span>
 </pre></div>
 </div>
-<div class="sphx-glr-script-out highlight-none notranslate"><div 
class="highlight"><pre><span></span>[[ 0.08104636  0.00242357  0.10496042 
-0.0813735  -0.14806242 -0.08869025
-  -0.09602351  0.00385825 -0.15466967 -0.1050284 ]]
+<div class="sphx-glr-script-out highlight-none notranslate"><div 
class="highlight"><pre><span></span>[[ 0.07247549  0.03547547 -0.10874879 
-0.06546341  0.12311919 -0.06342966
+   0.09690155  0.11260177 -0.07364966  0.16807815]]
 </pre></div>
 </div>
 </section>
diff --git a/docs/get_started/tutorials/quick_start.html 
b/docs/get_started/tutorials/quick_start.html
index cde94d81e4d..752a2001be8 100644
--- a/docs/get_started/tutorials/quick_start.html
+++ b/docs/get_started/tutorials/quick_start.html
@@ -452,16 +452,16 @@ different devices.</p>
 <a href="../../reference/api/python/target.html#tvm.target.Target" 
title="tvm.target.Target" class="sphx-glr-backref-module-tvm-target 
sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span 
class="n">target</span></a> <span class="o">=</span> <a 
href="../../reference/api/python/target.html#tvm.target.Target" 
title="tvm.target.Target" class="sphx-glr-backref-module-tvm-target 
sphx-glr-backref-type-py-class"><span class="n">tvm</span><span 
class="o">.</span><span class="n">target< [...]
 <a href="../../reference/api/python/relax/relax.html#tvm.relax.VMExecutable" 
title="tvm.relax.VMExecutable" class="sphx-glr-backref-module-tvm-relax 
sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span 
class="n">ex</span></a> <span class="o">=</span> <a 
href="../../reference/api/python/driver.html#tvm.compile" title="tvm.compile" 
class="sphx-glr-backref-module-tvm sphx-glr-backref-type-py-function"><span 
class="n">tvm</span><span class="o">.</span><span class="n">compile</span [...]
 <span class="n">device</span> <span class="o">=</span> <span 
class="n">tvm</span><span class="o">.</span><span class="n">cpu</span><span 
class="p">()</span>
-<span class="n">vm</span> <span class="o">=</span> <a 
href="../../reference/api/python/relax/relax.html#tvm.relax.VirtualMachine" 
title="tvm.relax.VirtualMachine" class="sphx-glr-backref-module-tvm-relax 
sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span 
class="n">relax</span><span class="o">.</span><span 
class="n">VirtualMachine</span></a><span class="p">(</span><a 
href="../../reference/api/python/relax/relax.html#tvm.relax.VMExecutable" 
title="tvm.relax.VMExecutable" class [...]
+<a 
href="../../reference/api/python/runtime/vm.html#tvm.runtime.vm.VirtualMachine" 
title="tvm.runtime.vm.VirtualMachine" 
class="sphx-glr-backref-module-tvm-runtime-vm sphx-glr-backref-type-py-class 
sphx-glr-backref-instance"><span class="n">vm</span></a> <span 
class="o">=</span> <a 
href="../../reference/api/python/runtime/vm.html#tvm.runtime.vm.VirtualMachine" 
title="tvm.runtime.vm.VirtualMachine" 
class="sphx-glr-backref-module-tvm-runtime-vm 
sphx-glr-backref-type-py-class"><span class=" [...]
 <span class="n">data</span> <span class="o">=</span> <span 
class="n">np</span><span class="o">.</span><span class="n">random</span><span 
class="o">.</span><span class="n">rand</span><span class="p">(</span><span 
class="mi">1</span><span class="p">,</span> <span class="mi">784</span><span 
class="p">)</span><span class="o">.</span><span class="n">astype</span><span 
class="p">(</span><span class="s2">&quot;float32&quot;</span><span 
class="p">)</span>
 <span class="n">tvm_data</span> <span class="o">=</span> <span 
class="n">tvm</span><span class="o">.</span><span class="n">runtime</span><span 
class="o">.</span><span class="n">tensor</span><span class="p">(</span><span 
class="n">data</span><span class="p">,</span> <span 
class="n">device</span><span class="o">=</span><span 
class="n">device</span><span class="p">)</span>
 <a href="https://docs.python.org/3/library/stdtypes.html#list" 
title="builtins.list" class="sphx-glr-backref-module-builtins 
sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span 
class="n">params</span></a> <span class="o">=</span> <span 
class="p">[</span><span class="n">np</span><span class="o">.</span><span 
class="n">random</span><span class="o">.</span><span class="n">rand</span><span 
class="p">(</span><span class="o">*</span><span class="n">param</span><span 
class="o">.</sp [...]
 <a href="https://docs.python.org/3/library/stdtypes.html#list" 
title="builtins.list" class="sphx-glr-backref-module-builtins 
sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span 
class="n">params</span></a> <span class="o">=</span> <span 
class="p">[</span><span class="n">tvm</span><span class="o">.</span><span 
class="n">runtime</span><span class="o">.</span><span 
class="n">tensor</span><span class="p">(</span><span 
class="n">param</span><span class="p">,</span> <span class="n"> [...]
-<span class="nb">print</span><span class="p">(</span><span 
class="n">vm</span><span class="p">[</span><span 
class="s2">&quot;forward&quot;</span><span class="p">](</span><span 
class="n">tvm_data</span><span class="p">,</span> <span class="o">*</span><a 
href="https://docs.python.org/3/library/stdtypes.html#list" 
title="builtins.list" class="sphx-glr-backref-module-builtins 
sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span 
class="n">params</span></a><span class="p">)</span><s [...]
+<span class="nb">print</span><span class="p">(</span><a 
href="../../reference/api/python/runtime/vm.html#tvm.runtime.vm.VirtualMachine" 
title="tvm.runtime.vm.VirtualMachine" 
class="sphx-glr-backref-module-tvm-runtime-vm sphx-glr-backref-type-py-class 
sphx-glr-backref-instance"><span class="n">vm</span></a><span 
class="p">[</span><span class="s2">&quot;forward&quot;</span><span 
class="p">](</span><span class="n">tvm_data</span><span class="p">,</span> 
<span class="o">*</span><a href="http [...]
 </pre></div>
 </div>
-<div class="sphx-glr-script-out highlight-none notranslate"><div 
class="highlight"><pre><span></span>[[23730.143 27612.406 26144.84  25774.523 
26365.133 24124.285 26677.041
-  25740.312 24821.139 24888.467]]
+<div class="sphx-glr-script-out highlight-none notranslate"><div 
class="highlight"><pre><span></span>[[25344.477 25751.967 25929.48  26905.932 
26182.744 24975.156 25076.4
+  26959.822 25762.361 25306.086]]
 </pre></div>
 </div>
 <p>Our goal is to bring machine learning to the application with any language 
of interest,
@@ -469,8 +469,8 @@ with the minimum runtime support.</p>
 <ul>
 <li><p>Each function in IRModule becomes a runnable function in the runtime. 
For example in LLM
 cases, we can call <code class="docutils literal notranslate"><span 
class="pre">prefill</span></code> and <code class="docutils literal 
notranslate"><span class="pre">decode</span></code> functions directly.</p>
-<div class="highlight-Python notranslate"><div 
class="highlight"><pre><span></span><span class="n">prefill_logits</span> <span 
class="o">=</span> <span class="n">vm</span><span class="p">[</span><span 
class="s2">&quot;prefill&quot;</span><span class="p">](</span><span 
class="n">inputs</span><span class="p">,</span> <span 
class="n">weight</span><span class="p">,</span> <span 
class="n">kv_cache</span><span class="p">)</span>
-<span class="n">decoded_logits</span> <span class="o">=</span> <span 
class="n">vm</span><span class="p">[</span><span 
class="s2">&quot;decode&quot;</span><span class="p">](</span><span 
class="n">inputs</span><span class="p">,</span> <span 
class="n">weight</span><span class="p">,</span> <span 
class="n">kv_cache</span><span class="p">)</span>
+<div class="highlight-Python notranslate"><div 
class="highlight"><pre><span></span><span class="n">prefill_logits</span> <span 
class="o">=</span> <a 
href="../../reference/api/python/runtime/vm.html#tvm.runtime.vm.VirtualMachine" 
title="tvm.runtime.vm.VirtualMachine" 
class="sphx-glr-backref-module-tvm-runtime-vm sphx-glr-backref-type-py-class 
sphx-glr-backref-instance"><span class="n">vm</span></a><span 
class="p">[</span><span class="s2">&quot;prefill&quot;</span><span 
class="p">](</span> [...]
+<span class="n">decoded_logits</span> <span class="o">=</span> <a 
href="../../reference/api/python/runtime/vm.html#tvm.runtime.vm.VirtualMachine" 
title="tvm.runtime.vm.VirtualMachine" 
class="sphx-glr-backref-module-tvm-runtime-vm sphx-glr-backref-type-py-class 
sphx-glr-backref-instance"><span class="n">vm</span></a><span 
class="p">[</span><span class="s2">&quot;decode&quot;</span><span 
class="p">](</span><span class="n">inputs</span><span class="p">,</span> <span 
class="n">weight</span>< [...]
 </pre></div>
 </div>
 </li>
@@ -485,15 +485,15 @@ copy exchange with existing ecosystem (DLPack exchange 
with PyTorch)</p>
 </li>
 <li><p>TVM runtime works in non-python environments, so it works on settings 
such as mobile</p>
 <div class="highlight-C++ notranslate"><div 
class="highlight"><pre><span></span><span class="c1">// C++ snippet</span>
-<span class="n">runtime</span><span class="o">::</span><span 
class="n">Module</span><span class="w"> </span><span class="n">vm</span><span 
class="w"> </span><span class="o">=</span><span class="w"> </span><a 
href="../../reference/api/python/relax/relax.html#tvm.relax.VMExecutable" 
title="tvm.relax.VMExecutable" class="sphx-glr-backref-module-tvm-relax 
sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span 
class="n">ex</span></a><span class="p">.</span><span class="n">GetFunction [...]
-<span class="n">vm</span><span class="p">.</span><span 
class="n">GetFunction</span><span class="p">(</span><span 
class="s">&quot;init&quot;</span><span class="p">)(...);</span>
-<span class="n">Tensor</span><span class="w"> </span><span 
class="n">out</span><span class="w"> </span><span class="o">=</span><span 
class="w"> </span><span class="n">vm</span><span class="p">.</span><span 
class="n">GetFunction</span><span class="p">(</span><span 
class="s">&quot;prefill&quot;</span><span class="p">)(</span><span 
class="n">data</span><span class="p">,</span><span class="w"> </span><span 
class="n">weight</span><span class="p">,</span><span class="w"> </span><span 
class="n" [...]
+<span class="n">runtime</span><span class="o">::</span><span 
class="n">Module</span><span class="w"> </span><a 
href="../../reference/api/python/runtime/vm.html#tvm.runtime.vm.VirtualMachine" 
title="tvm.runtime.vm.VirtualMachine" 
class="sphx-glr-backref-module-tvm-runtime-vm sphx-glr-backref-type-py-class 
sphx-glr-backref-instance"><span class="n">vm</span></a><span class="w"> 
</span><span class="o">=</span><span class="w"> </span><a 
href="../../reference/api/python/relax/relax.html#tvm.r [...]
+<a 
href="../../reference/api/python/runtime/vm.html#tvm.runtime.vm.VirtualMachine" 
title="tvm.runtime.vm.VirtualMachine" 
class="sphx-glr-backref-module-tvm-runtime-vm sphx-glr-backref-type-py-class 
sphx-glr-backref-instance"><span class="n">vm</span></a><span 
class="p">.</span><span class="n">GetFunction</span><span 
class="p">(</span><span class="s">&quot;init&quot;</span><span 
class="p">)(...);</span>
+<span class="n">Tensor</span><span class="w"> </span><span 
class="n">out</span><span class="w"> </span><span class="o">=</span><span 
class="w"> </span><a 
href="../../reference/api/python/runtime/vm.html#tvm.runtime.vm.VirtualMachine" 
title="tvm.runtime.vm.VirtualMachine" 
class="sphx-glr-backref-module-tvm-runtime-vm sphx-glr-backref-type-py-class 
sphx-glr-backref-instance"><span class="n">vm</span></a><span 
class="p">.</span><span class="n">GetFunction</span><span 
class="p">(</span><span [...]
 </pre></div>
 </div>
 <div class="highlight-Java notranslate"><div 
class="highlight"><pre><span></span><span class="c1">// Java snippet</span>
-<span class="n">Module</span><span class="w"> </span><span 
class="n">vm</span><span class="w"> </span><span class="o">=</span><span 
class="w"> </span><a 
href="../../reference/api/python/relax/relax.html#tvm.relax.VMExecutable" 
title="tvm.relax.VMExecutable" class="sphx-glr-backref-module-tvm-relax 
sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span 
class="n">ex</span></a><span class="p">.</span><span 
class="na">getFunction</span><span class="p">(</span><span class="s">&quot;l 
[...]
-<span class="n">vm</span><span class="p">.</span><span 
class="na">getFunction</span><span class="p">(</span><span 
class="s">&quot;init&quot;</span><span class="p">).</span><span 
class="na">pushArg</span><span class="p">(...).</span><span 
class="na">invoke</span><span class="p">;</span>
-<span class="n">Tensor</span><span class="w"> </span><span 
class="n">out</span><span class="w"> </span><span class="o">=</span><span 
class="w"> </span><span class="n">vm</span><span class="p">.</span><span 
class="na">getFunction</span><span class="p">(</span><span 
class="s">&quot;prefill&quot;</span><span class="p">).</span><span 
class="na">pushArg</span><span class="p">(</span><span 
class="n">data</span><span class="p">).</span><span 
class="na">pushArg</span><span class="p">(</span><spa [...]
+<span class="n">Module</span><span class="w"> </span><a 
href="../../reference/api/python/runtime/vm.html#tvm.runtime.vm.VirtualMachine" 
title="tvm.runtime.vm.VirtualMachine" 
class="sphx-glr-backref-module-tvm-runtime-vm sphx-glr-backref-type-py-class 
sphx-glr-backref-instance"><span class="n">vm</span></a><span class="w"> 
</span><span class="o">=</span><span class="w"> </span><a 
href="../../reference/api/python/relax/relax.html#tvm.relax.VMExecutable" 
title="tvm.relax.VMExecutable" class [...]
+<a 
href="../../reference/api/python/runtime/vm.html#tvm.runtime.vm.VirtualMachine" 
title="tvm.runtime.vm.VirtualMachine" 
class="sphx-glr-backref-module-tvm-runtime-vm sphx-glr-backref-type-py-class 
sphx-glr-backref-instance"><span class="n">vm</span></a><span 
class="p">.</span><span class="na">getFunction</span><span 
class="p">(</span><span class="s">&quot;init&quot;</span><span 
class="p">).</span><span class="na">pushArg</span><span 
class="p">(...).</span><span class="na">invoke</span>< [...]
+<span class="n">Tensor</span><span class="w"> </span><span 
class="n">out</span><span class="w"> </span><span class="o">=</span><span 
class="w"> </span><a 
href="../../reference/api/python/runtime/vm.html#tvm.runtime.vm.VirtualMachine" 
title="tvm.runtime.vm.VirtualMachine" 
class="sphx-glr-backref-module-tvm-runtime-vm sphx-glr-backref-type-py-class 
sphx-glr-backref-instance"><span class="n">vm</span></a><span 
class="p">.</span><span class="na">getFunction</span><span 
class="p">(</span><spa [...]
 </pre></div>
 </div>
 </li>
diff --git a/docs/get_started/tutorials/sg_execution_times.html 
b/docs/get_started/tutorials/sg_execution_times.html
index f8d3189e154..1b24d248710 100644
--- a/docs/get_started/tutorials/sg_execution_times.html
+++ b/docs/get_started/tutorials/sg_execution_times.html
@@ -297,7 +297,7 @@
             
   <section id="computation-times">
 <span 
id="sphx-glr-get-started-tutorials-sg-execution-times"></span><h1>Computation 
times<a class="headerlink" href="#computation-times" title="Link to this 
heading"></a></h1>
-<p><strong>00:06.920</strong> total execution time for 2 files <strong>from 
get_started/tutorials</strong>:</p>
+<p><strong>00:07.593</strong> total execution time for 2 files <strong>from 
get_started/tutorials</strong>:</p>
 <div class="docutils container">
 <style scoped>
 <link 
href="https://cdnjs.cloudflare.com/ajax/libs/twitter-bootstrap/5.3.0/css/bootstrap.min.css"
 rel="stylesheet" />
@@ -319,11 +319,11 @@ $(document).ready( function () {
 </thead>
 <tbody>
 <tr class="row-even"><td><p><a class="reference internal" 
href="ir_module.html#sphx-glr-get-started-tutorials-ir-module-py"><span 
class="std std-ref">IRModule</span></a> (<code class="docutils literal 
notranslate"><span class="pre">ir_module.py</span></code>)</p></td>
-<td><p>00:06.762</p></td>
+<td><p>00:07.439</p></td>
 <td><p>0.0</p></td>
 </tr>
 <tr class="row-odd"><td><p><a class="reference internal" 
href="quick_start.html#sphx-glr-get-started-tutorials-quick-start-py"><span 
class="std std-ref">Quick Start</span></a> (<code class="docutils literal 
notranslate"><span class="pre">quick_start.py</span></code>)</p></td>
-<td><p>00:00.158</p></td>
+<td><p>00:00.154</p></td>
 <td><p>0.0</p></td>
 </tr>
 </tbody>
diff --git a/docs/how_to/tutorials/bring_your_own_codegen.html 
b/docs/how_to/tutorials/bring_your_own_codegen.html
index 579a2969784..eb1c8354621 100644
--- a/docs/how_to/tutorials/bring_your_own_codegen.html
+++ b/docs/how_to/tutorials/bring_your_own_codegen.html
@@ -469,7 +469,7 @@ and <code class="docutils literal notranslate"><span 
class="pre">USE_EXAMPLE_NPU
     <span class="k">with</span> <a 
href="../../reference/api/python/transform.html#tvm.ir.transform.PassContext" 
title="tvm.ir.transform.PassContext" 
class="sphx-glr-backref-module-tvm-ir-transform 
sphx-glr-backref-type-py-class"><span class="n">tvm</span><span 
class="o">.</span><span class="n">transform</span><span class="o">.</span><span 
class="n">PassContext</span></a><span class="p">(</span><span 
class="n">opt_level</span><span class="o">=</span><span 
class="mi">3</span><span class=" [...]
         <span class="n">built</span> <span class="o">=</span> <a 
href="../../reference/api/python/relax/relax.html#tvm.relax.build" 
title="tvm.relax.build" class="sphx-glr-backref-module-tvm-relax 
sphx-glr-backref-type-py-function"><span class="n">relax</span><span 
class="o">.</span><span class="n">build</span></a><span class="p">(</span><span 
class="n">mod</span><span class="p">,</span> <a 
href="../../reference/api/python/target.html#tvm.target.Target" 
title="tvm.target.Target" class="s [...]
 
-    <span class="n">vm</span> <span class="o">=</span> <a 
href="../../reference/api/python/relax/relax.html#tvm.relax.VirtualMachine" 
title="tvm.relax.VirtualMachine" class="sphx-glr-backref-module-tvm-relax 
sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span 
class="n">relax</span><span class="o">.</span><span 
class="n">VirtualMachine</span></a><span class="p">(</span><span 
class="n">built</span><span class="p">,</span> <span class="n">tvm</span><span 
class="o">.</span><span  [...]
+    <span class="n">vm</span> <span class="o">=</span> <a 
href="../../reference/api/python/runtime/vm.html#tvm.runtime.vm.VirtualMachine" 
title="tvm.runtime.vm.VirtualMachine" 
class="sphx-glr-backref-module-tvm-runtime-vm 
sphx-glr-backref-type-py-class"><span class="n">relax</span><span 
class="o">.</span><span class="n">VirtualMachine</span></a><span 
class="p">(</span><span class="n">built</span><span class="p">,</span> <span 
class="n">tvm</span><span class="o">.</span><span class="n">cp [...]
     <span class="n">result</span> <span class="o">=</span> <span 
class="n">vm</span><span class="p">[</span><span 
class="s2">&quot;main&quot;</span><span class="p">](</span><span 
class="n">tvm</span><span class="o">.</span><span class="n">runtime</span><span 
class="o">.</span><span class="n">tensor</span><span class="p">(</span><span 
class="n">x_np</span><span class="p">,</span> <span class="n">tvm</span><span 
class="o">.</span><span class="n">cpu</span><span class="p">()),</span> <span  
[...]
 
     <span class="k">assert</span> <span class="n">result</span><span 
class="o">.</span><span class="n">numpy</span><span class="p">()</span><span 
class="o">.</span><span class="n">shape</span> <span class="o">==</span> <span 
class="p">(</span><span class="mi">2</span><span class="p">,</span> <span 
class="mi">8</span><span class="p">)</span>
@@ -509,7 +509,7 @@ priority), both ops are offloaded as a single composite 
function.</p>
     <span class="n">x2_np</span> <span class="o">=</span> <span 
class="n">np</span><span class="o">.</span><span class="n">random</span><span 
class="o">.</span><span class="n">randn</span><span class="p">(</span><span 
class="mi">1</span><span class="p">,</span> <span class="mi">3</span><span 
class="p">,</span> <span class="mi">32</span><span class="p">,</span> <span 
class="mi">32</span><span class="p">)</span><span class="o">.</span><span 
class="n">astype</span><span class="p">(</span><s [...]
     <span class="n">w2_np</span> <span class="o">=</span> <span 
class="n">np</span><span class="o">.</span><span class="n">random</span><span 
class="o">.</span><span class="n">randn</span><span class="p">(</span><span 
class="mi">16</span><span class="p">,</span> <span class="mi">3</span><span 
class="p">,</span> <span class="mi">3</span><span class="p">,</span> <span 
class="mi">3</span><span class="p">)</span><span class="o">.</span><span 
class="n">astype</span><span class="p">(</span><sp [...]
 
-    <span class="n">vm2</span> <span class="o">=</span> <a 
href="../../reference/api/python/relax/relax.html#tvm.relax.VirtualMachine" 
title="tvm.relax.VirtualMachine" class="sphx-glr-backref-module-tvm-relax 
sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span 
class="n">relax</span><span class="o">.</span><span 
class="n">VirtualMachine</span></a><span class="p">(</span><span 
class="n">built2</span><span class="p">,</span> <span class="n">tvm</span><span 
class="o">.</span><spa [...]
+    <span class="n">vm2</span> <span class="o">=</span> <a 
href="../../reference/api/python/runtime/vm.html#tvm.runtime.vm.VirtualMachine" 
title="tvm.runtime.vm.VirtualMachine" 
class="sphx-glr-backref-module-tvm-runtime-vm 
sphx-glr-backref-type-py-class"><span class="n">relax</span><span 
class="o">.</span><span class="n">VirtualMachine</span></a><span 
class="p">(</span><span class="n">built2</span><span class="p">,</span> <span 
class="n">tvm</span><span class="o">.</span><span class="n"> [...]
     <span class="n">result2</span> <span class="o">=</span> <span 
class="n">vm2</span><span class="p">[</span><span 
class="s2">&quot;main&quot;</span><span class="p">](</span>
         <span class="n">tvm</span><span class="o">.</span><span 
class="n">runtime</span><span class="o">.</span><span 
class="n">tensor</span><span class="p">(</span><span 
class="n">x2_np</span><span class="p">,</span> <span class="n">tvm</span><span 
class="o">.</span><span class="n">cpu</span><span class="p">()),</span> <span 
class="n">tvm</span><span class="o">.</span><span class="n">runtime</span><span 
class="o">.</span><span class="n">tensor</span><span class="p">(</span><span 
class=" [...]
     <span class="p">)</span>
diff --git a/docs/how_to/tutorials/cross_compilation_and_rpc.html 
b/docs/how_to/tutorials/cross_compilation_and_rpc.html
index 0672d256bb9..869bd07ad03 100644
--- a/docs/how_to/tutorials/cross_compilation_and_rpc.html
+++ b/docs/how_to/tutorials/cross_compilation_and_rpc.html
@@ -478,7 +478,7 @@ device and returns the measured cost. Network overhead is 
excluded.</p>
 <span class="nb">print</span><span class="p">(</span><span 
class="sa">f</span><span class="s2">&quot;</span><span class="si">{</span><span 
class="n">cost</span><span class="si">:</span><span class="s2">g</span><span 
class="si">}</span><span class="s2"> secs/op&quot;</span><span 
class="p">)</span>
 </pre></div>
 </div>
-<div class="sphx-glr-script-out highlight-none notranslate"><div 
class="highlight"><pre><span></span>9.1e-08 secs/op
+<div class="sphx-glr-script-out highlight-none notranslate"><div 
class="highlight"><pre><span></span>1.08e-07 secs/op
 </pre></div>
 </div>
 </section>
@@ -846,8 +846,8 @@ for ONNX models. Simply replace <code class="docutils 
literal notranslate"><span
 Converted PyTorch model to Relax:
   - Number of parameters: 4
 Using local target for demonstration
-Exported library to: /tmp/tmpms6e2cvm/model_deployed.so
-Saved parameters to: /tmp/tmpms6e2cvm/model_params.npz
+Exported library to: /tmp/tmpgixoiye1/model_deployed.so
+Saved parameters to: /tmp/tmpgixoiye1/model_params.npz
 
 RPC workflow (works for any remote device):
 ==================================================
diff --git a/docs/how_to/tutorials/customize_opt.html 
b/docs/how_to/tutorials/customize_opt.html
index a4f0dd4c3c0..99b2e37ecc1 100644
--- a/docs/how_to/tutorials/customize_opt.html
+++ b/docs/how_to/tutorials/customize_opt.html
@@ -612,16 +612,16 @@ pushing the performance to the limit. The current 
optimization may not be the be
 <p>We can build and deploy the optimized model to the TVM runtime.</p>
 <div class="highlight-Python notranslate"><div 
class="highlight"><pre><span></span><a 
href="../../reference/api/python/relax/relax.html#tvm.relax.VMExecutable" 
title="tvm.relax.VMExecutable" class="sphx-glr-backref-module-tvm-relax 
sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span 
class="n">ex</span></a> <span class="o">=</span> <a 
href="../../reference/api/python/driver.html#tvm.compile" title="tvm.compile" 
class="sphx-glr-backref-module-tvm sphx-glr-backref-type-py-functi [...]
 <span class="n">dev</span> <span class="o">=</span> <span 
class="n">tvm</span><span class="o">.</span><span class="n">device</span><span 
class="p">(</span><span class="s2">&quot;cuda&quot;</span><span 
class="p">,</span> <span class="mi">0</span><span class="p">)</span>
-<span class="n">vm</span> <span class="o">=</span> <a 
href="../../reference/api/python/relax/relax.html#tvm.relax.VirtualMachine" 
title="tvm.relax.VirtualMachine" class="sphx-glr-backref-module-tvm-relax 
sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span 
class="n">relax</span><span class="o">.</span><span 
class="n">VirtualMachine</span></a><span class="p">(</span><a 
href="../../reference/api/python/relax/relax.html#tvm.relax.VMExecutable" 
title="tvm.relax.VMExecutable" class [...]
+<a 
href="../../reference/api/python/runtime/vm.html#tvm.runtime.vm.VirtualMachine" 
title="tvm.runtime.vm.VirtualMachine" 
class="sphx-glr-backref-module-tvm-runtime-vm sphx-glr-backref-type-py-class 
sphx-glr-backref-instance"><span class="n">vm</span></a> <span 
class="o">=</span> <a 
href="../../reference/api/python/runtime/vm.html#tvm.runtime.vm.VirtualMachine" 
title="tvm.runtime.vm.VirtualMachine" 
class="sphx-glr-backref-module-tvm-runtime-vm 
sphx-glr-backref-type-py-class"><span class=" [...]
 <span class="c1"># Need to allocate data and params on GPU device</span>
 <span class="n">data</span> <span class="o">=</span> <span 
class="n">tvm</span><span class="o">.</span><span class="n">runtime</span><span 
class="o">.</span><span class="n">tensor</span><span class="p">(</span><span 
class="n">np</span><span class="o">.</span><span class="n">random</span><span 
class="o">.</span><span class="n">rand</span><span class="p">(</span><span 
class="o">*</span><a 
href="https://docs.python.org/3/library/stdtypes.html#tuple" 
title="builtins.tuple" class="sphx-glr-ba [...]
 <a href="https://docs.python.org/3/library/stdtypes.html#list" 
title="builtins.list" class="sphx-glr-backref-module-builtins 
sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span 
class="n">gpu_params</span></a> <span class="o">=</span> <span 
class="p">[</span><span class="n">tvm</span><span class="o">.</span><span 
class="n">runtime</span><span class="o">.</span><span 
class="n">tensor</span><span class="p">(</span><span class="n">np</span><span 
class="o">.</span><span class="n"> [...]
-<span class="n">gpu_out</span> <span class="o">=</span> <span 
class="n">vm</span><span class="p">[</span><span 
class="s2">&quot;forward&quot;</span><span class="p">](</span><span 
class="n">data</span><span class="p">,</span> <span class="o">*</span><a 
href="https://docs.python.org/3/library/stdtypes.html#list" 
title="builtins.list" class="sphx-glr-backref-module-builtins 
sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span 
class="n">gpu_params</span></a><span class="p">)</span [...]
+<span class="n">gpu_out</span> <span class="o">=</span> <a 
href="../../reference/api/python/runtime/vm.html#tvm.runtime.vm.VirtualMachine" 
title="tvm.runtime.vm.VirtualMachine" 
class="sphx-glr-backref-module-tvm-runtime-vm sphx-glr-backref-type-py-class 
sphx-glr-backref-instance"><span class="n">vm</span></a><span 
class="p">[</span><span class="s2">&quot;forward&quot;</span><span 
class="p">](</span><span class="n">data</span><span class="p">,</span> <span 
class="o">*</span><a href="https [...]
 <span class="nb">print</span><span class="p">(</span><span 
class="n">gpu_out</span><span class="p">)</span>
 </pre></div>
 </div>
-<div class="sphx-glr-script-out highlight-none notranslate"><div 
class="highlight"><pre><span></span>[[25077.996 26146.457 26063.613 23740.695 
24852.195 25097.496 23488.072
-  23641.996 26411.695 24865.14 ]]
+<div class="sphx-glr-script-out highlight-none notranslate"><div 
class="highlight"><pre><span></span>[[26192.203 26975.654 25755.906 25443.984 
27058.223 26124.715 26480.742
+  24814.836 26310.633 25108.781]]
 </pre></div>
 </div>
 </section>
diff --git a/docs/how_to/tutorials/e2e_opt_model.html 
b/docs/how_to/tutorials/e2e_opt_model.html
index 6ace44484fe..912a560dfc0 100644
--- a/docs/how_to/tutorials/e2e_opt_model.html
+++ b/docs/how_to/tutorials/e2e_opt_model.html
@@ -332,8 +332,8 @@ PyTorch.</p>
 <div class="sphx-glr-script-out highlight-none notranslate"><div 
class="highlight"><pre><span></span>Downloading: 
&quot;https://download.pytorch.org/models/resnet18-f37072fd.pth&quot; to 
/workspace/.cache/torch/hub/checkpoints/resnet18-f37072fd.pth
 
   0%|          | 0.00/44.7M [00:00&lt;?, ?B/s]
- 69%|██████▉   | 31.0M/44.7M [00:00&lt;00:00, 324MB/s]
-100%|██████████| 44.7M/44.7M [00:00&lt;00:00, 323MB/s]
+ 60%|█████▉    | 26.6M/44.7M [00:00&lt;00:00, 279MB/s]
+100%|██████████| 44.7M/44.7M [00:00&lt;00:00, 313MB/s]
 </pre></div>
 </div>
 </section>
@@ -434,7 +434,7 @@ We skip this step in the CI environment.</p>
         <span class="n">mod</span> <span class="o">=</span> <span 
class="n">tvm</span><span class="o">.</span><span class="n">s_tir</span><span 
class="o">.</span><span class="n">transform</span><span class="o">.</span><span 
class="n">DefaultGPUSchedule</span><span class="p">()(</span><span 
class="n">mod</span><span class="p">)</span>
     <span class="n">ex</span> <span class="o">=</span> <a 
href="../../reference/api/python/driver.html#tvm.compile" title="tvm.compile" 
class="sphx-glr-backref-module-tvm sphx-glr-backref-type-py-function"><span 
class="n">tvm</span><span class="o">.</span><span 
class="n">compile</span></a><span class="p">(</span><span 
class="n">mod</span><span class="p">,</span> <a 
href="../../reference/api/python/target.html#tvm.target.Target" 
title="tvm.target.Target" class="sphx-glr-backref-module-tvm [...]
     <span class="n">dev</span> <span class="o">=</span> <span 
class="n">tvm</span><span class="o">.</span><span class="n">device</span><span 
class="p">(</span><span class="s2">&quot;cuda&quot;</span><span 
class="p">,</span> <span class="mi">0</span><span class="p">)</span>
-    <span class="n">vm</span> <span class="o">=</span> <a 
href="../../reference/api/python/relax/relax.html#tvm.relax.VirtualMachine" 
title="tvm.relax.VirtualMachine" class="sphx-glr-backref-module-tvm-relax 
sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span 
class="n">relax</span><span class="o">.</span><span 
class="n">VirtualMachine</span></a><span class="p">(</span><span 
class="n">ex</span><span class="p">,</span> <span class="n">dev</span><span 
class="p">)</span>
+    <span class="n">vm</span> <span class="o">=</span> <a 
href="../../reference/api/python/runtime/vm.html#tvm.runtime.vm.VirtualMachine" 
title="tvm.runtime.vm.VirtualMachine" 
class="sphx-glr-backref-module-tvm-runtime-vm 
sphx-glr-backref-type-py-class"><span class="n">relax</span><span 
class="o">.</span><span class="n">VirtualMachine</span></a><span 
class="p">(</span><span class="n">ex</span><span class="p">,</span> <span 
class="n">dev</span><span class="p">)</span>
     <span class="c1"># Need to allocate data and params on GPU device</span>
     <span class="n">gpu_data</span> <span class="o">=</span> <span 
class="n">tvm</span><span class="o">.</span><span class="n">runtime</span><span 
class="o">.</span><span class="n">tensor</span><span class="p">(</span><span 
class="n">np</span><span class="o">.</span><span class="n">random</span><span 
class="o">.</span><span class="n">rand</span><span class="p">(</span><span 
class="mi">1</span><span class="p">,</span> <span class="mi">3</span><span 
class="p">,</span> <span class="mi">224< [...]
     <span class="n">gpu_params</span> <span class="o">=</span> <span 
class="p">[</span><span class="n">tvm</span><span class="o">.</span><span 
class="n">runtime</span><span class="o">.</span><span 
class="n">tensor</span><span class="p">(</span><span class="n">p</span><span 
class="p">,</span> <span class="n">dev</span><span class="p">)</span> <span 
class="k">for</span> <span class="n">p</span> <span class="ow">in</span> <span 
class="n">params</span><span class="p">[</span><span class="s2" [...]
diff --git a/docs/how_to/tutorials/export_and_load_executable.html 
b/docs/how_to/tutorials/export_and_load_executable.html
index 204c46bce3b..27690289d69 100644
--- a/docs/how_to/tutorials/export_and_load_executable.html
+++ b/docs/how_to/tutorials/export_and_load_executable.html
@@ -446,7 +446,7 @@ runtime module directly.</p>
 <div class="highlight-Python notranslate"><div 
class="highlight"><pre><span></span><span class="k">if</span> <a 
href="https://docs.python.org/3/library/functions.html#bool" 
title="builtins.bool" class="sphx-glr-backref-module-builtins 
sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span 
class="n">RUN_EXAMPLE</span></a><span class="p">:</span>
     <span class="n">loaded_rt_mod</span> <span class="o">=</span> <span 
class="n">tvm</span><span class="o">.</span><span class="n">runtime</span><span 
class="o">.</span><span class="n">load_module</span><span 
class="p">(</span><span class="nb">str</span><span class="p">(</span><span 
class="n">library_path</span><span class="p">))</span>
     <span class="n">dev</span> <span class="o">=</span> <span 
class="n">tvm</span><span class="o">.</span><span class="n">cpu</span><span 
class="p">(</span><span class="mi">0</span><span class="p">)</span>
-    <span class="n">vm</span> <span class="o">=</span> <a 
href="../../reference/api/python/relax/relax.html#tvm.relax.VirtualMachine" 
title="tvm.relax.VirtualMachine" class="sphx-glr-backref-module-tvm-relax 
sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span 
class="n">relax</span><span class="o">.</span><span 
class="n">VirtualMachine</span></a><span class="p">(</span><span 
class="n">loaded_rt_mod</span><span class="p">,</span> <span 
class="n">dev</span><span class="p">)</span>
+    <span class="n">vm</span> <span class="o">=</span> <a 
href="../../reference/api/python/runtime/vm.html#tvm.runtime.vm.VirtualMachine" 
title="tvm.runtime.vm.VirtualMachine" 
class="sphx-glr-backref-module-tvm-runtime-vm 
sphx-glr-backref-type-py-class"><span class="n">relax</span><span 
class="o">.</span><span class="n">VirtualMachine</span></a><span 
class="p">(</span><span class="n">loaded_rt_mod</span><span class="p">,</span> 
<span class="n">dev</span><span class="p">)</span>
 
     <span class="c1"># Prepare input data</span>
     <span class="n">input_tensor</span> <span class="o">=</span> <span 
class="n">torch</span><span class="o">.</span><span class="n">randn</span><span 
class="p">(</span><span class="mi">1</span><span class="p">,</span> <span 
class="mi">1</span><span class="p">,</span> <span class="mi">28</span><span 
class="p">,</span> <span class="mi">28</span><span class="p">,</span> <span 
class="n">dtype</span><span class="o">=</span><span class="n">torch</span><span 
class="o">.</span><span class="n">f [...]
@@ -527,7 +527,7 @@ of how to reload and run the model. Save this as <code 
class="docutils literal n
 
 <span class="c1"># Step 2: Create Virtual Machine</span>
 <span class="n">device</span> <span class="o">=</span> <span 
class="n">tvm</span><span class="o">.</span><span class="n">cpu</span><span 
class="p">(</span><span class="mi">0</span><span class="p">)</span>
-<span class="n">vm</span> <span class="o">=</span> <a 
href="../../reference/api/python/relax/relax.html#tvm.relax.VirtualMachine" 
title="tvm.relax.VirtualMachine" class="sphx-glr-backref-module-tvm-relax 
sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span 
class="n">relax</span><span class="o">.</span><span 
class="n">VirtualMachine</span></a><span class="p">(</span><span 
class="n">lib</span><span class="p">,</span> <span class="n">device</span><span 
class="p">)</span>
+<span class="n">vm</span> <span class="o">=</span> <a 
href="../../reference/api/python/runtime/vm.html#tvm.runtime.vm.VirtualMachine" 
title="tvm.runtime.vm.VirtualMachine" 
class="sphx-glr-backref-module-tvm-runtime-vm 
sphx-glr-backref-type-py-class"><span class="n">relax</span><span 
class="o">.</span><span class="n">VirtualMachine</span></a><span 
class="p">(</span><span class="n">lib</span><span class="p">,</span> <span 
class="n">device</span><span class="p">)</span>
 
 <span class="c1"># Step 3: Load parameters from the .npz file</span>
 <span class="n">params_npz</span> <span class="o">=</span> <span 
class="n">np</span><span class="o">.</span><span class="n">load</span><span 
class="p">(</span><span 
class="s2">&quot;relax_export_artifacts/model_params.npz&quot;</span><span 
class="p">)</span>
@@ -562,7 +562,7 @@ To run on GPU instead of CPU, make the following 
changes:</p>
 </li>
 <li><p><strong>Use GPU device in the script</strong>:</p>
 <div class="highlight-python notranslate"><div 
class="highlight"><pre><span></span><span class="n">device</span> <span 
class="o">=</span> <span class="n">tvm</span><span class="o">.</span><span 
class="n">cuda</span><span class="p">(</span><span class="mi">0</span><span 
class="p">)</span>  <span class="c1"># Use CUDA device instead of CPU</span>
-<span class="n">vm</span> <span class="o">=</span> <a 
href="../../reference/api/python/relax/relax.html#tvm.relax.VirtualMachine" 
title="tvm.relax.VirtualMachine" class="sphx-glr-backref-module-tvm-relax 
sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span 
class="n">relax</span><span class="o">.</span><span 
class="n">VirtualMachine</span></a><span class="p">(</span><span 
class="n">lib</span><span class="p">,</span> <span class="n">device</span><span 
class="p">)</span>
+<span class="n">vm</span> <span class="o">=</span> <a 
href="../../reference/api/python/runtime/vm.html#tvm.runtime.vm.VirtualMachine" 
title="tvm.runtime.vm.VirtualMachine" 
class="sphx-glr-backref-module-tvm-runtime-vm 
sphx-glr-backref-type-py-class"><span class="n">relax</span><span 
class="o">.</span><span class="n">VirtualMachine</span></a><span 
class="p">(</span><span class="n">lib</span><span class="p">,</span> <span 
class="n">device</span><span class="p">)</span>
 
 <span class="c1"># Load parameters to GPU</span>
 <span class="n">params</span> <span class="o">=</span> <span 
class="p">[</span><span class="n">tvm</span><span class="o">.</span><span 
class="n">runtime</span><span class="o">.</span><span 
class="n">tensor</span><span class="p">(</span><span 
class="n">params_npz</span><span class="p">[</span><span 
class="sa">f</span><span class="s2">&quot;p_</span><span 
class="si">{</span><span class="n">i</span><span class="si">}</span><span 
class="s2">&quot;</span><span class="p">],</span> <span class= [...]
@@ -627,7 +627,7 @@ for a comprehensive guide on:</p>
 
 <span class="c1"># Step 4: Load and run on remote device</span>
 <span class="n">lib</span> <span class="o">=</span> <span 
class="n">remote</span><span class="o">.</span><span 
class="n">load_module</span><span class="p">(</span><span 
class="s2">&quot;mlp_arm.so&quot;</span><span class="p">)</span>
-<span class="n">vm</span> <span class="o">=</span> <a 
href="../../reference/api/python/relax/relax.html#tvm.relax.VirtualMachine" 
title="tvm.relax.VirtualMachine" class="sphx-glr-backref-module-tvm-relax 
sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span 
class="n">relax</span><span class="o">.</span><span 
class="n">VirtualMachine</span></a><span class="p">(</span><span 
class="n">lib</span><span class="p">,</span> <span class="n">remote</span><span 
class="o">.</span><span cla [...]
+<span class="n">vm</span> <span class="o">=</span> <a 
href="../../reference/api/python/runtime/vm.html#tvm.runtime.vm.VirtualMachine" 
title="tvm.runtime.vm.VirtualMachine" 
class="sphx-glr-backref-module-tvm-runtime-vm 
sphx-glr-backref-type-py-class"><span class="n">relax</span><span 
class="o">.</span><span class="n">VirtualMachine</span></a><span 
class="p">(</span><span class="n">lib</span><span class="p">,</span> <span 
class="n">remote</span><span class="o">.</span><span class="n">cpu</ [...]
 <span class="c1"># ... prepare input and params, then run inference</span>
 </pre></div>
 </div>
diff --git a/docs/how_to/tutorials/import_model.html 
b/docs/how_to/tutorials/import_model.html
index e2c4524a1f2..42b5e3d7cc8 100644
--- a/docs/how_to/tutorials/import_model.html
+++ b/docs/how_to/tutorials/import_model.html
@@ -530,13 +530,13 @@ shown below.</p>
 <div class="highlight-Python notranslate"><div 
class="highlight"><pre><span></span><span class="n">mod_compiled</span> <span 
class="o">=</span> <a 
href="../../reference/api/python/relax/relax.html#tvm.relax.get_pipeline" 
title="tvm.relax.get_pipeline" class="sphx-glr-backref-module-tvm-relax 
sphx-glr-backref-type-py-function"><span class="n">relax</span><span 
class="o">.</span><span class="n">get_pipeline</span></a><span 
class="p">(</span><span class="s2">&quot;zero&quot;</span><span cla [...]
 <a href="../../reference/api/python/relax/relax.html#tvm.relax.VMExecutable" 
title="tvm.relax.VMExecutable" class="sphx-glr-backref-module-tvm-relax 
sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span 
class="n">exec_module</span></a> <span class="o">=</span> <a 
href="../../reference/api/python/driver.html#tvm.compile" title="tvm.compile" 
class="sphx-glr-backref-module-tvm sphx-glr-backref-type-py-function"><span 
class="n">tvm</span><span class="o">.</span><span class="n">comp [...]
 <span class="n">dev</span> <span class="o">=</span> <span 
class="n">tvm</span><span class="o">.</span><span class="n">cpu</span><span 
class="p">()</span>
-<span class="n">vm</span> <span class="o">=</span> <a 
href="../../reference/api/python/relax/relax.html#tvm.relax.VirtualMachine" 
title="tvm.relax.VirtualMachine" class="sphx-glr-backref-module-tvm-relax 
sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span 
class="n">relax</span><span class="o">.</span><span 
class="n">VirtualMachine</span></a><span class="p">(</span><a 
href="../../reference/api/python/relax/relax.html#tvm.relax.VMExecutable" 
title="tvm.relax.VMExecutable" class [...]
+<a 
href="../../reference/api/python/runtime/vm.html#tvm.runtime.vm.VirtualMachine" 
title="tvm.runtime.vm.VirtualMachine" 
class="sphx-glr-backref-module-tvm-runtime-vm sphx-glr-backref-type-py-class 
sphx-glr-backref-instance"><span class="n">vm</span></a> <span 
class="o">=</span> <a 
href="../../reference/api/python/runtime/vm.html#tvm.runtime.vm.VirtualMachine" 
title="tvm.runtime.vm.VirtualMachine" 
class="sphx-glr-backref-module-tvm-runtime-vm 
sphx-glr-backref-type-py-class"><span class=" [...]
 
 <span class="c1"># Run inference</span>
 <span class="n">input_data</span> <span class="o">=</span> <span 
class="n">np</span><span class="o">.</span><span class="n">random</span><span 
class="o">.</span><span class="n">rand</span><span class="p">(</span><span 
class="mi">1</span><span class="p">,</span> <span class="mi">3</span><span 
class="p">,</span> <span class="mi">32</span><span class="p">,</span> <span 
class="mi">32</span><span class="p">)</span><span class="o">.</span><span 
class="n">astype</span><span class="p">(</span><s [...]
 <span class="n">tvm_input</span> <span class="o">=</span> <span 
class="n">tvm</span><span class="o">.</span><span class="n">runtime</span><span 
class="o">.</span><span class="n">tensor</span><span class="p">(</span><span 
class="n">input_data</span><span class="p">,</span> <span 
class="n">dev</span><span class="p">)</span>
 <a href="https://docs.python.org/3/library/stdtypes.html#list" 
title="builtins.list" class="sphx-glr-backref-module-builtins 
sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span 
class="n">tvm_params</span></a> <span class="o">=</span> <span 
class="p">[</span><span class="n">tvm</span><span class="o">.</span><span 
class="n">runtime</span><span class="o">.</span><span 
class="n">tensor</span><span class="p">(</span><span class="n">p</span><span 
class="p">,</span> <span class="n"> [...]
-<span class="n">tvm_out</span> <span class="o">=</span> <span 
class="n">vm</span><span class="p">[</span><span 
class="s2">&quot;main&quot;</span><span class="p">](</span><span 
class="n">tvm_input</span><span class="p">,</span> <span class="o">*</span><a 
href="https://docs.python.org/3/library/stdtypes.html#list" 
title="builtins.list" class="sphx-glr-backref-module-builtins 
sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span 
class="n">tvm_params</span></a><span class="p">)</sp [...]
+<span class="n">tvm_out</span> <span class="o">=</span> <a 
href="../../reference/api/python/runtime/vm.html#tvm.runtime.vm.VirtualMachine" 
title="tvm.runtime.vm.VirtualMachine" 
class="sphx-glr-backref-module-tvm-runtime-vm sphx-glr-backref-type-py-class 
sphx-glr-backref-instance"><span class="n">vm</span></a><span 
class="p">[</span><span class="s2">&quot;main&quot;</span><span 
class="p">](</span><span class="n">tvm_input</span><span class="p">,</span> 
<span class="o">*</span><a href="htt [...]
 
 <span class="c1"># Compare with PyTorch</span>
 <span class="k">with</span> <span class="n">torch</span><span 
class="o">.</span><span class="n">no_grad</span><span class="p">():</span>
diff --git a/docs/how_to/tutorials/optimize_llm.html 
b/docs/how_to/tutorials/optimize_llm.html
index f958b164a2a..82f9bfc8181 100644
--- a/docs/how_to/tutorials/optimize_llm.html
+++ b/docs/how_to/tutorials/optimize_llm.html
@@ -729,7 +729,7 @@ is designed specifically for the LLMs.</p>
 
 <span class="k">with</span> <a 
href="../../reference/api/python/target.html#tvm.target.Target" 
title="tvm.target.Target" class="sphx-glr-backref-module-tvm-target 
sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span 
class="n">target</span></a><span class="p">:</span>
     <a 
href="../../reference/api/python/relax/relax.html#tvm.relax.VMExecutable" 
title="tvm.relax.VMExecutable" class="sphx-glr-backref-module-tvm-relax 
sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span 
class="n">ex</span></a> <span class="o">=</span> <a 
href="../../reference/api/python/driver.html#tvm.compile" title="tvm.compile" 
class="sphx-glr-backref-module-tvm sphx-glr-backref-type-py-function"><span 
class="n">tvm</span><span class="o">.</span><span class="n">compile</ [...]
-    <span class="n">vm</span> <span class="o">=</span> <a 
href="../../reference/api/python/relax/relax.html#tvm.relax.VirtualMachine" 
title="tvm.relax.VirtualMachine" class="sphx-glr-backref-module-tvm-relax 
sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span 
class="n">relax</span><span class="o">.</span><span 
class="n">VirtualMachine</span></a><span class="p">(</span><a 
href="../../reference/api/python/relax/relax.html#tvm.relax.VMExecutable" 
title="tvm.relax.VMExecutable" c [...]
+    <a 
href="../../reference/api/python/runtime/vm.html#tvm.runtime.vm.VirtualMachine" 
title="tvm.runtime.vm.VirtualMachine" 
class="sphx-glr-backref-module-tvm-runtime-vm sphx-glr-backref-type-py-class 
sphx-glr-backref-instance"><span class="n">vm</span></a> <span 
class="o">=</span> <a 
href="../../reference/api/python/runtime/vm.html#tvm.runtime.vm.VirtualMachine" 
title="tvm.runtime.vm.VirtualMachine" 
class="sphx-glr-backref-module-tvm-runtime-vm 
sphx-glr-backref-type-py-class"><span cla [...]
 </pre></div>
 </div>
 </section>
@@ -827,7 +827,7 @@ the model documentation for the correct tokenization and 
prompt format.</p>
 key and value tensors for the attention layer. Apache TVM provides a 
PagedKVCache to store the
 key and value tensors. We create the PagedKVCache with the specified 
parameters.</p>
 <div class="highlight-Python notranslate"><div 
class="highlight"><pre><span></span><span class="k">if</span> <span 
class="ow">not</span> <a 
href="https://docs.python.org/3/library/functions.html#bool" 
title="builtins.bool" class="sphx-glr-backref-module-builtins 
sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span 
class="n">IS_IN_CI</span></a><span class="p">:</span>
-    <span class="n">kv_cache</span> <span class="o">=</span> <span 
class="n">vm</span><span class="p">[</span><span 
class="s2">&quot;create_tir_paged_kv_cache&quot;</span><span class="p">](</span>
+    <span class="n">kv_cache</span> <span class="o">=</span> <a 
href="../../reference/api/python/runtime/vm.html#tvm.runtime.vm.VirtualMachine" 
title="tvm.runtime.vm.VirtualMachine" 
class="sphx-glr-backref-module-tvm-runtime-vm sphx-glr-backref-type-py-class 
sphx-glr-backref-instance"><span class="n">vm</span></a><span 
class="p">[</span><span 
class="s2">&quot;create_tir_paged_kv_cache&quot;</span><span class="p">](</span>
         <a href="https://docs.python.org/3/library/stdtypes.html#tuple" 
title="builtins.tuple" class="sphx-glr-backref-module-builtins 
sphx-glr-backref-type-py-class"><span class="n">Shape</span></a><span 
class="p">([</span><span class="mi">1</span><span class="p">]),</span>  <span 
class="c1"># max_batch_size=1</span>
         <a href="https://docs.python.org/3/library/stdtypes.html#tuple" 
title="builtins.tuple" class="sphx-glr-backref-module-builtins 
sphx-glr-backref-type-py-class"><span class="n">Shape</span></a><span 
class="p">([</span><span class="mi">2048</span><span class="p">]),</span>  
<span class="c1"># max_total_seq_len=2048</span>
         <a href="https://docs.python.org/3/library/stdtypes.html#tuple" 
title="builtins.tuple" class="sphx-glr-backref-module-builtins 
sphx-glr-backref-type-py-class"><span class="n">Shape</span></a><span 
class="p">([</span><span class="mi">2048</span><span class="p">]),</span>  
<span class="c1"># prefill_chunk_size=2048</span>
@@ -844,7 +844,7 @@ compiled in the Relax IRModule to embed the tokens into the 
hidden states.</p>
 
 
 <span class="k">def</span><span class="w"> </span><span 
class="nf">embed</span><span class="p">(</span><span 
class="n">tokens</span><span class="p">,</span> <span 
class="n">params</span><span class="p">):</span>
-    <span class="n">_embed</span> <span class="o">=</span> <span 
class="n">vm</span><span class="p">[</span><span 
class="s2">&quot;embed&quot;</span><span class="p">](</span><span 
class="n">tokens</span><span class="p">,</span> <span 
class="n">params</span><span class="p">)</span>
+    <span class="n">_embed</span> <span class="o">=</span> <a 
href="../../reference/api/python/runtime/vm.html#tvm.runtime.vm.VirtualMachine" 
title="tvm.runtime.vm.VirtualMachine" 
class="sphx-glr-backref-module-tvm-runtime-vm sphx-glr-backref-type-py-class 
sphx-glr-backref-instance"><span class="n">vm</span></a><span 
class="p">[</span><span class="s2">&quot;embed&quot;</span><span 
class="p">](</span><span class="n">tokens</span><span class="p">,</span> <span 
class="n">params</span><span  [...]
     <span class="c1"># Reshape hidden from [seq_len, hidden_size] to [1, 
seq_len, hidden_size]</span>
     <span class="n">_embed</span> <span class="o">=</span> <span 
class="n">nd_view_func</span><span class="p">(</span><span 
class="n">_embed</span><span class="p">,</span> <a 
href="https://docs.python.org/3/library/stdtypes.html#tuple" 
title="builtins.tuple" class="sphx-glr-backref-module-builtins 
sphx-glr-backref-type-py-class"><span class="n">Shape</span></a><span 
class="p">([</span><span class="mi">1</span><span class="p">,</span> <span 
class="n">_embed</span><span class="o">.</span>< [...]
     <span class="k">return</span> <span class="n">_embed</span>
@@ -867,7 +867,7 @@ and <cite>end_forward_func</cite> to end the forward 
pass.</p>
     <span class="n">add_sequence_func</span><span class="p">(</span><span 
class="n">kv_cache</span><span class="p">,</span> <span 
class="n">seq_id</span><span class="p">)</span>
     <span class="n">hidden_states</span> <span class="o">=</span> <span 
class="n">embed</span><span class="p">(</span><span 
class="n">tokens</span><span class="p">,</span> <span 
class="n">params</span><span class="p">)</span>
     <span class="n">begin_forward_func</span><span class="p">(</span><span 
class="n">kv_cache</span><span class="p">,</span> <a 
href="https://docs.python.org/3/library/stdtypes.html#tuple" 
title="builtins.tuple" class="sphx-glr-backref-module-builtins 
sphx-glr-backref-type-py-class"><span class="n">Shape</span></a><span 
class="p">([</span><span class="n">seq_id</span><span class="p">]),</span> <a 
href="https://docs.python.org/3/library/stdtypes.html#tuple" 
title="builtins.tuple" class="s [...]
-    <span class="n">logits</span><span class="p">,</span> <span 
class="n">kv_cache</span> <span class="o">=</span> <span 
class="n">vm</span><span class="p">[</span><span 
class="s2">&quot;prefill&quot;</span><span class="p">](</span><span 
class="n">hidden_states</span><span class="p">,</span> <span 
class="n">kv_cache</span><span class="p">,</span> <span 
class="n">params</span><span class="p">)</span>
+    <span class="n">logits</span><span class="p">,</span> <span 
class="n">kv_cache</span> <span class="o">=</span> <a 
href="../../reference/api/python/runtime/vm.html#tvm.runtime.vm.VirtualMachine" 
title="tvm.runtime.vm.VirtualMachine" 
class="sphx-glr-backref-module-tvm-runtime-vm sphx-glr-backref-type-py-class 
sphx-glr-backref-instance"><span class="n">vm</span></a><span 
class="p">[</span><span class="s2">&quot;prefill&quot;</span><span 
class="p">](</span><span class="n">hidden_states</ [...]
     <span class="n">end_forward_func</span><span class="p">(</span><span 
class="n">kv_cache</span><span class="p">)</span>
 </pre></div>
 </div>
@@ -899,7 +899,7 @@ IRModule to generate the token.</p>
         <span class="n">tokens</span> <span class="o">=</span> <span 
class="n">tvm</span><span class="o">.</span><span class="n">runtime</span><span 
class="o">.</span><span class="n">tensor</span><span class="p">(</span><span 
class="n">np</span><span class="o">.</span><span class="n">array</span><span 
class="p">([</span><span class="n">last_token</span><span 
class="p">])</span><span class="o">.</span><span class="n">astype</span><span 
class="p">(</span><span class="s2">&quot;int32&quot;< [...]
         <span class="n">hidden_states</span> <span class="o">=</span> <span 
class="n">embed</span><span class="p">(</span><span 
class="n">tokens</span><span class="p">,</span> <span 
class="n">params</span><span class="p">)</span>
         <span class="n">begin_forward_func</span><span class="p">(</span><span 
class="n">kv_cache</span><span class="p">,</span> <a 
href="https://docs.python.org/3/library/stdtypes.html#tuple" 
title="builtins.tuple" class="sphx-glr-backref-module-builtins 
sphx-glr-backref-type-py-class"><span class="n">Shape</span></a><span 
class="p">([</span><span class="n">seq_id</span><span class="p">]),</span> <a 
href="https://docs.python.org/3/library/stdtypes.html#tuple" 
title="builtins.tuple" clas [...]
-        <span class="n">logits</span><span class="p">,</span> <span 
class="n">kv_cache</span> <span class="o">=</span> <span 
class="n">vm</span><span class="p">[</span><span 
class="s2">&quot;decode&quot;</span><span class="p">](</span><span 
class="n">hidden_states</span><span class="p">,</span> <span 
class="n">kv_cache</span><span class="p">,</span> <span 
class="n">params</span><span class="p">)</span>
+        <span class="n">logits</span><span class="p">,</span> <span 
class="n">kv_cache</span> <span class="o">=</span> <a 
href="../../reference/api/python/runtime/vm.html#tvm.runtime.vm.VirtualMachine" 
title="tvm.runtime.vm.VirtualMachine" 
class="sphx-glr-backref-module-tvm-runtime-vm sphx-glr-backref-type-py-class 
sphx-glr-backref-instance"><span class="n">vm</span></a><span 
class="p">[</span><span class="s2">&quot;decode&quot;</span><span 
class="p">](</span><span class="n">hidden_state [...]
 
         <span class="n">end_forward_func</span><span class="p">(</span><span 
class="n">kv_cache</span><span class="p">)</span>
         <span class="n">last_token</span> <span class="o">=</span> <span 
class="n">sample_token</span><span class="p">(</span><span 
class="n">logits</span><span class="p">)</span>
diff --git a/docs/how_to/tutorials/sg_execution_times.html 
b/docs/how_to/tutorials/sg_execution_times.html
index 0ea6aab9e59..0947d72e7be 100644
--- a/docs/how_to/tutorials/sg_execution_times.html
+++ b/docs/how_to/tutorials/sg_execution_times.html
@@ -297,7 +297,7 @@
             
   <section id="computation-times">
 <span id="sphx-glr-how-to-tutorials-sg-execution-times"></span><h1>Computation 
times<a class="headerlink" href="#computation-times" title="Link to this 
heading"></a></h1>
-<p><strong>00:15.095</strong> total execution time for 8 files <strong>from 
how_to/tutorials</strong>:</p>
+<p><strong>00:15.246</strong> total execution time for 8 files <strong>from 
how_to/tutorials</strong>:</p>
 <div class="docutils container">
 <style scoped>
 <link 
href="https://cdnjs.cloudflare.com/ajax/libs/twitter-bootstrap/5.3.0/css/bootstrap.min.css"
 rel="stylesheet" />
@@ -319,31 +319,31 @@ $(document).ready( function () {
 </thead>
 <tbody>
 <tr class="row-even"><td><p><a class="reference internal" 
href="optimize_llm.html#sphx-glr-how-to-tutorials-optimize-llm-py"><span 
class="std std-ref">Optimize Large Language Model</span></a> (<code 
class="docutils literal notranslate"><span 
class="pre">optimize_llm.py</span></code>)</p></td>
-<td><p>00:10.057</p></td>
+<td><p>00:10.085</p></td>
 <td><p>0.0</p></td>
 </tr>
 <tr class="row-odd"><td><p><a class="reference internal" 
href="import_model.html#sphx-glr-how-to-tutorials-import-model-py"><span 
class="std std-ref">Importing Models from ML Frameworks</span></a> (<code 
class="docutils literal notranslate"><span 
class="pre">import_model.py</span></code>)</p></td>
-<td><p>00:03.313</p></td>
+<td><p>00:03.352</p></td>
 <td><p>0.0</p></td>
 </tr>
 <tr class="row-even"><td><p><a class="reference internal" 
href="e2e_opt_model.html#sphx-glr-how-to-tutorials-e2e-opt-model-py"><span 
class="std std-ref">End-to-End Optimize Model</span></a> (<code class="docutils 
literal notranslate"><span class="pre">e2e_opt_model.py</span></code>)</p></td>
-<td><p>00:00.570</p></td>
+<td><p>00:00.603</p></td>
 <td><p>0.0</p></td>
 </tr>
 <tr class="row-odd"><td><p><a class="reference internal" 
href="customize_opt.html#sphx-glr-how-to-tutorials-customize-opt-py"><span 
class="std std-ref">Customize Optimization</span></a> (<code class="docutils 
literal notranslate"><span class="pre">customize_opt.py</span></code>)</p></td>
-<td><p>00:00.536</p></td>
+<td><p>00:00.585</p></td>
 <td><p>0.0</p></td>
 </tr>
 <tr class="row-even"><td><p><a class="reference internal" 
href="cross_compilation_and_rpc.html#sphx-glr-how-to-tutorials-cross-compilation-and-rpc-py"><span
 class="std std-ref">Cross Compilation and RPC</span></a> (<code 
class="docutils literal notranslate"><span 
class="pre">cross_compilation_and_rpc.py</span></code>)</p></td>
-<td><p>00:00.475</p></td>
+<td><p>00:00.474</p></td>
 <td><p>0.0</p></td>
 </tr>
 <tr class="row-odd"><td><p><a class="reference internal" 
href="bring_your_own_codegen.html#sphx-glr-how-to-tutorials-bring-your-own-codegen-py"><span
 class="std std-ref">Bring Your Own Codegen: NPU Backend Example</span></a> 
(<code class="docutils literal notranslate"><span 
class="pre">bring_your_own_codegen.py</span></code>)</p></td>
-<td><p>00:00.138</p></td>
+<td><p>00:00.139</p></td>
 <td><p>0.0</p></td>
 </tr>
 <tr class="row-even"><td><p><a class="reference internal" 
href="mix_python_and_tvm_with_pymodule.html#sphx-glr-how-to-tutorials-mix-python-and-tvm-with-pymodule-py"><span
 class="std std-ref">Mix Python/PyTorch with TVM Using BasePyModule</span></a> 
(<code class="docutils literal notranslate"><span 
class="pre">mix_python_and_tvm_with_pymodule.py</span></code>)</p></td>
-<td><p>00:00.004</p></td>
+<td><p>00:00.005</p></td>
 <td><p>0.0</p></td>
 </tr>
 <tr class="row-odd"><td><p><a class="reference internal" 
href="export_and_load_executable.html#sphx-glr-how-to-tutorials-export-and-load-executable-py"><span
 class="std std-ref">Export and Load Relax Executables</span></a> (<code 
class="docutils literal notranslate"><span 
class="pre">export_and_load_executable.py</span></code>)</p></td>
diff --git a/docs/objects.inv b/docs/objects.inv
index 0064a262146..ac69ad0f869 100644
Binary files a/docs/objects.inv and b/docs/objects.inv differ
diff --git a/docs/reference/api/python/relax/training.html 
b/docs/reference/api/python/relax/training.html
index fdc8525d243..07ea14a12ad 100644
--- a/docs/reference/api/python/relax/training.html
+++ b/docs/reference/api/python/relax/training.html
@@ -395,7 +395,7 @@ relax.transform.AppendLoss.</p></li>
 
 <dl class="py class">
 <dt class="sig sig-object py" id="tvm.relax.training.Trainer">
-<em class="property"><span class="pre">class</span><span class="w"> 
</span></em><span class="sig-prename descclassname"><span 
class="pre">tvm.relax.training.</span></span><span class="sig-name 
descname"><span class="pre">Trainer</span></span><span 
class="sig-paren">(</span><em class="sig-param"><span class="n"><span 
class="pre">train_mod</span></span><span class="p"><span 
class="pre">:</span></span><span class="w"> </span><span class="n"><a 
class="reference internal" href="../ir.html#tvm [...]
+<em class="property"><span class="pre">class</span><span class="w"> 
</span></em><span class="sig-prename descclassname"><span 
class="pre">tvm.relax.training.</span></span><span class="sig-name 
descname"><span class="pre">Trainer</span></span><span 
class="sig-paren">(</span><em class="sig-param"><span class="n"><span 
class="pre">train_mod</span></span><span class="p"><span 
class="pre">:</span></span><span class="w"> </span><span class="n"><a 
class="reference internal" href="../ir.html#tvm [...]
 <dd><p>Unified wrapper for relax training. It accepts the IRModule (that is 
the result of
 SetupTrainer) and the relax VM (that contains the built result of the 
IRModule), and helps run
 the VM. It maintains the parameters, the model states and the optimizer states 
internally.</p>
diff --git a/docs/reference/api/python/runtime/vm.html 
b/docs/reference/api/python/runtime/vm.html
index 350bd6f2551..7721b0a8b6a 100644
--- a/docs/reference/api/python/runtime/vm.html
+++ b/docs/reference/api/python/runtime/vm.html
@@ -509,7 +509,7 @@ more details.</p>
 <div class="admonition seealso">
 <p class="admonition-title">See also</p>
 <dl class="simple">
-<dt><a class="reference internal" 
href="../relax/relax.html#tvm.relax.VMInstrumentReturnKind" 
title="tvm.runtime.vm.VMInstrumentReturnKind"><code class="xref py py-obj 
docutils literal notranslate"><span 
class="pre">VMInstrumentReturnKind</span></code></a></dt><dd><p>the possible 
return values in VM.</p>
+<dt><a class="reference internal" 
href="#tvm.runtime.vm.VMInstrumentReturnKind" 
title="tvm.runtime.vm.VMInstrumentReturnKind"><code class="xref py py-obj 
docutils literal notranslate"><span 
class="pre">VMInstrumentReturnKind</span></code></a></dt><dd><p>the possible 
return values in VM.</p>
 </dd>
 </dl>
 </div>
diff --git a/docs/searchindex.js b/docs/searchindex.js
index ce6b6ce50a4..efb505fb1d2 100644
--- a/docs/searchindex.js
+++ b/docs/searchindex.js
@@ -1 +1 @@
-Search.setIndex({"alltitles": {"1. Cross Compile TVM Runtime": [[48, 
"cross-compile-tvm-runtime"]], "1. The lack of numpy on device machine caused 
the RPC server can\u2019t be launched.": [[48, 
"the-lack-of-numpy-on-device-machine-caused-the-rpc-server-can-t-be-launched"]],
 "2. Pack and Deploy to Device Machine": [[48, 
"pack-and-deploy-to-device-machine"]], "2. The lack of cloudpickle on device 
machine caused the RPC server can\u2019t be launched.": [[48, 
"the-lack-of-cloudpickle-on-devi [...]
\ No newline at end of file
+Search.setIndex({"alltitles": {"1. Cross Compile TVM Runtime": [[48, 
"cross-compile-tvm-runtime"]], "1. The lack of numpy on device machine caused 
the RPC server can\u2019t be launched.": [[48, 
"the-lack-of-numpy-on-device-machine-caused-the-rpc-server-can-t-be-launched"]],
 "2. Pack and Deploy to Device Machine": [[48, 
"pack-and-deploy-to-device-machine"]], "2. The lack of cloudpickle on device 
machine caused the RPC server can\u2019t be launched.": [[48, 
"the-lack-of-cloudpickle-on-devi [...]
\ No newline at end of file
diff --git a/docs/sg_execution_times.html b/docs/sg_execution_times.html
index 44db239efca..28d5773f645 100644
--- a/docs/sg_execution_times.html
+++ b/docs/sg_execution_times.html
@@ -297,7 +297,7 @@
             
   <section id="computation-times">
 <span id="sphx-glr-sg-execution-times"></span><h1>Computation times<a 
class="headerlink" href="#computation-times" title="Link to this 
heading"></a></h1>
-<p><strong>00:22.757</strong> total execution time for 16 files <strong>from 
all galleries</strong>:</p>
+<p><strong>00:23.594</strong> total execution time for 16 files <strong>from 
all galleries</strong>:</p>
 <div class="docutils container">
 <style scoped>
 <link 
href="https://cdnjs.cloudflare.com/ajax/libs/twitter-bootstrap/5.3.0/css/bootstrap.min.css"
 rel="stylesheet" />
@@ -319,55 +319,55 @@ $(document).ready( function () {
 </thead>
 <tbody>
 <tr class="row-even"><td><p><a class="reference internal" 
href="how_to/tutorials/optimize_llm.html#sphx-glr-how-to-tutorials-optimize-llm-py"><span
 class="std std-ref">Optimize Large Language Model</span></a> (<code 
class="docutils literal notranslate"><span 
class="pre">../how_to/tutorials/optimize_llm.py</span></code>)</p></td>
-<td><p>00:10.057</p></td>
+<td><p>00:10.085</p></td>
 <td><p>0.0</p></td>
 </tr>
 <tr class="row-odd"><td><p><a class="reference internal" 
href="get_started/tutorials/ir_module.html#sphx-glr-get-started-tutorials-ir-module-py"><span
 class="std std-ref">IRModule</span></a> (<code class="docutils literal 
notranslate"><span 
class="pre">../get_started/tutorials/ir_module.py</span></code>)</p></td>
-<td><p>00:06.762</p></td>
+<td><p>00:07.439</p></td>
 <td><p>0.0</p></td>
 </tr>
 <tr class="row-even"><td><p><a class="reference internal" 
href="how_to/tutorials/import_model.html#sphx-glr-how-to-tutorials-import-model-py"><span
 class="std std-ref">Importing Models from ML Frameworks</span></a> (<code 
class="docutils literal notranslate"><span 
class="pre">../how_to/tutorials/import_model.py</span></code>)</p></td>
-<td><p>00:03.313</p></td>
+<td><p>00:03.352</p></td>
 <td><p>0.0</p></td>
 </tr>
 <tr class="row-odd"><td><p><a class="reference internal" 
href="how_to/tutorials/e2e_opt_model.html#sphx-glr-how-to-tutorials-e2e-opt-model-py"><span
 class="std std-ref">End-to-End Optimize Model</span></a> (<code 
class="docutils literal notranslate"><span 
class="pre">../how_to/tutorials/e2e_opt_model.py</span></code>)</p></td>
-<td><p>00:00.570</p></td>
+<td><p>00:00.603</p></td>
 <td><p>0.0</p></td>
 </tr>
 <tr class="row-even"><td><p><a class="reference internal" 
href="how_to/tutorials/customize_opt.html#sphx-glr-how-to-tutorials-customize-opt-py"><span
 class="std std-ref">Customize Optimization</span></a> (<code class="docutils 
literal notranslate"><span 
class="pre">../how_to/tutorials/customize_opt.py</span></code>)</p></td>
-<td><p>00:00.536</p></td>
+<td><p>00:00.585</p></td>
 <td><p>0.0</p></td>
 </tr>
 <tr class="row-odd"><td><p><a class="reference internal" 
href="how_to/tutorials/cross_compilation_and_rpc.html#sphx-glr-how-to-tutorials-cross-compilation-and-rpc-py"><span
 class="std std-ref">Cross Compilation and RPC</span></a> (<code 
class="docutils literal notranslate"><span 
class="pre">../how_to/tutorials/cross_compilation_and_rpc.py</span></code>)</p></td>
-<td><p>00:00.475</p></td>
+<td><p>00:00.474</p></td>
 <td><p>0.0</p></td>
 </tr>
 <tr class="row-even"><td><p><a class="reference internal" 
href="deep_dive/tensor_ir/tutorials/tir_transformation.html#sphx-glr-deep-dive-tensor-ir-tutorials-tir-transformation-py"><span
 class="std std-ref">Transformation</span></a> (<code class="docutils literal 
notranslate"><span 
class="pre">../deep_dive/tensor_ir/tutorials/tir_transformation.py</span></code>)</p></td>
-<td><p>00:00.269</p></td>
+<td><p>00:00.272</p></td>
 <td><p>0.0</p></td>
 </tr>
 <tr class="row-odd"><td><p><a class="reference internal" 
href="deep_dive/tensor_ir/tutorials/tir_creation.html#sphx-glr-deep-dive-tensor-ir-tutorials-tir-creation-py"><span
 class="std std-ref">TensorIR Creation</span></a> (<code class="docutils 
literal notranslate"><span 
class="pre">../deep_dive/tensor_ir/tutorials/tir_creation.py</span></code>)</p></td>
-<td><p>00:00.182</p></td>
+<td><p>00:00.186</p></td>
 <td><p>0.0</p></td>
 </tr>
 <tr class="row-even"><td><p><a class="reference internal" 
href="get_started/tutorials/quick_start.html#sphx-glr-get-started-tutorials-quick-start-py"><span
 class="std std-ref">Quick Start</span></a> (<code class="docutils literal 
notranslate"><span 
class="pre">../get_started/tutorials/quick_start.py</span></code>)</p></td>
-<td><p>00:00.158</p></td>
+<td><p>00:00.154</p></td>
 <td><p>0.0</p></td>
 </tr>
 <tr class="row-odd"><td><p><a class="reference internal" 
href="how_to/tutorials/bring_your_own_codegen.html#sphx-glr-how-to-tutorials-bring-your-own-codegen-py"><span
 class="std std-ref">Bring Your Own Codegen: NPU Backend Example</span></a> 
(<code class="docutils literal notranslate"><span 
class="pre">../how_to/tutorials/bring_your_own_codegen.py</span></code>)</p></td>
-<td><p>00:00.138</p></td>
+<td><p>00:00.139</p></td>
 <td><p>0.0</p></td>
 </tr>
-<tr class="row-even"><td><p><a class="reference internal" 
href="deep_dive/tensor_ir/tutorials/dlight_gpu_scheduling.html#sphx-glr-deep-dive-tensor-ir-tutorials-dlight-gpu-scheduling-py"><span
 class="std std-ref">DLight: Rule-Based GPU Scheduling</span></a> (<code 
class="docutils literal notranslate"><span 
class="pre">../deep_dive/tensor_ir/tutorials/dlight_gpu_scheduling.py</span></code>)</p></td>
-<td><p>00:00.116</p></td>
+<tr class="row-even"><td><p><a class="reference internal" 
href="deep_dive/relax/tutorials/relax_creation.html#sphx-glr-deep-dive-relax-tutorials-relax-creation-py"><span
 class="std std-ref">Relax Creation</span></a> (<code class="docutils literal 
notranslate"><span 
class="pre">../deep_dive/relax/tutorials/relax_creation.py</span></code>)</p></td>
+<td><p>00:00.118</p></td>
 <td><p>0.0</p></td>
 </tr>
-<tr class="row-odd"><td><p><a class="reference internal" 
href="deep_dive/relax/tutorials/relax_creation.html#sphx-glr-deep-dive-relax-tutorials-relax-creation-py"><span
 class="std std-ref">Relax Creation</span></a> (<code class="docutils literal 
notranslate"><span 
class="pre">../deep_dive/relax/tutorials/relax_creation.py</span></code>)</p></td>
-<td><p>00:00.115</p></td>
+<tr class="row-odd"><td><p><a class="reference internal" 
href="deep_dive/tensor_ir/tutorials/dlight_gpu_scheduling.html#sphx-glr-deep-dive-tensor-ir-tutorials-dlight-gpu-scheduling-py"><span
 class="std std-ref">DLight: Rule-Based GPU Scheduling</span></a> (<code 
class="docutils literal notranslate"><span 
class="pre">../deep_dive/tensor_ir/tutorials/dlight_gpu_scheduling.py</span></code>)</p></td>
+<td><p>00:00.117</p></td>
 <td><p>0.0</p></td>
 </tr>
 <tr class="row-even"><td><p><a class="reference internal" 
href="deep_dive/relax/tutorials/relax_transformation.html#sphx-glr-deep-dive-relax-tutorials-relax-transformation-py"><span
 class="std std-ref">Transformation</span></a> (<code class="docutils literal 
notranslate"><span 
class="pre">../deep_dive/relax/tutorials/relax_transformation.py</span></code>)</p></td>
-<td><p>00:00.053</p></td>
+<td><p>00:00.055</p></td>
 <td><p>0.0</p></td>
 </tr>
 <tr class="row-odd"><td><p><a class="reference internal" 
href="deep_dive/tensor_ir/tutorials/meta_schedule.html#sphx-glr-deep-dive-tensor-ir-tutorials-meta-schedule-py"><span
 class="std std-ref">MetaSchedule: Search-Based Auto-Tuning</span></a> (<code 
class="docutils literal notranslate"><span 
class="pre">../deep_dive/tensor_ir/tutorials/meta_schedule.py</span></code>)</p></td>
@@ -375,7 +375,7 @@ $(document).ready( function () {
 <td><p>0.0</p></td>
 </tr>
 <tr class="row-even"><td><p><a class="reference internal" 
href="how_to/tutorials/mix_python_and_tvm_with_pymodule.html#sphx-glr-how-to-tutorials-mix-python-and-tvm-with-pymodule-py"><span
 class="std std-ref">Mix Python/PyTorch with TVM Using BasePyModule</span></a> 
(<code class="docutils literal notranslate"><span 
class="pre">../how_to/tutorials/mix_python_and_tvm_with_pymodule.py</span></code>)</p></td>
-<td><p>00:00.004</p></td>
+<td><p>00:00.005</p></td>
 <td><p>0.0</p></td>
 </tr>
 <tr class="row-odd"><td><p><a class="reference internal" 
href="how_to/tutorials/export_and_load_executable.html#sphx-glr-how-to-tutorials-export-and-load-executable-py"><span
 class="std std-ref">Export and Load Relax Executables</span></a> (<code 
class="docutils literal notranslate"><span 
class="pre">../how_to/tutorials/export_and_load_executable.py</span></code>)</p></td>
