# Introduction

The TVM community has worked since the v0.11.1 release to deliver the following exciting new improvements! The main tags are listed below (**bold** marks areas with a lot of progress):
- Community, RFC;
- Runtime: ACL(ArmComputeLibrary), Adreno, OpenCL & CLML, ROCm, CUDA & CUTLASS & TensorRT, Ethosn, CRT, Hexagon, Metal, Web & WASM, others about runtime;
- Frontend: TensorFlow/tflite, Pytorch/Torch, Paddle, OneFlow, keras;
- TE, Relay, BYOC, TOPI, Arith, **TIR, TVMScript, MetaSchedule**, Schedule;
- CI, Tests, BugFix, Docs, Docker, Build;
- Android, **microTVM**, Target, AutoTVM, AOT, LLVM.

Please visit the full listing of commits for a complete view: [v0.11.1...v0.12.0](https://github.com/apache/tvm/compare/v0.11.1...v0.12.0).

# Community

- Reviewer
  - [Cheng Wen](https://github.com/apache/tvm/pull/14153)
  - [blackkker](https://github.com/apache/tvm/pull/13686)
  - [Min Chen](https://github.com/apache/tvm/pull/13628)
  - [janetsc](https://github.com/apache/tvm/pull/14359)
  - [mkatanbaf](https://github.com/apache/tvm/pull/14085)
  - [alanmacd](https://github.com/apache/tvm/pull/13814)
- Committer
  - [Yaxing Cai](https://github.com/apache/tvm/pull/13787)
  - [Hongyi Jin](https://github.com/apache/tvm/pull/13784)
- PMC
  - [Wrongtest](https://github.com/apache/tvm/pull/13893)

# RFC

* [[RFC] Introduce PresburgerSet (#99)](https://github.com/apache/tvm-rfcs/blob/main/rfcs/0099-introduce-PresburgerSet.md) ([`e17994b`](https://github.com/apache/tvm-rfcs/commit/e17994b90d9278a280d019f3c8ad9065c9a3f584))
* [[RFC] Further Unify Packed and Object in TVM Runtime (#97)](https://github.com/apache/tvm-rfcs/blob/main/rfcs/0097-unify-packed-and-object.md) ([`d646a22`](https://github.com/apache/tvm-rfcs/commit/d646a22eb00b8138573cb856edb16a7b05906e1e))

----

# Runtime

## ArmComputeLibrary

- [[ACL][TESTING] Use pytest.mark.parametrize in ACL conv2d tests](https://github.com/apache/tvm/pull/14011)
- [[ACL] Prevent offloading of per-channel quantized operators](https://github.com/apache/tvm/pull/14484)
- [[CL] Update Compute Library from v22.11 to v23.02.1](https://github.com/apache/tvm/pull/14426)

## Adreno

- [[Adreno] Extend pack_filter for HWIO
  layout](https://github.com/apache/tvm/pull/13939)
- [[Adreno] Update interface of AnnotateMemoryScope pass](https://github.com/apache/tvm/pull/13779)
- [[Adreno] Optimize reduction schedule](https://github.com/apache/tvm/pull/13781)
- [[BENCHMARK][ADRENO] Adreno Benchmarks with texture](https://github.com/apache/tvm/pull/13675)
- [[BENCHMARKS][CLML] Adreno benchmarks with CLML BYOC path added](https://github.com/apache/tvm/pull/13696)
- [[BENCHMARKS][ADRENO] Documentation for Adreno (Texture) benchmarks](https://github.com/apache/tvm/pull/13679)
- [[DOCS][ADRENO] Improved Adreno documentation](https://github.com/apache/tvm/pull/13867)

## OpenCL & CLML

- OpenCL
  - [[OpenCL][Textures] Always use SSA for texture loading](https://github.com/apache/tvm/pull/14397)
  - [[OpenCL] Refactor OpenCL init function](https://github.com/apache/tvm/pull/13919)
  - [[OpenCL] Implement save/load pre-compiled programs](https://github.com/apache/tvm/pull/13868)
  - [[CMake][OpenCL] Remove warning for OpenCL wrapper](https://github.com/apache/tvm/pull/13683)
  - [[RUNTIME][OPENCL] OpenCL host pointer support to achieve zero copy](https://github.com/apache/tvm/pull/13413)
- CLML
  - [[CLML][RUNTIME] Enable more ops in CLML runtime](https://github.com/apache/tvm/pull/13834)
  - [[CLML][RELAY] Enable Pad and Conv2d layer fusion](https://github.com/apache/tvm/pull/13649)
  - [[CLML][CODEGEN] CLML native codegen utility](https://github.com/apache/tvm/pull/13837)
  - [[CLML] Version compatibility and various test cases](https://github.com/apache/tvm/pull/13670)
  - [[CLML] Changes corresponding to OpenCL workspace refactorization](https://github.com/apache/tvm/pull/13972)
  - [[RUNTIME][CLML] OpenCLML tuning and profiling enhanced](https://github.com/apache/tvm/pull/13843)

## ROCm

- [[ROCM] Fixes compiling on ROCM 5 and accuracy on dense op](https://github.com/apache/tvm/pull/13847)

## CMSIS-NN

- [[CMSIS-NN] Global function that provides range based on dtype](https://github.com/apache/tvm/pull/13652)
-
  [[CMSIS-NN] Add int16 add and mul operator support](https://github.com/apache/tvm/pull/13920)
- [[CMSIS-NN] Add a runtime error message](https://github.com/apache/tvm/pull/13643)
- [[CMSIS-NN] Reduction in code size of AOT test runner binary](https://github.com/apache/tvm/pull/13815)
- [[CMSIS-NN] Remove support for the old CMSIS NN project](https://github.com/apache/tvm/pull/13760)
- [[CMSIS-NN] Support CMSIS NN from new GitHub location](https://github.com/apache/tvm/pull/13656)
- [[CMSIS-NN] Add Cortex-M85 support](https://github.com/apache/tvm/pull/13644)

## CUDA & CUTLASS & TensorRT

- [[CUDA][Schedule] Better Layout Transform Schedules](https://github.com/apache/tvm/pull/14167)
- [[Profiler] Allow user to flush L2 cache in `time_evalutor` function for profiling CUDA kernels](https://github.com/apache/tvm/pull/13726)
- [[Codegen][CUDA] Add error message for missing fragment info](https://github.com/apache/tvm/pull/14073)
- [[CUTLASS][Ansor] Combine CUTLASS and Ansor](https://github.com/apache/tvm/pull/13879)
- [[TensorRT] Fix BiasAdd with correct axis attribute](https://github.com/apache/tvm/pull/13953)
- [[TRT][BYOC] allow strided_slice ops on selected dimensions (#14142)](https://github.com/apache/tvm/pull/14144)

## Ethosn

- [[ETHOSN] Update driver stack version to 22.11](https://github.com/apache/tvm/pull/13637)
- [[ETHOSN] Support for addition with constant input](https://github.com/apache/tvm/pull/13931)
- [[ETHOSN] Apply FoldConstant before NPU partitioning](https://github.com/apache/tvm/pull/13848)
- [[ETHOSN] Remove support for NPU driver 22.08](https://github.com/apache/tvm/pull/13763)
- [[ETHOSN] Fix for the mock inference after NPU driver update](https://github.com/apache/tvm/pull/13650)
- [[ETHOSN] Remove requantize dependency on resize](https://github.com/apache/tvm/pull/14422)
- [[ETHOSN] Add support for experimental compiler option](https://github.com/apache/tvm/pull/13410)

## CRT

- [[CRT] USE CMake for CRT standalone
  libraries](https://github.com/apache/tvm/pull/14025)
- [[CRT][microTVM] Enable USMP by default for AoTExecutor + CRT runtime](https://github.com/apache/tvm/pull/14107)
- [[CRT]Cleanup unused macros in crt_config.h.template](https://github.com/apache/tvm/pull/14125)

## Hexagon

- [[Hexagon][TOPI] Use IndexMap axis separator instead of TE](https://github.com/apache/tvm/pull/14459)
- [[Hexagon] Add concept of DMA groups](https://github.com/apache/tvm/pull/14254)
- [[Hexagon] Improve cache management strategy for HexagonBuffer](https://github.com/apache/tvm/pull/13883)
- [[Hexagon] Denote DMA cache bypass as experimental feature](https://github.com/apache/tvm/pull/13699)
- [[Hexagon] Adapt some intrinsics for high vector lanes](https://github.com/apache/tvm/pull/14345)
- [Hexagon compilation on MacOS system](https://github.com/apache/tvm/pull/14308)
- [[Hexagon] Enable depthwise conv2d NHWC with an HWIO kernel layout](https://github.com/apache/tvm/pull/13414)
- [[Hexagon][QNN] Improve performance wo QNN canonicalization](https://github.com/apache/tvm/pull/13734)
- [[Hexagon][Metaschedule] Add timeout_sec arg to get_hexagon_local_builder](https://github.com/apache/tvm/pull/13828)
- [[Hexagon] Fix deprecated call for data layout size in bits](https://github.com/apache/tvm/pull/14438)
- [[Hexagon] Allow scalar tensors to have null shape during allocation](https://github.com/apache/tvm/pull/14376)
- [[Hexagon][runtime] Make HexagonThreadManager::CheckSemaphore thread safe](https://github.com/apache/tvm/pull/13609)
- [[Hexagon] Float and quantized dense operators with schedules](https://github.com/apache/tvm/pull/12873)
- [[Hexagon][CI] Updated sha for builder LLVM](https://github.com/apache/tvm/pull/13418)
- [[Hexagon][CI] Update the docker image ID to reflect newer LLVM](https://github.com/apache/tvm/pull/13870)
- [[Hexagon] Switch from default_rng to random in Hexagon tests](https://github.com/apache/tvm/pull/13616)
- [[Hexagon] Add hexagon user DMA intrins for
  tensorization](https://github.com/apache/tvm/pull/13719)
- [[hexagon] Hexagon inference fix](https://github.com/apache/tvm/pull/14533)

## Metal

- [[METAL][CODEGEN] testcase for ramp codegen](https://github.com/apache/tvm/pull/14331)
- [[CODEGEN][METAL] Fix unaligned vector load](https://github.com/apache/tvm/pull/14332)
- [[CODEGEN][METAL] Fix ramp codegen](https://github.com/apache/tvm/pull/14330)

## MicroNPU

- [[microNPU] Sum legalization support](https://github.com/apache/tvm/pull/13997)
- [[microNPU] Add rescale parameters for binary elementwise](https://github.com/apache/tvm/pull/13890)
- [[microNPU] Add hardware constraints for binary elementwise](https://github.com/apache/tvm/pull/13772)
- [[microNPU] Add support for TFLite PAD](https://github.com/apache/tvm/pull/13732)
- [[microNPU] Upgrade Vela to v3.7.0](https://github.com/apache/tvm/pull/14374)
- [[microNPU] Merge LUT activation with binary elementwise operation](https://github.com/apache/tvm/pull/13935)
- [[microNPU] Upgrade to 22.08 version of Arm(R) Ethos(TM)-U NPU drivers](https://github.com/apache/tvm/pull/13529)
- [[microNPU] Add relu6 relu_n1_to_1 test cases for Ethos-U](https://github.com/apache/tvm/pull/13645)
- [[microNPU] Add a legalization test for TFLite PAD](https://github.com/apache/tvm/pull/13750)
- [[microNPU] Disable copying weights to SRAM for FullyConnected ops in CopyConstants scheduler](https://github.com/apache/tvm/pull/13588)
- [[microNPU] Add support for ResizeNearestNeighbor with half_pixel_centers=True](https://github.com/apache/tvm/pull/14401)

## Web & WASM

- [[Web] Try to upgrade WebGPU API usage to the latest](https://github.com/apache/tvm/pull/13731)
- [[WEB] Reduce memleak in web runtime](https://github.com/apache/tvm/pull/14086)
- [[WEB] WebGPU Codegen](https://github.com/apache/tvm/pull/14048)
- [[WEB] Update web runtime to support latest emcc](https://github.com/apache/tvm/pull/14046)
- [[WASM][FIX] test
  tests/node/websock_rpc_test.py](https://github.com/apache/tvm/pull/13862)

## Others about Runtime

- [[FIX][RUNTIME] Convert container with function value type](https://github.com/apache/tvm/pull/14024)
- [[RUNTIME] Fix the manual determination of cores in FillDataForMeasure](https://github.com/apache/tvm/pull/13849)
- [[RUNTIME] Fix determination of big/little cores domains](https://github.com/apache/tvm/pull/13832)
- [[Runtime] Fix Potential DeviceAPIManager Memory Bug](https://github.com/apache/tvm/pull/14114)
- [[Runtime] Fix high RAM usage when saving / loading parameters of big models](https://github.com/apache/tvm/pull/14147)
- [[Runtime] Runtime module property mask for Metal and Vulkan](https://github.com/apache/tvm/pull/14524)
- [[Runtime] Introduce runtime module property](https://github.com/apache/tvm/pull/14406)
- [[Runtime] Add missing Type2Str for TVMByteArray](https://github.com/apache/tvm/pull/14051)

# Android

- [[Android] Fix using system libraries in Android apps](https://github.com/apache/tvm/pull/14145)
- [[TOOL][NATIVE] Android native application for deploy and run](https://github.com/apache/tvm/pull/13791)

# AOT

- [[AOT] Added a test for detecting output size post MLF export](https://github.com/apache/tvm/pull/13655)
- [[AOT]Aot module post-test error workaround](https://github.com/apache/tvm/pull/13685)
- [[AOT]Raise error when input name is not valid](https://github.com/apache/tvm/pull/14322)
- [[AoT]Add get_input_name function to AoT Module](https://github.com/apache/tvm/pull/14071)

# Arith

- [[Arith] Simplifications for floormod(x](https://github.com/apache/tvm/pull/13936)
- [[Arith] Implemented PMatchesOneOf and matches_one_of](https://github.com/apache/tvm/pull/13933)
- [[Arith][UnitTest] Parametrize tests of RewriteSimplifier](https://github.com/apache/tvm/pull/13923)
- [[Arith] Use ConstIntBound to remove negative numerator when lowering](https://github.com/apache/tvm/pull/13724)
- [[Arith][Bugfix] Simplify "x - 1 < y" into "x <= y"](https://github.com/apache/tvm/pull/14528)
- [[Arith] Add simplification rule for `x - max(x+y](https://github.com/apache/tvm/pull/14271)
- [[Arith] Updated incorrect simplification rule](https://github.com/apache/tvm/pull/13922)
- [[Arith] Allow const folding on fp16 involving one and zero](https://github.com/apache/tvm/pull/13631)
- [#13918](https://github.com/apache/tvm/pull/13918)
- [[ARITH] Enhance CanProve to handle symbolic bound](https://github.com/apache/tvm/pull/14523)
- [[ARITH] support floordiv in deduce bound](https://github.com/apache/tvm/pull/13880)
- [[Arith] Support eq in detect_clip_bound](https://github.com/apache/tvm/pull/13746)
- [[Fix][Arith] Analyzer simplification starts with canonical](https://github.com/apache/tvm/pull/13875)

# AutoTVM

- [[AutoScheduler][AutoTVM] Enable xgboost >= 1.7.x new changes](https://github.com/apache/tvm/pull/14036)

# BugFix

- [[BugFix][UMA] Protect target registration](https://github.com/apache/tvm/pull/13624)
- [[BugFix][Runtime] Add missing check for `PackedFunc`](https://github.com/apache/tvm/pull/13687)
- [[Bugfix][TIR] Fix version conflict with typing for Python 3.8.0](https://github.com/apache/tvm/pull/13744)
- [Fix build platform environment variable](https://github.com/apache/tvm/pull/13914)
- [[BugFix][TVMScript] Fix the roundtripability of intrinsic pow](https://github.com/apache/tvm/pull/13692)
- [[BugFix] Pylance emits the warning 'Code is unreachable'](https://github.com/apache/tvm/pull/13673)
- [[BugFix][TVMScript]fix var capturing order error](https://github.com/apache/tvm/pull/13640)
- [[BugFix][TVMScript] Parser crash](https://github.com/apache/tvm/pull/13630)
- [[Bugfix][TVMScript] Handle LetStmt for `var1 = var2` expressions](https://github.com/apache/tvm/pull/14320)
- [[Bug][CodeGen,Cuda]fix cast fp16 to int8/uint8 in cuda](https://github.com/apache/tvm/pull/13641)
- [[fix] MXNet dot for all tensor dimensions](https://github.com/apache/tvm/pull/11760)
- [[Bugfix] Conv1Dtranspose default kernel
  layout should be IOW](https://github.com/apache/tvm/pull/14482)
- [[Bugfix] Conv3Dtranspose default kernel layout should be IODHW](https://github.com/apache/tvm/pull/14340)
- [[BugFix] Support rewrite_once when the number of callbacks > 1](https://github.com/apache/tvm/pull/14344)
- [[Bugfix][TIR] Fix version conflict with typing for different Python versions (3.8.0-3.10.0)](https://github.com/apache/tvm/pull/13820)
- [Fix out of bound enum conversion](https://github.com/apache/tvm/pull/13967)
- [[bugfix] Fix the write buffer scope of `mma_store_impl`](https://github.com/apache/tvm/pull/14174)
- [[BugFix][Runtime] Fix Incorrect node information](https://github.com/apache/tvm/pull/13693)

# Build

- [[Build] Expose missing USE_VERILATOR in cmake](https://github.com/apache/tvm/pull/13676)
- [[Build] Fix find_include_path when using TVM python package](https://github.com/apache/tvm/pull/14007)
- [[Build] Fix misleading error messages](https://github.com/apache/tvm/pull/13887)
- [[Build][Bugfix] Use CMAKE_ prefix for `<LANG>_COMPILER_LAUNCHER`](https://github.com/apache/tvm/pull/13697)

# BYOC

- [[BYOC] DNNL C_SRC Fix](https://github.com/apache/tvm/pull/14267)
- [[BYOC] Update CUTLASS backend (SIMT support and codegen clean up)](https://github.com/apache/tvm/pull/14056)

# CI

- [[CI][microTVM] Enable USE_MICRO for mac and windows CI builds](https://github.com/apache/tvm/pull/14393)
- [[CI] Pass the 'path' parameter passed to cmake_build to the task_build.py script](https://github.com/apache/tvm/pull/13905)
- [[CI][EZ] Upgrade CI Lint Image](https://github.com/apache/tvm/pull/14373)
- [[CI][Lint] Update black](https://github.com/apache/tvm/pull/14346)
- [[CI][Flaky] Skip zephyr_qemu-x86 tests that are part of task_python_microTVM](https://github.com/apache/tvm/pull/14005)
- [[CI] Fix for NNPack error due to misalignment with pthreadpool library](https://github.com/apache/tvm/pull/13940)
- [[ci] Disable Windows-Static-Runtime](https://github.com/apache/tvm/pull/13951)
-
  [[ci][docker] Make branch names valid before using them as tags](https://github.com/apache/tvm/pull/13738)
- [[CI] Cross-compile libtvm_runtime to Aarch64 and run tests](https://github.com/apache/tvm/pull/13714)
- [[CI] Include static builds of the runtime as part of CI](https://github.com/apache/tvm/pull/13612)
- [[CI] Update rerun list for tvm-bot](https://github.com/apache/tvm/pull/13817)
- [[CI] Update ci_minimal docker image to cross-compile TVM to aarch64](https://github.com/apache/tvm/pull/13776)
- [[CI] Update ci_arm docker image to have LLVM 15](https://github.com/apache/tvm/pull/14296)
- [[CI] Update Compute Library to v22.11](https://github.com/apache/tvm/pull/14084)
- [[CI] Fix broken model link](https://github.com/apache/tvm/pull/14458)
- [[CI][ETHOSN] Add ssh to the driver stack installation](https://github.com/apache/tvm/pull/14246)
- [[CI] Fix android build by constraining numpy version](https://github.com/apache/tvm/pull/13648)
- [[CI] NNPACK build issue workaround](https://github.com/apache/tvm/pull/13873)
- [[CI] Update GPU image for CUDA 11.7](https://github.com/apache/tvm/pull/14363)
- [[CI] Update CUDA to 11.7](https://github.com/apache/tvm/pull/14293)
- [[CI] Update cpu and gpu image](https://github.com/apache/tvm/pull/14245)
- [[CI] Enable USE_MICRO in minimal cross ISA build](https://github.com/apache/tvm/pull/13942)
- [[CI][microTVM]Update ci_cortexm image](https://github.com/apache/tvm/pull/13764)
- [[CI][Docker][Cortex-M]Update scripts to update ci_cortexm to Ubuntu 20.04](https://github.com/apache/tvm/pull/13736)
- [[CI] Fix MLF input and output name map](https://github.com/apache/tvm/pull/13740)
- [[CI] Pin sccache version to 0.3.3](https://github.com/apache/tvm/pull/14530)
- [[CI] Add llvm-15 and mlir-15 to Docker setup](https://github.com/apache/tvm/pull/14303)
- [[CI] Add onnx dependency to test_auto_tensorize.py::test_vnni_bert_int8](https://github.com/apache/tvm/pull/14102)
- [[CI] Fix test skipping pytest
  attribute](https://github.com/apache/tvm/pull/14064)
- [[skip ci][ci][docker] Add cross compilation libs](https://github.com/apache/tvm/pull/13800)

# Tests

- [[Tests] Replace pytest.main with tvm.testing.main](https://github.com/apache/tvm/pull/13717)
- [[TESTING] Enable execution of test_packed_8x8x32_resnet50](https://github.com/apache/tvm/pull/13799)
- [[testing] Use tuples for numpy indexing](https://github.com/apache/tvm/pull/14476)
- [[testing][py_converter] Enhance py_converter to better support entire modules](https://github.com/apache/tvm/pull/13769)
- [[Unittest] merge test_cp_async_in_if_then_else into test_tir_transform_inject_ptx_async_copy](https://github.com/apache/tvm/pull/14138)
- [[UnitTest] Parametrized test_arith_iter_affine_map::test_padding](https://github.com/apache/tvm/pull/13774)

# Docker

- [[Docker] Update ci-cpu and ci-arm to tag 20230223-070143-a3b51f11b](https://github.com/apache/tvm/pull/14116)
- [[docker][microTVM]Fix Zephyr 0.15.2 SDK installation and separate Zephyr python environment](https://github.com/apache/tvm/pull/13829)
- [[docker][microTVM]Update zephyr version to 3.2 and Zephyr SDK to 0.15.2](https://github.com/apache/tvm/pull/13806)
- [[Docker]Add dialout group by default on login](https://github.com/apache/tvm/pull/13810)
- [[Docker] Add script to build llvm from source](https://github.com/apache/tvm/pull/13823)
- [[DOCKER] Configurable NDK version support](https://github.com/apache/tvm/pull/14000)
- [[Docker update] Update ci_cpu tag to the latest from tlcpackstaging](https://github.com/apache/tvm/pull/13748)

# Docs

- [[Doc] fix doc for tvm.te.const()](https://github.com/apache/tvm/pull/13904)
- [Add v0.11.0 docs link to site](https://github.com/apache/tvm/pull/14181)
- [[docs] Remove empty code blocks](https://github.com/apache/tvm/pull/13689)
- [[docs] Add details about patch releases](https://github.com/apache/tvm/pull/14301)
- [[Docs] Update listed tvmc python dependencies](https://github.com/apache/tvm/pull/14341)
-
  [[docs] Add "Open with Colab" button to documentation](https://github.com/apache/tvm/pull/13627)
- [[Docs] Add `typing-extensions` dependency guide](https://github.com/apache/tvm/pull/13730)
- [[Docs] Fix MetaSchedule Docs](https://github.com/apache/tvm/pull/14480)
- [[FIX] Fix Typos in Docs and Comments](https://github.com/apache/tvm/pull/13793)
- [[HotFix][docs] Use correct Colab button URL](https://github.com/apache/tvm/pull/13725)

# Frontend

- TensorFlow & TFLite
  - [[Frontend][Tensorflow] Update Select to SelectV2](https://github.com/apache/tvm/pull/13884)
  - [[Frontend][TFLite] Fix conv2d import bug](https://github.com/apache/tvm/pull/14124)
  - [[TFLite] Support for BATCH_MATMUL tflite operator](https://github.com/apache/tvm/pull/14423)
- Pytorch
  - [[Pytorch] frontend full_impl fix](https://github.com/apache/tvm/pull/14122)
  - [[PyTorch] Fix in matmul function that enables working with all sizes](https://github.com/apache/tvm/pull/13927)
  - [[Pytorch][Relay] aten::_weight_norm implementation](https://github.com/apache/tvm/pull/13661)
  - [[Torch] Added tests in test_forward_linear](https://github.com/apache/tvm/pull/13937)
  - [[Torch] Fix advanced indexing with NoneType index arguments](https://github.com/apache/tvm/pull/13826)
  - [[TORCH] scatter_reduce implementation](https://github.com/apache/tvm/pull/14018)
- ONNX
  - [[Frontend] Add ONNX importer for QLinearSoftmax](https://github.com/apache/tvm/pull/14425)
  - [[ONNX] QGemm support](https://github.com/apache/tvm/pull/13747)
  - [[ONNX][TOPI] Add `DFT` operator](https://github.com/apache/tvm/pull/13999)
  - [[Frontend] [ONNX] Support sequence_lens of GRU](https://github.com/apache/tvm/pull/13587)
  - [[ONNX] Extend converter for Attention from Microsoft onnxruntime contrib opset](https://github.com/apache/tvm/pull/13797)
  - [[ONNX] Add converter for QAttention from Microsoft onnxruntime contrib opset](https://github.com/apache/tvm/pull/13654)
  - [[ONNX][TORCH] Replace scatter op by
    scatter_elements](https://github.com/apache/tvm/pull/14019)
  - [[ONNX] Support ScatterElements with reduction](https://github.com/apache/tvm/pull/13894)
  - [[ONNX] Support Bitwise operations](https://github.com/apache/tvm/pull/13888)
  - [[ONNX] Support Bernoulli op on ONNX front-end](https://github.com/apache/tvm/pull/13802)
  - [[ONNX] Extend reduction types supported by ScatterND](https://github.com/apache/tvm/pull/13946)
  - [[ONNX] Support SequenceEmpty op](https://github.com/apache/tvm/pull/13866)
  - [[ONNX] Support SequenceErase op](https://github.com/apache/tvm/pull/13865)
  - [[ONNX] Support SequenceLength op](https://github.com/apache/tvm/pull/13863)
- Keras
  - [[Keras] Fix importing conv2d_transpose for NHWC layout](https://github.com/apache/tvm/pull/13998)
- OneFlow
  - [[Frontend][Oneflow] Use FLOW_2_STR_DTYPE for dtype](https://github.com/apache/tvm/pull/14454)
- Paddle
  - [[PaddlePaddle Hackathon 4][Frontend][Paddle]add conv3d for paddle frontend](https://github.com/apache/tvm/pull/14290)
  - [[Frontend][PaddlePaddle] Fix bug in tests for upgrading paddlepaddle to 2.4.2](https://github.com/apache/tvm/pull/14206)
  - [[Frontend][Paddle]add take_alone_axis and topk converter for paddle frontend](https://github.com/apache/tvm/pull/14170)
  - [[Frontend][Paddle] Add where_index op and add vm for paddle frontend's unittest](https://github.com/apache/tvm/pull/14099)
  - [[Frontend][Paddle] Add norm and one_hot_v2 op](https://github.com/apache/tvm/pull/14049)
  - [[Frontend][PaddlePaddle] Add topk op and Fix bug](https://github.com/apache/tvm/pull/13701)
  - [[PaddlePaddle Hackathon 4][Frontend][Paddle]Add tile/mish/stack/unstack/silu/softshrink/where op for paddle frontend](https://github.com/apache/tvm/pull/14160)
  - [[Frontend][Paddle]fix eye and dist](https://github.com/apache/tvm/pull/14292)
  - [[PaddlePaddle Hackathon 4][Frontend][Paddle]add grid-sample/gaussian_random/flip/fill_zeros_like/unique for paddle frontend](https://github.com/apache/tvm/pull/14277)
  - [[PaddlePaddle
    Hackathon 4][Frontend][Paddle]add thresholded_relu/index_select/eye/linspace/take_alone_axis/dist for paddle frontend](https://github.com/apache/tvm/pull/14172)

# microTVM

- [[microTVM] Clean-up test_crt.py and add to pylint](https://github.com/apache/tvm/pull/13886)
- [[microTVM] Build standalone_crt with cmake instead of makefile](https://github.com/apache/tvm/pull/13600)
- [[microTVM] additional refactoring for enabling USE_MICRO in more builds](https://github.com/apache/tvm/pull/13909)
- [[microTVM] Fix host-driven AOT memory workspaces](https://github.com/apache/tvm/pull/13807)
- [[microTVM] Fix MacOS build with USE_MICRO=ON](https://github.com/apache/tvm/pull/13711)
- [[microTVM] Use QNN schedules to give SOTA performance](https://github.com/apache/tvm/pull/13752)
- [[microTVM]Fix more security issues with pyproject](https://github.com/apache/tvm/pull/14434)
- [[microTVM] Update poetry to fix security issues](https://github.com/apache/tvm/pull/14429)
- [[microTVM]Enable TVMC micro with AoT Executor](https://github.com/apache/tvm/pull/14077)
- [[microTVM]Add test for MLPerfTiny models](https://github.com/apache/tvm/pull/13978)
- [[microTVM][CRT]Move Makefile to CMake to be cross-platform compatible](https://github.com/apache/tvm/pull/14013)
- [[microTVM]Refactor crt_config.h header file generation](https://github.com/apache/tvm/pull/13955)
- [[microTVM] Refactor required external functions in CRT to platform-template.c](https://github.com/apache/tvm/pull/13885)
- [[microTVM] Update Zephyr version and Zephyr SDK version](https://github.com/apache/tvm/pull/13818)
- [[microTVM]Refactor test and add skip to current failing tests/boards](https://github.com/apache/tvm/pull/13858)
- [[microTVM] Update tutorials](https://github.com/apache/tvm/pull/13845)
- [[microTVM] Add tutorial on how to generate MLPerfTiny submissions](https://github.com/apache/tvm/pull/13783)
- [[microTVM][Zephyr]Add project files for mlperftiny
  submission](https://github.com/apache/tvm/pull/13690)
- [[microTVM]Add default value to unspecified project options in project API](https://github.com/apache/tvm/pull/13610)
- [[microTVM]Add MLPerfTiny test harness](https://github.com/apache/tvm/pull/14309)
- [[microTVM] Fix tvmc tutorial](https://github.com/apache/tvm/pull/14076)
- [[microTVM][Zephyr] Remove unnecessary use of generate_c_interface_header](https://github.com/apache/tvm/pull/14091)
- [[microTVM][CRT]Separate CRT template project from standalone CRT build](https://github.com/apache/tvm/pull/13812)
- [[microTVM][Zephyr] Fix flash command for nrfjprog](https://github.com/apache/tvm/pull/13723)
- [[microTVM][Zephyr] Fix TVMC test on hardware](https://github.com/apache/tvm/pull/13598)
- [[microTVM] Custom IDE Tutorial](https://github.com/apache/tvm/pull/13857)
- [[microTVM] tuning on micro targets with meta-schedule](https://github.com/apache/tvm/pull/13514)
- [[microTVM] Allow multiple runners in tuning micro models with meta-schedule](https://github.com/apache/tvm/pull/13811)
- [[microTVM] Replace arm_nnsupportfunctions.h with arm_acle.h](https://github.com/apache/tvm/pull/13363)

# LLVM

- [[LLVM] Use DataLayout::getABITypeAlign instead of getABITypeAlignment](https://github.com/apache/tvm/pull/14534)
- [[LLVM] Add missing `override` to GetFormat and GetPropertyMask](https://github.com/apache/tvm/pull/14470)
- [[LLVM] Add guard for #include <llvm/Transforms/IPO/PassManagerBuilder.h>](https://github.com/apache/tvm/pull/14469)
- [[LLVM] Remove call to EmitDebugLocation from AddAliasInfo](https://github.com/apache/tvm/pull/13872)
- [[LLVM] Use std::nullopt instead of llvm::None](https://github.com/apache/tvm/pull/13617)
- [[LLVM] Fix registerCallbacks API after recent change](https://github.com/apache/tvm/pull/14323)
- [[LLVM] Add support to generate llvm.assume](https://github.com/apache/tvm/pull/14294)
- [[LLVM] Add support for DeclBufferNode](https://github.com/apache/tvm/pull/14103)
- [[LLVM][BugFix]
  Fix include Triplet.h bug when LLVM version>= 17](https://github.com/apache/tvm/pull/14235)
- [[TEST] Fix division by 0 in llvm codegen test](https://github.com/apache/tvm/pull/14232)
- [[SVE] Adding codegen tests for SVE](https://github.com/apache/tvm/pull/14239)

# MetaSchedule

- [[MetaSchedule] Introducing MemHammer](https://github.com/apache/tvm/pull/14164)
- [[MetaSchedule] Introduce Async Pipeline in MultiLevelTiling](https://github.com/apache/tvm/pull/14009)
- [[MetaSchedule][ARM] Enable ARM CPU intrinsic for MetaSchedule](https://github.com/apache/tvm/pull/14209)
- [[MetaSchedule] Use `shared.dyn` for Tensor Core Schedule Rules](https://github.com/apache/tvm/pull/13891)
- [[MetaSchedule] add fp16-16-32 TensorCores rule to default settings](https://github.com/apache/tvm/pull/13822)
- [[MetaSchedule][Hexagon] Improve vectorization for standalone elementwise op](https://github.com/apache/tvm/pull/14408)
- [[MetaSchedule] Add "disabled_pass" option in tuning API](https://github.com/apache/tvm/pull/13659)
- [[MetaSchedule] Fix anchor-block flow with empty design space generator](https://github.com/apache/tvm/pull/14047)
- [[Metaschedule] get_top_k should not return not built records](https://github.com/apache/tvm/pull/13824)
- [[Metaschedule] Aligning get_top_k logic in MemoryDatabase and JSONDatabase](https://github.com/apache/tvm/pull/13611)
- [[MetaSchedule] preserve global_symbol attached to function after applying MS](https://github.com/apache/tvm/pull/14219)
- [[MetaSchedule] Fix a typo in MemoryDatabase](https://github.com/apache/tvm/pull/13928)
- [[MetaSchedule] Fix for RewriteLayout + AllocateConst when the rank of the rewritten weight doesn't change](https://github.com/apache/tvm/pull/13851)
- [[MetaSchedule] Fix tensorcore winograd task extraction](https://github.com/apache/tvm/pull/13625)
- [[HotFix][MetaSchedule] Turn off database shash check](https://github.com/apache/tvm/pull/14188)
- [[MetaSchedule] MutateTileSize skip single-candidate
  SampleCategorical](https://github.com/apache/tvm/pull/14072)
- [[Metaschedule] EvolutionarySearchNode::State constructor typo fix](https://github.com/apache/tvm/pull/14002)
- [[Fix][MetaSchedule] Fix redundant stages in async pipeline for mlt](https://github.com/apache/tvm/pull/14143)
- [[Fix][MetaSchedule] RPCRunner timeout when queueing up](https://github.com/apache/tvm/pull/13963)
- [[MetaSchedule] Add pass instrument to MetaSchedule api](https://github.com/apache/tvm/pull/13688)
- [[MetaSchedule] Tile and pack intermediate output for CUDA TensorCore](https://github.com/apache/tvm/pull/14108)
- [[MetaSchedule] Bugfix: Add checks for nullable `run_secs`](https://github.com/apache/tvm/pull/13790)

# Misc

- [[UX] Make T.prim_func typecheck as staticmethod](https://github.com/apache/tvm/pull/13980)
- [[VM][DMLC] Lower memory usage when loading and dumping weights](https://github.com/apache/tvm/pull/13877)
- [[APP] Update android_rpc build tools version](https://github.com/apache/tvm/pull/14052)
- [[apps][bundle_deploy]Fix bundle build issue](https://github.com/apache/tvm/pull/14315)
- [[Diagnostic] Support constructing Diagnostic Error through ObjectRef](https://github.com/apache/tvm/pull/13977)
- [[skip ci] Replace magic_wand model with micro_speech](https://github.com/apache/tvm/pull/14414)
- [[IR] Enhance IRModule SEqual/SHash to support cross function calls](https://github.com/apache/tvm/pull/14289)
- [[Fix]Fix function ObjectPath in IRModule SEqual](https://github.com/apache/tvm/pull/14230)
- [Update to v0.12.dev0](https://github.com/apache/tvm/pull/14241)
- [Enable C++17 for cmake modules](https://github.com/apache/tvm/pull/13869)
- [Remove temporary VTCM workspace APIs](https://github.com/apache/tvm/pull/13681)
- [[IR] Platform-independent SHash](https://github.com/apache/tvm/pull/14204)
- [Fix numpy version constraint](https://github.com/apache/tvm/pull/13912)
- [[Utils] Allow classmethod and staticmethod in
  TVMDerivedObject](https://github.com/apache/tvm/pull/14249)
- [[Git] Ignore python/requirements directory](https://github.com/apache/tvm/pull/13684)
- [Enhance the --help message of composite target](https://github.com/apache/tvm/pull/13842)
- [Add support for named outputs in MLF archive](https://github.com/apache/tvm/pull/13704)
- [Add Name Transforms for Rust style](https://github.com/apache/tvm/pull/13706)
- [Refactor test to make it easier for user to understand how tensor_intrin works](https://github.com/apache/tvm/pull/14017)
- [Remove tutorials CMSIS dependency when not needed](https://github.com/apache/tvm/pull/13762)
- [Add DisallowAsyncStridedMemCopy post processor to rem](https://github.com/apache/tvm/pull/13720)
- [Add check for non-contiguous memory access when lowering to async dma](https://github.com/apache/tvm/pull/13613)
- [Relay transform for rolling a known pattern into batch_matmul](https://github.com/apache/tvm/pull/14210)
- [[Typo] Fix name of iter var type 4](https://github.com/apache/tvm/pull/14436)
- [Extend the USE_LIBBACKTRACE option](https://github.com/apache/tvm/pull/13816)
- [[Refactor] Move `VarUseDefAnalysis` to header file](https://github.com/apache/tvm/pull/14185)
- [Add header files for GraphExecutorDebug](https://github.com/apache/tvm/pull/13694)
- [[pytest] Don't return values from test_* functions](https://github.com/apache/tvm/pull/14475)
- [[Analysis] Improve error message in VerifyWellFormed](https://github.com/apache/tvm/pull/14389)
- [Revert the changes for NNPACK build issue](https://github.com/apache/tvm/pull/13913)
- [[Node] Utility methods for ObjectPathPair handling](https://github.com/apache/tvm/pull/14498)
- [[Minor] Change file mode 755 -> 644; EOL CRLF -> LF](https://github.com/apache/tvm/pull/13959)
- [[FIX] Minor Compilation Warning Fixes](https://github.com/apache/tvm/pull/13794)
- [[Contrib][Sort] Faster Top-K Implementation](https://github.com/apache/tvm/pull/13599)
- [[COLLAGE] Add more customization to
support more targets](https://github.com/apache/tvm/pull/13450) - [[CONTAINER] Struct Hash/Equal and JSON support for ShapeTuple](https://github.com/apache/tvm/pull/13671) - [[VTA] Provide zero-initialization for VTAGenericInsn](https://github.com/apache/tvm/pull/13698) - [[Fix,Roofline] Fix roofline handling of multiple peak flops](https://github.com/apache/tvm/pull/13716) - [[RPC] Add fail-guard for termination time exception](https://github.com/apache/tvm/pull/13651) - [[TOPHUB] use keys as a keyword for searching of existing statistics](https://github.com/apache/tvm/pull/13874) - [[Transform] Use callable() instead of isinstance() for type checking](https://github.com/apache/tvm/pull/14248) - [[TRANSFORM] Fix virtual device annotation issue with BYOC subgraphs](https://github.com/apache/tvm/pull/13325) # Relay - [[Fix][Relay] Fix axis transformation in squeeze shape function](https://github.com/apache/tvm/pull/14135) - [[QNN][Relay][Topi] Add qnn.dense with weight layout](https://github.com/apache/tvm/pull/13854) - [[fix][relay][qnn] Bug fix for 8-bit quantized mul](https://github.com/apache/tvm/pull/14286) - [[Relay][Op] Connect existing arm_cpu schedule to relay strategy for concat](https://github.com/apache/tvm/pull/14270) - [[Relay] Convert negative axes to positive when importing ONNX Unsqueeze](https://github.com/apache/tvm/pull/13846) - [[Relay][Frontend] Span Filling PyTorch](https://github.com/apache/tvm/pull/14050) - [[Relay][Frontend] Span Filling ONNX](https://github.com/apache/tvm/pull/13767) - [[Relay][Frontend] Span Filling TensorFlow 1](https://github.com/apache/tvm/pull/13728) - [[Relay][Frontend] Span Filling TFLite](https://github.com/apache/tvm/pull/13727) - [[Relay][Frontend] Span filling common API](https://github.com/apache/tvm/pull/13402) - [[Relay][Pass] Separate out the graph partitioning code from fuse_ops.cc](https://github.com/apache/tvm/pull/13964) - [[Relay] Remove overwriting of matmul shapes when they are 
static](https://github.com/apache/tvm/pull/13615) - [[Relay][Frontend][Onnx] SequenceAt and SplitToSequence Operators](https://github.com/apache/tvm/pull/13602) - [[Relay] Move pad value extraction past null pointer check](https://github.com/apache/tvm/pull/14445) - [[relay][frontend][pytorch]Fix a bug in the _get_pytorch_value_type function](https://github.com/apache/tvm/pull/14421) - [[Relay] Enhance EliminateCommonSubexpr to support Tuple argument](https://github.com/apache/tvm/pull/14169) - [[Relay][TIR] Add utility to lower Relay func to TIR prim func](https://github.com/apache/tvm/pull/13606) - [[Relay] Check if the attribute "name" exists before accessing it](https://github.com/apache/tvm/pull/14485) - [[Relay][Docs] Fixed examples in relay/transform.py documentation](https://github.com/apache/tvm/pull/13682) - [[Relay][Runtime] Add `set_input/output_zero_copy` in python](https://github.com/apache/tvm/pull/13623) - [[Relay][Testing][Bugfix] `py_converter` should use correct AST for versions above 3.8 too](https://github.com/apache/tvm/pull/13635) - [[relay] preserve the order of input_info of pytorch](https://github.com/apache/tvm/pull/14462) - [[QNN] Change in Pass Context for lookup table calculation](https://github.com/apache/tvm/pull/13660) - [[QNN] Convert fake quantized take to quantized op](https://github.com/apache/tvm/pull/14506) # Schedule - [[Schedule][Bugfix] Fix decompose padding wrt the single child subtree](https://github.com/apache/tvm/pull/13646) - [[Schedule] Add an optional argument `disable_checks` for `Schedule`](https://github.com/apache/tvm/pull/14281) # Target - [[Target] Make `key=arm_cpu` --> `key=arm_cpu](https://github.com/apache/tvm/pull/13775) - [[Target] Add target tags for Apple Silicon GPU](https://github.com/apache/tvm/pull/14068) - [[Target] Fix Jetson AGX Xavier CPU core count](https://github.com/apache/tvm/pull/14508) - [[Target] Add A10G gpu cuda tag](https://github.com/apache/tvm/pull/14467) # TE - [[TE] Record 
primitives of Schedule for visualization](https://github.com/apache/tvm/pull/14168) - [[TE][PrimFunc] Fix create primfunc from te extern with explicit buffer load](https://github.com/apache/tvm/pull/13729) # Tensorize - [[Tensorize][runtime] Add support for AMX(Advanced Matrix Extensions) through Tensor intrinsics](https://github.com/apache/tvm/pull/13642) - [[Tensorize][TOPI] Add AMX Tensorizing for int8 batch matmul](https://github.com/apache/tvm/pull/13745) # TIR - [[TensorIR] Support for L2 prefetch async copy and pred_guard enabled async in vectorized if_then_else](https://github.com/apache/tvm/pull/14329) - [[TensorIR][Schedule] New primitive `reorder_block_itervar`](https://github.com/apache/tvm/pull/14448) - [[TensorIR] New schedule primitive `set_dtype`](https://github.com/apache/tvm/pull/14316) - [[Fix][TIR] LowerCrossThreadReduction with write-back predicate](https://github.com/apache/tvm/pull/14199) - [[TIR] Introduce Pass InjectPTXLDG32](https://github.com/apache/tvm/pull/13973) - [[Fix][TIR] Fix tvm::arith::UnionLowerBound](https://github.com/apache/tvm/pull/14304) - [[TIR][Schedule] Add unittest for read_write_at](https://github.com/apache/tvm/pull/14395) - [[TIR] Add cp.async support for tir.if_then_else](https://github.com/apache/tvm/pull/13966) - [[tir] fix buffer_decl buffer allocation](https://github.com/apache/tvm/pull/13906) - [[tir] Add line level debug info](https://github.com/apache/tvm/pull/13012) - [[TIR][FIX] check args size when creating prim_func by runtime::Registry](https://github.com/apache/tvm/pull/13809) - [[TIR] not estimating the flops when there is a default estimated flops as attr](https://github.com/apache/tvm/pull/14379) - [[TIR][Hexagon] Enhancement of NarrowDataType pass for binary ops](https://github.com/apache/tvm/pull/14298) - [[TIR] Handle nullptr returned by FindEntryFunc](https://github.com/apache/tvm/pull/13852) - [[TIR]Fix the crash of the pass RemoveNoOp](https://github.com/apache/tvm/pull/13808) - [[TIR] Update 
SplitHostDevice to post-process with ConvertSSA](https://github.com/apache/tvm/pull/14496) - [[TIR][Utility] More flexible tir::Substitute arguments](https://github.com/apache/tvm/pull/14251) - [[TIR][Analysis] Implement IdentifyMemCpy analysis function](https://github.com/apache/tvm/pull/13947) - [[TIR] Merged kDeviceThreadAxis and kUseDynamicSharedMemoryTag](https://github.com/apache/tvm/pull/14495) - [[TIR] Improved SeqStmt::Flatten utility](https://github.com/apache/tvm/pull/14497) - [[TIR] Use IRModuleNode::Remove to remove None in PrimFuncPass](https://github.com/apache/tvm/pull/14494) - [[TIR] Use same DataType of builtin::tvm_struct_set in C++ and Python](https://github.com/apache/tvm/pull/14489) - [[TIR] Update LowerTVMBuiltin to use Optional<T>](https://github.com/apache/tvm/pull/14400) - [[TIR] Improved MakePackedAPI error message](https://github.com/apache/tvm/pull/14387) - [[TIR] Legalize dtype of constants in IndexMap](https://github.com/apache/tvm/pull/14385) - [[TIR] Improved error message in InjectSoftwarePipeline](https://github.com/apache/tvm/pull/14391) - [[TIR][Schedule] Allow buffer name argument to Schedule.set_scope](https://github.com/apache/tvm/pull/14327) - [[TIR] Fix dtype mismatch error due to LetStmt](https://github.com/apache/tvm/pull/13710) - [[Fix][TIR] SampleCategorical apply-to-schedule](https://github.com/apache/tvm/pull/14133) - [[TIR][Fix] IndexDataTypeNormalizer not unwrapping float casting](https://github.com/apache/tvm/pull/13789) - [[TIR][Fix] Buffer slicing using index dtype as extent](https://github.com/apache/tvm/pull/13788) - [[TIR] Create Layout with specified axis dtype](https://github.com/apache/tvm/pull/13663) - [[TIR][Schedule] Improve cache_index to cache common subexpressions](https://github.com/apache/tvm/pull/13700) - [[TIR][Arith] Add common sub expr analyzer](https://github.com/apache/tvm/pull/13702) - [[TIR] [Schedule] Add get_output_blocks primitive](https://github.com/apache/tvm/pull/14490) - [[TIR] 
[Analysis] Expose IsOutputBlock to python](https://github.com/apache/tvm/pull/14352) - [[TIR] [Bugfix] Pass the correct block_sref_reuse to Replace](https://github.com/apache/tvm/pull/14023) - [[TIR] Fix cache_write bug with allocate const node](https://github.com/apache/tvm/pull/13792) - [[TIR][Schedule] Fix reverse_compute_inline](https://github.com/apache/tvm/pull/14263) - [[TIR] Remove special-casing of T.address_of in the storage rewrite pass](https://github.com/apache/tvm/pull/14430) - [[TIR] Refactor BF16Legalize](https://github.com/apache/tvm/pull/14405) - [[TIR] Enhance loop unroll with unroll local access](https://github.com/apache/tvm/pull/14224) - [[TIR] Remove LoadNode and StoreNode](https://github.com/apache/tvm/pull/14381) - [[TIR] Allow TransformLayout index_map to contain RVs](https://github.com/apache/tvm/pull/13930) - [[TIR] Allow TransformLayout with non-inversible index map](https://github.com/apache/tvm/pull/14095) - [[TIR] Fix typo in doc](https://github.com/apache/tvm/pull/14178) - [[TIR] Update block flags and simplify predicate in Reverse-Compute-Inline](https://github.com/apache/tvm/pull/14030) - [[TIR][TOPI][x86][CI] Support skylake avx512](https://github.com/apache/tvm/pull/13621) - [[TIR][TOPI][CI] Fix number of arguments in calls of llvm_pure_intrin](https://github.com/apache/tvm/pull/13881) - [[TIR][Compute-at] Utilize InverseAffineIterMap for dom estimation](https://github.com/apache/tvm/pull/14184) - [[TIR] Expose bitwise ops to python](https://github.com/apache/tvm/pull/13945) - [[TIR] Add merge primitive for TIR schedule](https://github.com/apache/tvm/pull/14398) - [[TensorIR][Primitive] New schedule primitive `reindex_cache_read/write`](https://github.com/apache/tvm/pull/14161) - [[TIR] Fix Datatype in Lower TVM Builtin](https://github.com/apache/tvm/pull/14347) - [[TIR] Enable Host Func Attribute for PrimFunc](https://github.com/apache/tvm/pull/14020) # TOPI - [[FIX][TOPI] Clip with 
IntImm/FloatImm](https://github.com/apache/tvm/pull/14027) - [[Fix,TOPI] Consolidate generic and x86 scatter nd](https://github.com/apache/tvm/pull/13755) - [[Test][Topi] Avoid depending on f32 rounding behavior for crop_and_divide tests](https://github.com/apache/tvm/pull/13773) - [[TOPI] Expose mem_scope from generic conv2d variants to be more reusable](https://github.com/apache/tvm/pull/13680) - [[TOPI][bugfix] Fix a bug in arm_cpu int8 dotprod schedule and modernize tests](https://github.com/apache/tvm/pull/13669) - [[TOPI] Bugfix arm_cpu schedule_conv2d_spatial_pack_nhwc schedule](https://github.com/apache/tvm/pull/14003) - [[TOPI][OP] Support grouped conv2d_NCHWc](https://github.com/apache/tvm/pull/13733) - [[TOPI] Fix batch_matmul tensorcore legalize for transpose_b = False case](https://github.com/apache/tvm/pull/13618) - [[TOPI] Group normalization](https://github.com/apache/tvm/pull/14193) - [[TOPI] dynamic externsion](https://github.com/apache/tvm/pull/14450) - [[TOPI] Fix tuple unpack in conv2d NCHWc int8](https://github.com/apache/tvm/pull/13761) - [[TOPI] Making test_strided_set require a GPU for testing](https://github.com/apache/tvm/pull/13804) - [[Fix][Relay][TOPI] Bug fix in relay.sum and topi.sum functions](https://github.com/apache/tvm/pull/14285) - [[TOPI][Fix] Pool must return error if layout is tiled on H](https://github.com/apache/tvm/pull/13975) - [[TOPI] Batch Norm Training Mode](https://github.com/apache/tvm/pull/14190) - [[topi] remove comment redundancy in resize.py](https://github.com/apache/tvm/pull/13860) - [[TOPI][Hexagon] Implement global_avg_pool2d for hexagon](https://github.com/apache/tvm/pull/13614) - [[TOPI] Support non-batch cases for topi.nll_loss](https://github.com/apache/tvm/pull/14060) - [[TOPI] Add instance_norm operator](https://github.com/apache/tvm/pull/14410) - [[TOPI] Support symbolic shape in einsum](https://github.com/apache/tvm/pull/14521) - [[TOPI][Relay][ONNX] Replace scatter_add by 
scatter_elements(reduction="add")](https://github.com/apache/tvm/pull/14008) - [[TOPI] Fix data race of batch multibox detection](https://github.com/apache/tvm/pull/14343) - [[TOPI] Fix index dtype in topi strided_slice](https://github.com/apache/tvm/pull/14022) - [[TORCH][TOPI] Support mean reduction for scatter_reduce](https://github.com/apache/tvm/pull/14110) # TVMC - [[TVMC] Fix logging in TVMC](https://github.com/apache/tvm/pull/14175) - [[TVMC] Stop printing a wall of warnings with tvmc tune](https://github.com/apache/tvm/pull/13882) - [[TVMC] Add option to dump TIR code to file](https://github.com/apache/tvm/pull/14186) - [[TVMC] Allow selecting a subset of tasks to be used in `tvmc tune`](https://github.com/apache/tvm/pull/12525) - [[TVMC] Improve --desired-layouts functionality](https://github.com/apache/tvm/pull/14272) - [[TVMC][microNPU] tvmc option for printing which operators are offloaded to Ethos-U](https://github.com/apache/tvm/pull/13212) - [[TVMC][TRANSFORMS] ToMixedPrecision transform support with custom options enabled](https://github.com/apache/tvm/pull/14010) # TVMScript - [[Fix][TVMScript]TVMScript BinOP printing refactor](https://github.com/apache/tvm/pull/14200) - [[TVMScript] Schedule error reporting with new TVMScript printer](https://github.com/apache/tvm/pull/13921) - [[TVMScript] Connect `assert_structural_equal` with new TVMScript printer](https://github.com/apache/tvm/pull/13859) - [[TVMScript] Comments and docstrings printing](https://github.com/apache/tvm/pull/13839) - [[TVMScript] `T.allocate` with `T.decl_buffer` syntax sugar for TVMScript printer](https://github.com/apache/tvm/pull/13813) - [[TVMScript] `T.match_buffer` syntax sugar in arguments for TVMScript printer](https://github.com/apache/tvm/pull/13801) - [[TVMScript] Linter-friendly function definitions](https://github.com/apache/tvm/pull/13713) - [[TVMScript][Fix] Fix `bool` printing for roundtrip](https://github.com/apache/tvm/pull/14390) - [[Fix][TVMScript] Fix 
`LetStmt` printing logic](https://github.com/apache/tvm/pull/13900) - [[TVMScript] More concise `T.allocate` syntax printing](https://github.com/apache/tvm/pull/13830) - [[TVMScript] Implicit root block syntax sugar for TVMScript printer](https://github.com/apache/tvm/pull/13819) - [[TVMScript] `T.axis.remap` syntax sugar for TVMScript printer](https://github.com/apache/tvm/pull/13743) - [[TVMScript] Robustify the Highlight Printer](https://github.com/apache/tvm/pull/13861) - [[TVMScript] Sugar Var Definition in TIR Buffer](https://github.com/apache/tvm/pull/14223) - [[TVMScript] Distinguish LetStmt and Let expression](https://github.com/apache/tvm/pull/14207) - [[TVMScript] Simplify TIR Var Definition](https://github.com/apache/tvm/pull/13970) - [[TVMScript][UX] Introduce decorator for deprecation](https://github.com/apache/tvm/pull/13941) - [[TVMScript] Support `show_meta`](https://github.com/apache/tvm/pull/13934) - [[TVMScript] Consolidate folder structure](https://github.com/apache/tvm/pull/13841) - [[TVMScript] Default to T.Buffer than T.buffer_decl](https://github.com/apache/tvm/pull/13838) - [[TVMScript] Introduce `PrinterConfig`](https://github.com/apache/tvm/pull/13831) - [[TVMScript] Add ObjectPath to LiteralDoc](https://github.com/apache/tvm/pull/13821) - [[TVMScript] Use TVMScript for all TIR Printing](https://github.com/apache/tvm/pull/13795) - [[TVMScript] Migrate More to TVMScripr Printer](https://github.com/apache/tvm/pull/13785) - [[TVMScript] IR Fragment Printing](https://github.com/apache/tvm/pull/13742) - [[TVMScript] Refactor IRDocsifier](https://github.com/apache/tvm/pull/13593) - [[TVMScript] Remove obsolete modules](https://github.com/apache/tvm/pull/13638) - [[TVMScript] Support SizeVar Roundtripping](https://github.com/apache/tvm/pull/14227) - [[TVMScript] Sugar T.env_thread + T.launch_thread](https://github.com/apache/tvm/pull/14217) - [[TVMScript] Encourage using T.Buffer directly](https://github.com/apache/tvm/pull/13971) - 
[[TVMScript] Unify `T.handle` and `T.Ptr`](https://github.com/apache/tvm/pull/13969) - [[TVMScript] Enable Safe Autocasting in BufferStore](https://github.com/apache/tvm/pull/13960) - [[TVMScript] Deterministic function ordering](https://github.com/apache/tvm/pull/13962) - [[TVMScript][Fix] Print Multi-line String as Metadata](https://github.com/apache/tvm/pull/13965) - [[TVMScript] Use op attribute to control whether to print dtype in TVMScript](https://github.com/apache/tvm/pull/14111) - [[TVMScript] Upstream IRModule parser from unity](https://github.com/apache/tvm/pull/14487) - [[TVMScript] Improved error message for unexpected top frame](https://github.com/apache/tvm/pull/14399) - [[TVMScript] Use new variable frame in If/Then/Else](https://github.com/apache/tvm/pull/14250) - [[Bugfix][TVMScript] Preserve variable names in LetStmt](https://github.com/apache/tvm/pull/14319) - [[TVMScript] More accurate hints for ImportError](https://github.com/apache/tvm/pull/13662) - [[TVMScript,Fix] Fix findsource when classes are indented](https://github.com/apache/tvm/pull/13924) - [[TVMScript][Printer] Remove relax prefix for now](https://github.com/apache/tvm/pull/14140) - [[Fix][TVMScript] Fix index of metadata in printed script](https://github.com/apache/tvm/pull/14130) - [[TVMScript] Fix print round-tripable multi thread env binding](https://github.com/apache/tvm/pull/13622) - [[TVMScript][Parser] Add more warp-level builtins and `Range`](https://github.com/apache/tvm/pull/14279) -- View it on GitHub: https://github.com/apache/tvm/releases/tag/v0.12.0.rc0