This is an automated email from the ASF dual-hosted git repository.

spectrometerHBH pushed a change to branch tir-bench
in repository https://gitbox.apache.org/repos/asf/tvm.git


      at cd8790bc54 feat(infra): util-gated GPU selection for tir-bench + per-config warmup/repeat

This branch includes the following new commits:

     new 52ecb035ec tirx
     new f7f287f88f feat(op): add permute_layout primitive; remove permute_dims (#629)
     new 6053f82fcc feat(op-dispatch): add warp ldmatrix/stmatrix dispatch for Tx.copy (#630)
     new 7addfb1da2 refactor(op-dispatch): rename copy/warp_matrix.py to copy/ld_stmatrix.py (#631)
     new 7ea22041de test(op-dispatch): add wg-scope both-sides-permuted invariance test (#632) (#633)
     new d4b9e91923 refactor(codegen): remove tirx.entry_cluster_sync codegen attribute (#634)
     new 36b274d738 feat(tvmscript): add @Tx.jit decorator, Tx.constexpr params, Tx.wg_reg_tile (#635)
     new bb15ecc80a refactor(lower-tirx): replace ScopeKind::kKernel with Tx.device_entry() marker (#636)
     new bb57b65fd8 fix(tirx/stmt_functor): add ScopeIdDefStmt to Python StmtFunctor dispatch (#637)
     new b148ee5f5f refactor(lower-tirx): drop Tx.filter wrapper for canonical thread filters (#638)
     new bad4d0a932 feat(tirx): add typed pointer byte-offset intrinsic (#641)
     new a394fd58b0 docs: update tir bench baseline results (#642)
     new aff1a6ce34 refactor(op-dispatch): split CUDA copy into reg + gmem_smem + ldgsts (#640)
     new e3271628f4 feat(tirx): add .16x{64,128,256}b tcgen05.ld/st dispatch + factory (#644)
     new 3432c72a50 refactor(op-dispatch): ewise broadcast at layout level + copy vec-alignment fix (#645)
     new ce87b82cc4 feat(tirx): add M=128 dispatch + layout for .16x*b tcgen05.ld/st (#646)
     new 714a795729 revert(submodules): undo accidental 3rdparty pointer bumps in #646 (#647)
     new aef7bd7e97 feat(gemm_async): accept Layout F C operand for M=64 MMAs (#648)
     new 010246e61f feat(infra): add tir-bench slash command for pre-commit regression check (#650)
     new f715671412 fix(arith): gate canonical-simplify LT Case 2 on extra scale == +1 (#651)
     new aa035ff69f fix(arith): memoize IntervalSet variable relaxation to avoid exponential blowup (#652)
     new cd8790bc54 feat(infra): util-gated GPU selection for tir-bench + per-config warmup/repeat

The 22 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.

