As we start to build multiple modules, it is useful to start modularizing the
unit tests, with the goal of reducing the number of actual integration tests.
Previously, quite a few tests were written in a way that directly invokes
end-to-end compilation; we also have tests that are coupled with legacy
>
The new branch and tag are now ready.
--
Reply to this email directly or view it on GitHub:
https://github.com/apache/tvm/issues/15134#issuecomment-1616748355
You are receiving this because you are subscribed to this thread.
Message ID:
It is worth pointing out that:
* Most of the existing tests are CPU-bound, including those that use GPUs for
execution (end-to-end tests), which also rely heavily on the CPU for code
generation
* All e2e tests can be decoupled into host-side compilation on the CPU plus
execution on the device (e.g. GPUs)
* Brute
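The decoupling described in the bullets above can be sketched as two separate stages. This is only an illustration in plain Python: `CompiledArtifact`, `compile_on_host`, and `execute_on_device` are hypothetical stand-ins, not TVM APIs, and the "compilation" here is just building a closure.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class CompiledArtifact:
    """Hypothetical result of the CPU-only stage (a real one would be a built module)."""
    name: str
    run: Callable[[List[float]], List[float]]


def compile_on_host(name: str, scale: float) -> CompiledArtifact:
    # Stage 1: CPU-bound work standing in for code generation. Needs no GPU.
    return CompiledArtifact(name=name, run=lambda xs: [scale * x for x in xs])


def execute_on_device(artifact: CompiledArtifact, inputs: List[float]) -> List[float]:
    # Stage 2: standing in for device execution. Only this stage would need a GPU.
    return artifact.run(inputs)


# Stage 1 could be scheduled on CPU-only workers...
artifact = compile_on_host("scale_by_two", 2.0)
# ...and stage 2 on the scarcer GPU workers, consuming the stored artifact.
result = execute_on_device(artifact, [1.0, 2.0, 3.0])
print(result)
```

Under this split, a test of the first stage alone never has to occupy a GPU machine, which is the point the bullets make about CPU-bound tests.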
Hi @spectrometerHBH, please open a PR against the **v0.13.0 branch**, like
[this one](https://github.com/apache/tvm/pull/14739/files), that changes the
version to `0.13.0` **on the v0.13.0 branch**.
--
Hi @slyubomirsky @tqchen, can we enable multiple outputs for
`call_tir_inplace`?
We have a use case in MLC-LLM that fuses rotary embedding and FlashAttention;
the programming interface is:
```
@T.prim_func
def fused_rotary_flashattention(k: T.Buffer(...), q: T.Buffer(...), v:
T.Buffer(...), o