Hi Animesh,
The problem is that I need the padding to be added in the middle of the TIR
computation, on my (transformed) data tensor.
I.e., something like
```
A1 = im2col(A)
A2 = pad(A1)
C_padded = te.compute([M, N], lambda i, j: te.sum(A2[i, k] * B[k, j], axis=k))
C = unpad(C_padded) + requantization
```
Then I tile on `C` and tensorize over the inner tile.
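For concreteness, a minimal TE sketch of that pattern could look as follows (the shapes, the tile multiples, and the slice-based `unpad` are made up for illustration; requantization is omitted):
```python
import tvm
from tvm import te

# Illustrative sizes only: rows padded to a multiple of 4,
# the reduction axis padded to a multiple of 16.
M, K, N = 62, 30, 24
M_pad = ((M + 3) // 4) * 4
K_pad = ((K + 15) // 16) * 16

A1 = te.placeholder((M, K), name="A1")    # im2col'd input
B = te.placeholder((K_pad, N), name="B")  # weights, assumed pre-padded

# pad(A1): zero-fill outside the original extent
A2 = te.compute(
    (M_pad, K_pad),
    lambda i, k: tvm.tir.if_then_else(
        tvm.tir.all(i < M, k < K), A1[i, k], tvm.tir.const(0, A1.dtype)
    ),
    name="A2",
)

# GEMM on the padded buffers
kk = te.reduce_axis((0, K_pad), name="kk")
C_padded = te.compute(
    (M_pad, N), lambda i, j: te.sum(A2[i, kk] * B[kk, j], axis=kk), name="C_padded"
)

# unpad: slice back to the original number of rows
C = te.compute((M, N), lambda i, j: C_padded[i, j], name="C")

s = te.create_schedule(C.op)
```
The catch, as discussed in this thread, is that the padded rows are never consumed downstream, so bound inference shrinks these intermediate stages and the padding disappears.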
How about using Relay Legalize pass to add an explicit padding at the graph
level?
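For reference, a legalization hook for this would look roughly like the sketch below. The signature follows TVM's usual FTVMLegalize convention; the NHWC/HWIO layouts and the multiple-of-4 channel requirement are assumptions for illustration, not taken from the thread:
```python
from tvm import relay

# Rough sketch of an FTVMLegalize-style hook. Zero-padding the reduction
# (input-channel) axis does not change the convolution result, so no
# slicing is needed afterwards.
def _conv2d_legalize_pad_channels(attrs, inputs, arg_types):
    data, kernel = inputs
    data_type = arg_types[0]

    in_channel = data_type.shape[3].value          # NHWC: channels last (assumed)
    pad_c = (4 - in_channel % 4) % 4
    if pad_c == 0:
        return None                                # None = keep the original op

    # Pad the channel axis of the data and the input-channel axis of the
    # kernel (HWIO assumed) so the reduction length becomes a multiple of 4.
    data = relay.nn.pad(data, pad_width=((0, 0), (0, 0), (0, 0), (0, pad_c)))
    kernel = relay.nn.pad(kernel, pad_width=((0, 0), (0, 0), (0, pad_c), (0, 0)))

    new_attrs = {k: attrs[k] for k in attrs.keys()}
    return relay.nn.conv2d(data, kernel, **new_attrs)
```
The hook would then be picked up when the `relay.transform.Legalize()` pass runs; how it gets registered depends on the op and target the kernel is attached to.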
---
[Visit Topic](https://discuss.tvm.ai/t/loop-partitioning-padding-and-tensorization/7753/2) to respond.
"devices" sounds much better than "accelerators"
---
[Visit Topic](https://discuss.tvm.ai/t/rfc-composite-target/7744/8) to respond.
Thanks for the suggestion. "devices" sounds good to me.
---
[Visit Topic](https://discuss.tvm.ai/t/rfc-composite-target/7744/7) to respond.
I see, I am just debating whether `accelerators` is the right name. Perhaps
`devices`?
---
[Visit Topic](https://discuss.tvm.ai/t/rfc-composite-target/7744/6) to respond.
Hi all,
In my effort to accelerate AArch64 through tensorization, I ran into an
issue.
Basically, I am padding my input tensor to let `tensorize` work (I need the
rows to be a multiple of 4 and the columns to be a multiple of 16).
However, bound inference removes the padding (since it is not used) and
[quote="tqchen, post:3, topic:7744, full:true"]
I agree P2 is better. However, we need to be mindful that the composite can go
beyond single accelerator settings. For example, we might also want to compose
`arm_cl` and opencl on ARM GPU
[/quote]
Since `accelerators` is an array, we can specify
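For instance, such a configuration might look like the sketch below (purely illustrative: the `accelerators` field follows the proposal in this RFC, but the `kind`/`target_host` fields and the target strings are my own assumptions, and the exact schema is still under discussion):
```python
# Illustrative only; the schema is not settled in this RFC.
composite_target = {
    "kind": "composite",
    "target_host": "llvm -mtriple=aarch64-linux-gnu",
    "accelerators": [
        "arm_cl",                # offload supported subgraphs to Arm Compute Library
        "opencl -device=mali",   # and/or to the ARM GPU via OpenCL
    ],
}
```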
I have also tried bypassing this issue by passing the three tensor objects
inside a sparse array.
```python
from tvm.contrib import sparse

# create placeholder tensors
...

# flattened weight dimensions (im2col-style)
n = out_c
k = kdim_h * kdim_w * in_c

# CSR placeholder bundling the data/indices/indptr tensors;
# nonzeros should be an integer count
sparse_weights = sparse.placeholder((n, k),
                                    nonzeros=int((1 - sparsity) * n * k),
                                    name='W')
```
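If useful, the CSR placeholder exposes the three underlying tensors as attributes (I'm assuming the `data`/`indices`/`indptr` attribute names from `tvm.contrib.sparse`'s CSR placeholder), so they can be handed to the compute and schedule individually:
```python
# Assumed attributes of the CSR placeholder: data, indices, indptr
w_data = sparse_weights.data
w_indices = sparse_weights.indices
w_indptr = sparse_weights.indptr
```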
I believe using this needs CMake 3.12 or later, because of the use of
FindPython3 in your CMake modules. This would require an update to the
install-from-source documentation, which currently implies a requirement of
CMake > 3.5 for building TVM.
---
[Visit Topic](https://discuss.tvm.ai/t/add-the-do
[quote="leandron, post:52, topic:6844"]
I’m interested to understand how the JSON would look for case 3
(above). Also, could you expand a little on how you see this working,
specifically who would oversee the graph partitioning process (and call the
expected passes in the expected order)?
[quote="zhiics, post:3, topic:7741"]
a `MetadataModule` could contain a DSOModule and one or more
CSourceModule/JSONRuntimeModule. It seems a bit hard to save them out as one
file for compilation though.
[/quote]
I see what you mean. It might be the case that, for some modules, they don't
offer an
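For what it's worth, the existing single-file path is `export_library`, which compiles a module together with its imported modules into one shared object. A trivial, self-contained sketch just to show that mechanism (no external codegen involved):
```python
import os
import tempfile

import tvm
from tvm import te

# Build a tiny module purely to illustrate saving everything as one artifact.
n = te.var("n")
A = te.placeholder((n,), name="A")
B = te.compute((n,), lambda i: A[i] + 1.0, name="B")
s = te.create_schedule(B.op)
lib = tvm.build(s, [A, B], target="llvm", name="add_one")

# export_library() compiles/links the host module and its imported modules
# (which, with external codegen, can include CSourceModules) into one .so
path = os.path.join(tempfile.mkdtemp(), "deploy.so")
lib.export_library(path)
print(lib.imported_modules)
```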