Re: [apache/incubator-tvm] [RFC][AutoTVM] Non-square ConfigSpace (#5809)

2020-06-26 Thread moderato
@masahi Yes, exactly, but it looks like TVM doesn't support it, does it?

I'm thinking of a temporary workaround: appending all the constraints and axis 
dependencies, as well as a function to verify a config's validity, to the 
ConfigEntity class. Every time a tuner's `next_batch` function is called, it 
calls the validation function and takes only the configs that are valid. I 
think directly changing the ConfigEntity class itself to keep only the valid 
configs might break things.
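The workaround could be sketched roughly as below. This is a hypothetical, self-contained model, not existing TVM API: the `ConstrainedSpace` wrapper, the constraint predicates, and `FilteringTuner` are all illustrative names.

```python
# Hypothetical sketch: a tuner that filters out configs violating
# user-registered constraints inside next_batch, leaving the grid
# config space itself untouched.

class ConstrainedSpace:
    """A flat (grid) config space plus a list of constraint predicates."""

    def __init__(self, configs, constraints):
        self.configs = configs          # all points of the grid space
        self.constraints = constraints  # callables: config -> bool

    def is_valid(self, config):
        return all(check(config) for check in self.constraints)


class FilteringTuner:
    def __init__(self, space):
        self.space = space
        self._cursor = 0

    def next_batch(self, batch_size):
        """Return up to batch_size configs, silently skipping invalid ones."""
        batch = []
        while len(batch) < batch_size and self._cursor < len(self.space.configs):
            cfg = self.space.configs[self._cursor]
            self._cursor += 1
            if self.space.is_valid(cfg):
                batch.append(cfg)
        return batch


space = ConstrainedSpace(
    configs=[{"tile_x": x, "tile_y": y} for x in (1, 2, 4, 8) for y in (1, 2, 4, 8)],
    constraints=[lambda c: c["tile_x"] * c["tile_y"] <= 16],
)
tuner = FilteringTuner(space)
print(tuner.next_batch(4))
```

Note that the skipped configs never reach measurement, which is exactly the tuner-feedback concern raised later in this thread.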

Would love to hear your advice!


-- 
You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub:
https://github.com/apache/incubator-tvm/issues/5809#issuecomment-650036986

Re: [apache/incubator-tvm] [RFC][AutoTVM] Non-square ConfigSpace (#5809)

2020-06-26 Thread Tianqi Chen
In this case it is not a polyhedral model, but just some constraints on the 
config space.

-- 
You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub:
https://github.com/apache/incubator-tvm/issues/5809#issuecomment-650211024

[TVM Discuss] [RFC] Canonicalizing AutoTVM Log Format

2020-06-26 Thread Cody H. Yu via TVM Discuss


## General Comments
IMHO, @merrymercy's comments on log files are valuable. Many users now look 
into the log file for the information they need, and some even manually modify 
logs for experiments or optimizations. This is possible because 1) the log 
files are in a text format, and 2) one config (line) in a log file is of 
reasonable length. As a result, at a high level I agree with @anwang's proposal 
to keep the log file in JSON format but use a proto-generated schema to 
(de)serialize it. IIUC, this approach still allows users to modify the log file 
manually if needed.

On the other hand, one concern I have with the current proposal is the 
`workload` field. Semantically, the `workload` mentioned in the proposal is 
more like a `task`, as it has `task_name` and `args`. A workload should be a 
list of input tensors, which is independent of tasks. Here is a complete 
example of a conv2d task:

```
"task": {
  "task_name": "conv2d_NCHWc.x86",
  "args": [
    {"tensor": {"name": "data", "shape": [1, 3, 224, 224], "dtype": "float32"}},
    {"tensor": {"name": "weight", "shape": [32, 3, 3, 3], "dtype": "float32"}},
    [1, 1], [1, 1, 1, 1], [1, 1], "NCHW", "NCHW", "float32"
  ]
},
```

In addition, one problem is that `args` is just a list of task arguments, so 
it's hard for people to understand their actual meaning. It'd be great if we 
could also improve the task initialization process to take keyword arguments 
instead of positional arguments. As a result, we could have:

```
"task": {
  "task_name": "conv2d_NCHWc.x86",
  "args": {
    "data": {"tensor": {"name": "data", "shape": [1, 3, 224, 224], "dtype": "float32"}},
    "weight": {"tensor": {"name": "weight", "shape": [32, 3, 3, 3], "dtype": "float32"}},
    "strides": [1, 1],
    "pooling": [1, 1, 1, 1],
    "dilation": [1, 1],
    "data_layout": "NCHW",
    "output_layout": "NCHW",
    "dtype": "float32"
  }
},
```
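The positional-to-keyword change could be prototyped with a small helper that pairs a task's positional args with the parameter names of its registered compute function. Everything below is illustrative: the `conv2d_NCHWc` signature and the placeholder tensor values are assumptions, not TVM's actual definitions.

```python
# Hypothetical helper: turn a task's positional args into keyword args
# by reading the compute function's parameter names.
import inspect

def args_to_kwargs(compute_fn, args):
    names = list(inspect.signature(compute_fn).parameters)
    return dict(zip(names, args))

def conv2d_NCHWc(data, weight, strides, padding, dilation,
                 data_layout, out_layout, out_dtype):
    ...  # compute definition elided in this sketch

kwargs = args_to_kwargs(
    conv2d_NCHWc,
    ["data_tensor", "weight_tensor", [1, 1], [1, 1, 1, 1], [1, 1],
     "NCHW", "NCHW", "float32"])
print(kwargs["strides"])  # [1, 1]
```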

## Ansor's Log Format
As @merrymercy mentioned, since Ansor targets a subgraph instead of a single 
operator, the `task_name` would be an issue. The current approach of using a 
hashed subgraph is definitely not user friendly, and we cannot re-establish the 
subgraph by interpreting its hash value. A better solution would be to provide 
a utility that serializes a compute DAG to a string, and another that 
deserializes the string back into the compute DAG.
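A rough sketch of such a round-trip utility pair, under the assumption that a compute DAG can be flattened into a topologically sorted node list (the representation here is purely illustrative, not Ansor's actual data structure):

```python
# Hypothetical DAG (de)serialization: store each node's op name, input
# indices into the node list, and attributes as human-readable JSON,
# instead of an opaque hash.
import json

def serialize_dag(dag):
    """dag: list of (op_name, input_indices, attrs), topologically sorted."""
    return json.dumps([{"op": op, "inputs": ins, "attrs": attrs}
                       for op, ins, attrs in dag])

def deserialize_dag(s):
    return [(node["op"], node["inputs"], node["attrs"])
            for node in json.loads(s)]

dag = [("placeholder", [], {"shape": [1, 3, 224, 224]}),
       ("conv2d", [0], {"channels": 32})]
assert deserialize_dag(serialize_dag(dag)) == dag
```

The point is only that a text round-trip keeps the log self-describing, which a hash value cannot.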





---
[Visit 
Topic](https://discuss.tvm.ai/t/rfc-canonicalizing-autotvm-log-format/7038/20) 
to respond.

You are receiving this because you enabled mailing list mode.

To unsubscribe from these emails, [click 
here](https://discuss.tvm.ai/email/unsubscribe/d10feb0e8f50996a70c9e24ab365f1af4e98dd278641f8eeecdac82a2be3cf6c).


Re: [apache/incubator-tvm] [RFC][AutoTVM] Non-square ConfigSpace (#5809)

2020-06-26 Thread Cody Yu
Polyhedral analysis would be one approach to generating the constraints in this 
scenario. On the other hand, runtime validation does not sound like a general 
solution, because it might affect the tuner. For example, discarding invalid 
configs in `next_batch` would result in no measurement results for those 
records, which means a learning-based tuner won't get feedback on the invalid 
configs. I would prefer either of the following:

1. Propose a new config space representation to support non-grid config spaces.
2. Make verification passes pluggable. Currently, we have a `VerifyGPU` pass 
that traverses the TIR to estimate memory usage and rejects invalid configs 
before sending them for compilation. Since this happens at the evaluation 
stage, the rejected configs still appear in the log file with a proper error 
code, so the tuner can benefit from them. We could turn this mechanism into a 
callback so that users can bring their own verifiers. The problem is that a 
verifier does not have config space information, only a graph in TIR, so it 
might be harder to check whether a config is valid.
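The second option could be sketched as follows. This is an assumed API, not current TVM: the `MeasureErrorNo` values and `measure_batch` shape only mirror the idea that rejected configs still produce a logged record with an error code, so learning-based tuners receive negative feedback rather than silence.

```python
# Sketch of the pluggable-verifier idea: user-supplied predicates run
# before compilation, and rejections are recorded rather than dropped.
from enum import Enum

class MeasureErrorNo(Enum):
    NO_ERROR = 0
    INVALID_CONFIG = 1   # rejected by a verifier before compilation

def measure_batch(configs, build_fn, verifiers):
    records = []
    for cfg in configs:
        if not all(verify(cfg) for verify in verifiers):
            # still emit a record, so the tuner can learn from the rejection
            records.append((cfg, MeasureErrorNo.INVALID_CONFIG, float("inf")))
            continue
        cost = build_fn(cfg)
        records.append((cfg, MeasureErrorNo.NO_ERROR, cost))
    return records

# A user-supplied verifier, e.g. rejecting configs with oversized tiles.
records = measure_batch(
    configs=[{"tile": 4}, {"tile": 1024}],
    build_fn=lambda c: 1.0 / c["tile"],
    verifiers=[lambda c: c["tile"] <= 256],
)
print([err.name for _, err, _ in records])  # ['NO_ERROR', 'INVALID_CONFIG']
```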





-- 
You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub:
https://github.com/apache/incubator-tvm/issues/5809#issuecomment-650312544

[apache/incubator-tvm] Pre-release v0.6.1.rc0 - Apache TVM (incubating) v0.6.1.rc0

2020-06-26 Thread Yizhi Liu
# Bug Fixes

* Fixed process termination routine in windows #4844
* [Runtime] Fix NDArray SaveDLTensor declaration and implementation signature 
different #4586
* [NODE][Serialization]fix serialization precision loss in float #4503
* [Relay][Frontend][TF] fix _parse_param bug #4711
* Fix bias_add gradient #4516
* Make sure to visit the arguments of inlined functions #4783
* Fix Python syntax error in start_rpc_server_to_tracker.py #4682
* [Bugfix] Fixed crash caused by reversing bitwise operations #4852
* [Fix][VM] Fix copy constructor #5237
* fix small bug about dense_grad #5695
* [Fix] Fix conv2d alter op for arm cpu #5532
* [Fix] Fix dense x86 schedule #4728
* [Relay][Fix] Fix alter op layout when calling a global var #4454
* [Relay][Pass] Fix lambda lift pass for recursive call #4432
* [BUGFIX] Fix search path for libtvm_topi.so #4467
* [Bugfix] Fix Python debugger segfaults with TVM built with LLVM #5685
* [RUNTIME] Fix compile errors of OpenCL FPGA backend #4492
* [BUGFIX][BACKPORT-0.6][ARITH] Fix FloorMod Simplifier #5509
* Some Windows and MSVC fixes #4569
* [Chisel][VTA] Fix multiple transfer issue in LoadUop module #4442
* [VTA] Fix an issue in updating uop_idx in the TensorGemm module #4694
* [VTA] Fixed a crash issue in TSIM driver #4527
* [VTA] Enable streamlined GEMM execution #4392
* [VTA][Chisel] End-to-end Inference with Chisel VTA #4574
* Added declare of aluBits for TensorAlu #4624
* [Quantization] Fix annotation for multiply op #4458
* LRN only supports 4D tensors, remove it from alter_op_layout #5520
* fix topi.nn.global_pool layout="NHWC" #4656
* [FFI][Windows] Fix hasattr by extracting Python error type from Windows error 
message #4780
* [Runtime] Export GraphRuntime in tvm_runtime.dll #5002
* Fix Base64OutStream portability issue #4668
* [AUTOTVM] Fix a bug in generating the search space #4779
* [Relay][VM] Fix compilation of If-Elses #5040
* [RELAY][FRONTEND][TENSORFLOW] Fix FuseBatchNorm output cast error if 
need_cast is True #4894
* [Bugfix] fskip of EliminateCommonSubexpr cannot always return false #4620
* [Fix] Add ConstantNode to IsAtomic #5457
* [Fix] Fix RemoveUnusedFunctions pass #4700
* [Realy][fix] Fix alpha_equal bug for attribute check #4897
* [Arith] keep div_mode during floordiv simplify #5922
* [ARITH][BACKPORT-0.6] fix a min/max simplify bug #5761
* [0.6-BACKPORT] Improve robustness of the docs build #5583

-- 
You are receiving this because you are subscribed to this thread.
View it on GitHub:
https://github.com/apache/incubator-tvm/releases/tag/v0.6.1.rc0

[apache/incubator-tvm] [VOTE] Release Apache TVM (incubating) v0.6.1.rc0 (#5939)

2020-06-26 Thread Yizhi Liu
Dear TVM community,

This is a call for vote to release Apache TVM (incubating) version 0.6.1. This 
is a maintenance release incorporating important bug fixes. All users of Apache 
TVM (incubating) 0.6.0 are advised to upgrade.

Link to release notes:
https://github.com/apache/incubator-tvm/releases/tag/v0.6.1.rc0

Link to release candidate:
https://dist.apache.org/repos/dist/dev/incubator/tvm/tvm-v0.6.1-rc0

The vote will be open for at least 72 hours. Everyone is welcome to vote. 
Please vote by replying to this thread explicitly.

+1 = approve
+0 = no opinion
-1 = disapprove (provide reason)

NOTE: this thread is being mirrored in dev@

-- 
You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub:
https://github.com/apache/incubator-tvm/issues/5939

[TVM Discuss] [Development/RFC] [RFC] MISRA-C changes for RPC support

2020-06-26 Thread Liangfu Chen via TVM Discuss


> could you be a bit more specific wrt your API design comment?

Sorry, I meant being consistent with the registry API / implementation defined 
in 
[src/runtime/registry.cc](https://github.com/apache/incubator-tvm/blob/master/src/runtime/registry.cc).
 Specifically, I think it's better to implement Registry::Manager for the CRT, 
instead of implementing TVMFuncRegistry, which breaks consistency between the 
two runtime implementations. In addition, the file could be named `registry.c` 
to maintain consistency.

Maintaining consistency helps existing users of the default runtime easily 
switch to the CRT for their resource-constrained systems, and it'll be easier 
for future contributors to dive into the implementation.





---
[Visit 
Topic](https://discuss.tvm.ai/t/rfc-misra-c-changes-for-rpc-support/7098/5) to 
respond.



Re: [apache/incubator-tvm] [VOTE] Release Apache TVM (incubating) v0.6.1.rc0 (#5939)

2020-06-26 Thread Tianqi Chen
+1 (binding),  I checked

- Signatures and hashes good
- DISCLAIMER, LICENSE, NOTICE
- Signatures and hashes
- No unexpected binary files
- Code compiles

TQ

-- 
You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub:
https://github.com/apache/incubator-tvm/issues/5939#issuecomment-650460122

[TVM Discuss] [uTVM] [RFC] Improvements to Automatic Quantization for Bare-Metal

2020-06-26 Thread Logan Weber via TVM Discuss


For bare-metal devices, it is desirable (for both space and performance 
reasons) to have a network that consists entirely of integral data types (most 
often `int8`).  However, the automatic integer quantization mechanism in Relay 
does not serve this use case for two reasons:
1) Inputs are assumed to be `float32`, so they are quantized at the network's 
prefix, and outputs are forced into `float32`, so they are dequantized at the 
network's suffix.
2) The quantization pass is geared towards only the most time-consuming 
operators (e.g., `conv2d` and `dense`), leaving many others in `float32`.

We propose two improvements to automatic integer quantization that address 
these problems: quantize/dequantize partitioning and expanded operator coverage.

## Quantize/Dequantize Partitioning

This feature adds a configuration parameter `partition_conversions` to Relay's 
[quantize](https://github.com/apache/incubator-tvm/blob/master/python/tvm/relay/quantize/quantize.py#L320)
 API that specifies whether to partition a quantized module into a module with 
the following functions:

- `quantize_inputs`: converts inputs into the quantized data space
- `quantized_main`: runs the core network that contains only quantized operators
- `dequantize_outputs`: converts outputs back into the unquantized data space
- `main`: calls `quantize_inputs`, `quantized_main`, and `dequantize_outputs` 
in succession, resulting in equivalent behavior to a quantized module that has 
**not** been partitioned.

If there are unquantized operators in the core network, an exception is raised. 
 The default value is `False`.

As an example of this feature in motion, consider the module below:

```c
def @main(%x: Tensor[(1, 4, 16, 16), float32], %w: Tensor[(4, 4, 3, 3), float32]) -> Tensor[(1, 4, 16, 16), float32] {
  nn.conv2d(%x, %w, padding=[1, 1, 1, 1], channels=4, kernel_size=[3, 3])
}
```

After quantization, we see three distinct sections of the function (input 
quantization, core `int8` network, and output dequantization), delimited below 
by the horizontal bars.

```c
def @main(%x: Tensor[(1, 4, 16, 16), float32]) -> Tensor[(1, 4, 16, 16), float32] {
  %0 = multiply(%x, 16f)                 /* ty=Tensor[(1, 4, 16, 16), float32] */;
  %1 = round(%0)                         /* ty=Tensor[(1, 4, 16, 16), float32] */;
  %2 = clip(%1, a_min=-127f, a_max=127f) /* ty=Tensor[(1, 4, 16, 16), float32] */;
  %3 = cast(%2, dtype="int8")            /* ty=Tensor[(1, 4, 16, 16), int8] */;
---
  %4 = nn.conv2d(
    %3,
    meta[relay.Constant][0],
    padding=[1, 1, 1, 1],
    channels=4,
    kernel_size=[3, 3],
    out_dtype="int32")                          /* ty=Tensor[(1, 4, 16, 16), int32] */;
  %5 = add(%4, meta[relay.Constant][1])         /* ty=Tensor[(1, 4, 16, 16), int32] */;
  %6 = right_shift(%5, meta[relay.Constant][2]) /* ty=Tensor[(1, 4, 16, 16), int32] */;
  %7 = clip(%6, a_min=-127f, a_max=127f)        /* ty=Tensor[(1, 4, 16, 16), int32] */;
  %8 = cast(%7, dtype="int8")                   /* ty=Tensor[(1, 4, 16, 16), int8]  */;
  %9 = annotation.stop_fusion(%8)               /* ty=Tensor[(1, 4, 16, 16), int8]  */;
---
  %10 = cast(%9, dtype="float32")               /* ty=Tensor[(1, 4, 16, 16), float32] */;
  multiply(%10, 0.0625f)                        /* ty=Tensor[(1, 4, 16, 16), float32] */
}
```

If `partition_conversions == True`, then the module above is converted to the 
module below.

```c
def @quantize_inputs(%x: Tensor[(1, 4, 16, 16), float32]) -> (Tensor[(1, 4, 16, 16), int8],) {
  %0 = multiply(%x, 16f);
  %1 = round(%0);
  %2 = clip(%1, a_min=-127f, a_max=127f);
  (cast(%2, dtype="int8"),)
}

def @quantized_main(%x: Tensor[(1, 4, 16, 16), int8]) -> Tensor[(1, 4, 16, 16), int8] {
  %0 = nn.conv2d(
    %x,
    meta[relay.Constant][0],
    padding=[1, 1, 1, 1],
    channels=4,
    kernel_size=[3, 3],
    out_dtype="int8");
  %1 = add(%0, meta[relay.Constant][1]);
  %2 = right_shift(%1, meta[relay.Constant][2]);
  %3 = clip(%2, a_min=-127f, a_max=127f);
  %4 = cast(%3, dtype="int8");
  annotation.stop_fusion(%4)
}

def @dequantize_outputs(%x: Tensor[(1, 4, 16, 16), int8]) -> Tensor[(1, 4, 16, 16), float32] {
  %0 = cast(%x, dtype="float32");
  multiply(%0, 0.0625f)
}

def @main(%x: Tensor[(1, 4, 16, 16), float32]) -> Tensor[(1, 4, 16, 16), float32] {
  let %quantized_inputs = @quantize_inputs(%x);
  let %quantized_outputs = @quantized_main(%quantized_inputs.0);
  @dequantize_outputs(%quantized_outputs)
}
```

**Note:** This new option won't be very helpful on its own until we've expanded 
operator coverage, since most networks will include unquantized operators.

### Further Considerations
Along with the quantize/dequantize functions, for IoT applications, even once 
you *have* a purely integral network, quantization gives no hints as to how you 
should convert from raw sensor data into the

[TVM Discuss] [Development/RFC] [RFC] MISRA-C changes for RPC support

2020-06-26 Thread Andrew Reusch via TVM Discuss


Ah gotcha. I think there's one thing there I'm not sure how to solve in a good 
way: TVM generates TVMBackendPackedCFunc from the llvm and c codegens. Because 
the TVMFuncCall RPC call accepts a function handle, we have to provide 
something there that contains enough data to differentiate between 
TVMBackendPackedCFunc and PackedFunc. The C++ runtime does this with 
LibraryModule, which creates a closure. I would say on the embedded side we 
should avoid this strategy, since it requires some type of dynamic memory 
allocation.

We could store some module- or function-level bits to differentiate, but then 
that's also more RAM. I had some discussions with @tqchen about unifying the 
runtime impls--maybe he can weigh in on this? It seems like we could just unify 
the runtime impls, and make TVMBackendPackedCFunc the cross-runtime standard 
(even if we modify its signature). Then runtimes would only need to know how to 
call that one type of function.
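The "function-level bits" alternative can be modeled in a few lines. This is a toy model for illustration only, not the C runtime's actual data layout: the tag constants, registry shape, and calling conventions are all assumptions.

```python
# Toy model of the trade-off above: instead of wrapping each
# TVMBackendPackedCFunc in a heap-allocated closure (as LibraryModule
# does in the C++ runtime), keep one tag per registry entry so the RPC
# layer knows how to invoke the handle without dynamic allocation.
FUNC_KIND_PACKED = 0          # PackedFunc-style callable
FUNC_KIND_BACKEND_PACKED = 1  # TVMBackendPackedCFunc-style callable

registry = {}  # name -> (kind, fn)

def register(name, kind, fn):
    registry[name] = (kind, fn)

def call(name, *args):
    kind, fn = registry[name]
    if kind == FUNC_KIND_BACKEND_PACKED:
        # backend-packed functions take an explicit args array + count
        return fn(list(args), len(args))
    return fn(*args)

register("add", FUNC_KIND_PACKED, lambda a, b: a + b)
register("sum_backend", FUNC_KIND_BACKEND_PACKED, lambda args, n: sum(args[:n]))
assert call("add", 2, 3) == 5
assert call("sum_backend", 1, 2, 3) == 6
```

The RAM cost of the tag is what the message above weighs against unifying on a single cross-runtime function type.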





---
[Visit 
Topic](https://discuss.tvm.ai/t/rfc-misra-c-changes-for-rpc-support/7098/6) to 
respond.



Re: [apache/incubator-tvm] [VOTE] Release Apache TVM (incubating) v0.6.1.rc0 (#5939)

2020-06-26 Thread masahi
+1

-- 
You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub:
https://github.com/apache/incubator-tvm/issues/5939#issuecomment-650468556

Re: [apache/incubator-tvm] [VOTE] Release Apache TVM (incubating) v0.6.1.rc0 (#5939)

2020-06-26 Thread Thierry Moreau
+1

Thierry

> On Jun 26, 2020, at 6:16 PM, masahi  wrote:
> 
> +1
> 
> -- 
> You are receiving this because you are subscribed to this thread.
> Reply to this email directly or view it on GitHub:
> https://github.com/apache/incubator-tvm/issues/5939#issuecomment-650468556


-
To unsubscribe, e-mail: dev-unsubscr...@tvm.apache.org
For additional commands, e-mail: dev-h...@tvm.apache.org



Re: [apache/incubator-tvm] [VOTE] Release Apache TVM (incubating) v0.6.1.rc0 (#5939)

2020-06-26 Thread ziheng
+1

-- 
You are receiving this because you commented.
Reply to this email directly or view it on GitHub:
https://github.com/apache/incubator-tvm/issues/5939#issuecomment-650471485

Re: [apache/incubator-tvm] [VOTE] Release Apache TVM (incubating) v0.6.1.rc0 (#5939)

2020-06-26 Thread Junru Shao
+1

-- 
You are receiving this because you commented.
Reply to this email directly or view it on GitHub:
https://github.com/apache/incubator-tvm/issues/5939#issuecomment-650476981

Re: [apache/incubator-tvm] [VOTE] Release Apache TVM (incubating) v0.6.1.rc0 (#5939)

2020-06-26 Thread Siju Samuel
+1

-- 
You are receiving this because you commented.
Reply to this email directly or view it on GitHub:
https://github.com/apache/incubator-tvm/issues/5939#issuecomment-650478109

Re: [apache/incubator-tvm] [VOTE] Release Apache TVM (incubating) v0.6.1.rc0 (#5939)

2020-06-26 Thread Jared Roesch
+1 

-- 
You are receiving this because you commented.
Reply to this email directly or view it on GitHub:
https://github.com/apache/incubator-tvm/issues/5939#issuecomment-650484640

Re: [apache/incubator-tvm] [RFC][AutoTVM] Non-square ConfigSpace (#5809)

2020-06-26 Thread moderato
@tqchen @comaniac Thanks for the comments! The feedback from invalid configs is 
something I didn't think of. Actually, this is the representation I came up 
with for a non-grid config space. If the config space is too large and there 
are so many constraints that the shape of the space is very irregular, it will 
cost too much memory to store the whole config space at runtime. Therefore, I'm 
thinking of storing a grid space in the current representation, together with 
its constraints, as the solution. I wonder if it is possible to throw the error 
code in `next_batch`? Or can the error code only be thrown by a low-level check?

The second plan sounds like an interesting idea. Would it be a new pass, as you 
suggest? My concern is that as Ansor comes out, the pass might not be of much 
use in the future. What's your opinion?

-- 
You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub:
https://github.com/apache/incubator-tvm/issues/5809#issuecomment-650502937