Re: [apache/incubator-tvm] [RFC][AutoTVM] Non-square ConfigSpace (#5809)
@masahi Yes, exactly, but it looks like TVM doesn't support it, does it? I'm thinking of a temporary workaround: append all the constraints and axis dependencies, along with a function that verifies a config's validity, to the ConfigEntity class. Every time a tuner's `next_batch` function is called, it would invoke the validation function and keep only the configs that are valid. I think directly changing the ConfigEntity class itself to store only the valid configs might break other things. Would love to hear your advice!

View it on GitHub: https://github.com/apache/incubator-tvm/issues/5809#issuecomment-650036986
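[Editor's sketch] To make the filtering idea above concrete, here is a minimal, hypothetical Python sketch: keep the grid-shaped space as-is, attach constraint predicates to it, and filter the configs a tuner produces. All names here (`ConstrainedSpaceFilter`, `filter_batch`, the constraint lambdas) are illustrative, not existing AutoTVM APIs.

```python
# Hypothetical sketch of the proposed workaround, not an AutoTVM API.
class ConstrainedSpaceFilter:
    def __init__(self, constraints):
        # constraints: list of callables, each mapping a config -> bool
        self.constraints = constraints

    def is_valid(self, config):
        # A config is valid only if it satisfies every constraint.
        return all(check(config) for check in self.constraints)

    def filter_batch(self, configs):
        # Keep only valid configs; the tuner then measures just these.
        return [cfg for cfg in configs if self.is_valid(cfg)]

# Example: a dependency between two tiling knobs, where the product of
# the tile sizes must not exceed the extent of the axis (assumed 64).
space_filter = ConstrainedSpaceFilter([
    lambda cfg: cfg["tile_x"] * cfg["tile_y"] <= 64,
])
batch = [{"tile_x": 4, "tile_y": 8}, {"tile_x": 16, "tile_y": 8}]
print(space_filter.filter_batch(batch))  # only the first config survives
```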
Re: [apache/incubator-tvm] [RFC][AutoTVM] Non-square ConfigSpace (#5809)
In this case it is not a polyhedral model, just some constraints on the config space.

View it on GitHub: https://github.com/apache/incubator-tvm/issues/5809#issuecomment-650211024
[TVM Discuss] [RFC] Canonicalizing AutoTVM Log Format
## General Comments

IMHO, @merrymercy's comments on log files are valuable. Many users now look into the log file for the information they need, and even manually modify some logs for experiments or optimizations. This is possible because 1) the log files are in a text format, and 2) one config (line) in a log file has a reasonable length. As a result, at a high level I agree with @anwang's proposal that keeps the log file in JSON format but uses a proto-generated schema to (de)serialize it. IIUC, this approach still allows users to modify the log file manually if needed.

On the other hand, one point I have about the current proposal concerns `workload`. Semantically, the `workload` mentioned in the proposal is more like a `task`, as it has `task_name` and `args`. A workload should be a list of input tensors, which is independent of any task. Here is a complete example of a conv2d task:

```
"task": {
  "task_name": "conv2d_NCHWc.x86",
  "args": [{"tensor": {"name": "data", "shape": [1, 3, 224, 224], "dtype": "float32"}},
           {"tensor": {"name": "weight", "shape": [32, 3, 3, 3], "dtype": "float32"}},
           [1, 1], [1, 1, 1, 1], [1, 1], "NCHW", "NCHW", "float32"]
},
```

In addition, one problem is that `args` is just a list of task arguments, so it's hard for people to understand their actual meaning. It'd be great if we could also improve the task initialization process to take keyword arguments instead of positional arguments. As a result, we could have:

```
"task": {
  "task_name": "conv2d_NCHWc.x86",
  "args": {"data": {"tensor": {"name": "data", "shape": [1, 3, 224, 224], "dtype": "float32"}},
           "weight": {"tensor": {"name": "weight", "shape": [32, 3, 3, 3], "dtype": "float32"}},
           "strides": [1, 1],
           "padding": [1, 1, 1, 1],
           "dilation": [1, 1],
           "data_layout": "NCHW",
           "output_layout": "NCHW",
           "dtype": "float32"}
},
```

## Ansor's Log Format

As @merrymercy mentioned, since Ansor targets a subgraph instead of a single operator, the `task_name` would be an issue. The current approach of using a hashed subgraph is definitely not user friendly, and we cannot re-establish the subgraph from its hash value. A better solution would be to provide one utility to serialize a compute DAG to a string, and another utility to deserialize the string back to the compute DAG.

---
[Visit Topic](https://discuss.tvm.ai/t/rfc-canonicalizing-autotvm-log-format/7038/20) to respond.
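[Editor's sketch] One benefit of the one-record-per-line JSON format with keyword `args` is that tooling stays trivial. Here is a small sketch, using only the Python standard library, of how a tool could read such a log line and look up task arguments by keyword; the schema is the one proposed in this thread, not an existing TVM log format.

```python
import json

# One log record per line, following the proposed (not yet existing) schema.
line = ('{"task": {"task_name": "conv2d_NCHWc.x86", '
        '"args": {"strides": [1, 1], "padding": [1, 1, 1, 1], '
        '"dilation": [1, 1], "data_layout": "NCHW"}}}')

record = json.loads(line)
task = record["task"]
print(task["task_name"])        # conv2d_NCHWc.x86
print(task["args"]["strides"])  # [1, 1] -- readable by keyword, not position
```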
Re: [apache/incubator-tvm] [RFC][AutoTVM] Non-square ConfigSpace (#5809)
Polyhedral analysis would be one approach to generating the constraints in this scenario. On the other hand, runtime validation does not sound like a general solution, because it might affect the tuner: throwing away invalid configs in `next_batch` means there are no measurement results for those records, so a learning-based tuner never gets feedback on the invalid configs. I would prefer either of the following:

1. Propose a new config space representation that supports non-grid config spaces.
2. Make the verify passes pluggable. Currently, we have the `VerifyGPU` pass, which traverses the TIR to estimate memory usage and rejects invalid configs before sending them for compilation. Since this happens at the evaluation stage, the rejected configs still appear in the log file with a proper error code, so the tuner can benefit from them. We could turn this mechanism into a callback so that users can bring their own verifier. The problem is that the verifier does not have config space information, just a graph in TIR, so it might be more difficult to check whether a config is valid or not.

View it on GitHub: https://github.com/apache/incubator-tvm/issues/5809#issuecomment-650312544
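[Editor's sketch] As a rough illustration of option 2, here is a hedged Python sketch of what a pluggable verifier could look like. The callback signature, the error-code string, and `MAX_SHARED_MEMORY` are all assumptions of this sketch, not existing TVM APIs; the key point it demonstrates is that rejected configs are recorded with an error code rather than dropped, so the tuner keeps getting feedback.

```python
MAX_SHARED_MEMORY = 48 * 1024  # bytes; a common GPU limit, stated as an assumption

def verify_shared_memory(estimated_bytes):
    """Return None if the schedule fits, else an error code for the log."""
    if estimated_bytes > MAX_SHARED_MEMORY:
        return "ERR_EXCEEDS_SHARED_MEMORY"
    return None

def evaluate_batch(configs, estimate_usage, verifier=verify_shared_memory):
    # Attach an error code to each rejected config instead of silently
    # dropping it, so a learning-based tuner still receives the feedback.
    results = []
    for cfg in configs:
        error = verifier(estimate_usage(cfg))
        results.append((cfg, error))  # error would land in the log entry
    return results

# Example with a toy usage estimator: bytes = tile_x * tile_y * 4
batch = [{"tile_x": 32, "tile_y": 32}, {"tile_x": 256, "tile_y": 256}]
print(evaluate_batch(batch, lambda c: c["tile_x"] * c["tile_y"] * 4))
```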
[apache/incubator-tvm] Pre-release v0.6.1.rc0 - Apache TVM (incubating) v0.6.1.rc0
# Bug Fixes

* Fixed process termination routine in windows #4844
* [Runtime] Fix NDArray SaveDLTensor declaration and implementation signature different #4586
* [NODE][Serialization]fix serialization precision loss in float #4503
* [Relay][Frontend][TF] fix _parse_param bug #4711
* Fix bias_add gradient #4516
* Make sure to visit the arguments of inlined functions #4783
* Fix Python syntax error in start_rpc_server_to_tracker.py #4682
* [Bugfix] Fixed crash caused by reversing bitwise operations #4852
* [Fix][VM] Fix copy constructor #5237
* fix small bug about dense_grad #5695
* [Fix] Fix conv2d alter op for arm cpu #5532
* [Fix] Fix dense x86 schedule #4728
* [Relay][Fix] Fix alter op layout when calling a global var #4454
* [Relay][Pass] Fix lambda lift pass for recursive call #4432
* [BUGFIX] Fix search path for libtvm_topi.so #4467
* [Bugfix] Fix Python debugger segfaults with TVM built with LLVM #5685
* [RUNTIME] Fix compile errors of OpenCL FPGA backend #4492
* [BUGFIX][BACKPORT-0.6][ARITH] Fix FloorMod Simplifier #5509
* Some Windows and MSVC fixes #4569
* [Chisel][VTA] Fix multiple transfer issue in LoadUop module #4442
* [VTA] Fix an issue in updating uop_idx in the TensorGemm module #4694
* [VTA] Fixed a crash issue in TSIM driver #4527
* [VTA] Enable streamlined GEMM execution #4392
* [VTA][Chisel] End-to-end Inference with Chisel VTA #4574
* Added declare of aluBits for TensorAlu #4624
* [Quantization] Fix annotation for multiply op #4458
* LRN only supports 4D tensors, remove it from alter_op_layout #5520
* fix topi.nn.global_pool layout="NHWC" #4656
* [FFI][Windows] Fix hasattr by extracting Python error type from Windows error message #4780
* [Runtime] Export GraphRuntime in tvm_runtime.dll #5002
* Fix Base64OutStream portability issue #4668
* [AUTOTVM] Fix a bug in generating the search space #4779
* [Relay][VM] Fix compilation of If-Elses #5040
* [RELAY][FRONTEND][TENSORFLOW] Fix FuseBatchNorm output cast error if need_cast is True #4894
* [Bugfix] fskip of EliminateCommonSubexpr cannot always return false #4620
* [Fix] Add ConstantNode to IsAtomic #5457
* [Fix] Fix RemoveUnusedFunctions pass #4700
* [Realy][fix] Fix alpha_equal bug for attribute check #4897
* [Arith] keep div_mode during floordiv simplify #5922
* [ARITH][BACKPORT-0.6] fix a min/max simplify bug #5761
* [0.6-BACKPORT] Improve robustness of the docs build #5583

View it on GitHub: https://github.com/apache/incubator-tvm/releases/tag/v0.6.1.rc0
[apache/incubator-tvm] [VOTE] Release Apache TVM (incubating) v0.6.1.rc0 (#5939)
Dear TVM community,

This is a call for a vote to release Apache TVM (incubating) version 0.6.1. This is a maintenance release incorporating important bug fixes. All users of Apache TVM (incubating) 0.6.0 are advised to upgrade.

Link to release notes: https://github.com/apache/incubator-tvm/releases/tag/v0.6.1.rc0

Link to release candidate: https://dist.apache.org/repos/dist/dev/incubator/tvm/tvm-v0.6.1-rc0

The vote will be open for at least 72 hours. Everyone is welcome to vote. Please vote by replying to this thread explicitly.

+1 = approve
+0 = no opinion
-1 = disapprove (provide reason)

NOTE: this thread is being mirrored in dev@

View it on GitHub: https://github.com/apache/incubator-tvm/issues/5939
[TVM Discuss] [Development/RFC] [RFC] MISRA-C changes for RPC support
> could you be a bit more specific wrt your API design comment?

Sorry, I meant being consistent with the registry API/implementation defined in [src/runtime/registry.cc](https://github.com/apache/incubator-tvm/blob/master/src/runtime/registry.cc). Specifically, I think it's better to implement Registry::Manager for the CRT instead of implementing TVMFuncRegistry, which breaks the consistency between the two runtime implementations. In addition, such an implementation could be named `registry.c` to maintain consistency. Maintaining consistency helps existing users of the default runtime switch easily to the CRT for their resource-constrained systems, and it will be easier for future contributors to dive into the implementation.

---
[Visit Topic](https://discuss.tvm.ai/t/rfc-misra-c-changes-for-rpc-support/7098/5) to respond.
Re: [apache/incubator-tvm] [VOTE] Release Apache TVM (incubating) v0.6.1.rc0 (#5939)
+1 (binding). I checked:

- Signatures and hashes good
- DISCLAIMER, LICENSE, NOTICE present
- No unexpected binary files
- Code compiles

TQ

View it on GitHub: https://github.com/apache/incubator-tvm/issues/5939#issuecomment-650460122
[TVM Discuss] [uTVM] [RFC] Improvements to Automatic Quantization for Bare-Metal
For bare-metal devices, it is desirable (for both space and performance reasons) to have a network that consists entirely of integral data types (most often `int8`). However, the automatic integer quantization mechanism in Relay does not serve this use case, for two reasons:

1. Inputs are assumed to be `float32`, so they are quantized at the network's prefix, and outputs are forced into `float32`, so they are dequantized at the network's suffix.
2. The quantization pass is geared towards only the most time-consuming operators (e.g., `conv2d` and `dense`), leaving many others in `float32`.

We propose two improvements to automatic integer quantization that address these problems: quantize/dequantize partitioning and expanded operator coverage.

## Quantize/Dequantize Partitioning

This feature adds a configuration parameter `partition_conversions` to Relay's [quantize](https://github.com/apache/incubator-tvm/blob/master/python/tvm/relay/quantize/quantize.py#L320) API that specifies whether to partition a quantized module into a module with the following functions:

- `quantize_inputs`: converts inputs into the quantized data space
- `quantized_main`: runs the core network that contains only quantized operators
- `dequantize_outputs`: converts outputs into the unquantized data space
- `main`: calls `quantize_inputs`, `quantized_main`, and `dequantize_outputs` in succession, resulting in behavior equivalent to a quantized module that has **not** been partitioned

If there are unquantized operators in the core network, an exception is raised. The default value is `False`.

As an example of this feature in action, consider the module below:

```c
def @main(%x: Tensor[(1, 4, 16, 16), float32], %w: Tensor[(4, 4, 3, 3), float32]) -> Tensor[(1, 4, 16, 16), float32] {
  nn.conv2d(%x, %w, padding=[1, 1, 1, 1], channels=4, kernel_size=[3, 3])
}
```

After quantization, we see three distinct sections of the function (input quantization, core `int8` network, and output dequantization), delimited below by the horizontal bars.

```c
def @main(%x: Tensor[(1, 4, 16, 16), float32]) -> Tensor[(1, 4, 16, 16), float32] {
  %0 = multiply(%x, 16f) /* ty=Tensor[(1, 4, 16, 16), float32] */;
  %1 = round(%0) /* ty=Tensor[(1, 4, 16, 16), float32] */;
  %2 = clip(%1, a_min=-127f, a_max=127f) /* ty=Tensor[(1, 4, 16, 16), float32] */;
  %3 = cast(%2, dtype="int8") /* ty=Tensor[(1, 4, 16, 16), int8] */;
  ---
  %4 = nn.conv2d(
         %3,
         meta[relay.Constant][0],
         padding=[1, 1, 1, 1],
         channels=4,
         kernel_size=[3, 3],
         out_dtype="int32") /* ty=Tensor[(1, 4, 16, 16), int32] */;
  %5 = add(%4, meta[relay.Constant][1]) /* ty=Tensor[(1, 4, 16, 16), int32] */;
  %6 = right_shift(%5, meta[relay.Constant][2]) /* ty=Tensor[(1, 4, 16, 16), int32] */;
  %7 = clip(%6, a_min=-127f, a_max=127f) /* ty=Tensor[(1, 4, 16, 16), int32] */;
  %8 = cast(%7, dtype="int8") /* ty=Tensor[(1, 4, 16, 16), int8] */;
  %9 = annotation.stop_fusion(%8) /* ty=Tensor[(1, 4, 16, 16), int8] */;
  ---
  %10 = cast(%9, dtype="float32") /* ty=Tensor[(1, 4, 16, 16), float32] */;
  multiply(%10, 0.0625f) /* ty=Tensor[(1, 4, 16, 16), float32] */
}
```

If `partition_conversions == True`, then the module above is converted to the module below.
```c
def @quantize_inputs(%x: Tensor[(1, 4, 16, 16), float32]) -> (Tensor[(1, 4, 16, 16), int8],) {
  %0 = multiply(%x, 16f);
  %1 = round(%0);
  %2 = clip(%1, a_min=-127f, a_max=127f);
  (cast(%2, dtype="int8"),)
}

def @quantized_main(%x: Tensor[(1, 4, 16, 16), int8]) -> Tensor[(1, 4, 16, 16), int8] {
  %0 = nn.conv2d(
         %x,
         meta[relay.Constant][0],
         padding=[1, 1, 1, 1],
         channels=4,
         kernel_size=[3, 3],
         out_dtype="int32");
  %1 = add(%0, meta[relay.Constant][1]);
  %2 = right_shift(%1, meta[relay.Constant][2]);
  %3 = clip(%2, a_min=-127f, a_max=127f);
  %4 = cast(%3, dtype="int8");
  annotation.stop_fusion(%4)
}

def @dequantize_outputs(%x: Tensor[(1, 4, 16, 16), int8]) -> Tensor[(1, 4, 16, 16), float32] {
  %0 = cast(%x, dtype="float32");
  multiply(%0, 0.0625f)
}

def @main(%x: Tensor[(1, 4, 16, 16), float32]) -> Tensor[(1, 4, 16, 16), float32] {
  let %quantized_inputs = @quantize_inputs(%x);
  let %quantized_outputs = @quantized_main(%quantized_inputs.0);
  @dequantize_outputs(%quantized_outputs)
}
```

**Note:** This new option won't be very helpful on its own until we've expanded operator coverage, since most networks will include unquantized operators.

### Further Considerations

Along with the quantize/dequantize functions, for IoT applications, even once you *have* a purely integral network, quantization gives no hints as to how you should convert from raw sensor data into the quantized data space.
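[Editor's sketch] A hedged usage sketch of the proposed option follows. Whether `partition_conversions` is passed through `qconfig` (as assumed here) or directly to `quantize()` is an open detail of this RFC; treat the exact plumbing as an assumption.

```python
from tvm import relay

def quantize_partitioned(mod, params):
    # Assumption: the proposed flag is exposed via the existing
    # relay.quantize.qconfig context manager.
    with relay.quantize.qconfig(partition_conversions=True):
        return relay.quantize.quantize(mod, params)

# The returned module would then contain @quantize_inputs,
# @quantized_main, @dequantize_outputs, and a @main that chains them.
```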
[TVM Discuss] [Development/RFC] [RFC] MISRA-C changes for RPC support
Ah, gotcha. I think there's one thing there I'm not sure how to solve in a good way: TVM generates TVMBackendPackedCFunc from the llvm and c codegens. Because the TVMFuncCall RPC call accepts a function handle, we have to provide something there that contains enough data to differentiate between TVMBackendPackedCFunc and PackedFunc. The C++ runtime does this with LibraryModule, which creates a closure. On the embedded side, I would say we should avoid this strategy, since it requires some type of dynamic memory allocation. We could store some module- or function-level bits to differentiate, but that also costs more RAM. I had some discussions with @tqchen about unifying the runtime implementations; maybe he can weigh in on this? It seems like we could just unify the runtime implementations and make TVMBackendPackedCFunc the cross-runtime standard (even if we modify its signature). Then runtimes would only need to know how to call that one type of function.

---
[Visit Topic](https://discuss.tvm.ai/t/rfc-misra-c-changes-for-rpc-support/7098/6) to respond.
Re: [apache/incubator-tvm] [VOTE] Release Apache TVM (incubating) v0.6.1.rc0 (#5939)
+1

View it on GitHub: https://github.com/apache/incubator-tvm/issues/5939#issuecomment-650468556
Re: [apache/incubator-tvm] [VOTE] Release Apache TVM (incubating) v0.6.1.rc0 (#5939)
+1

Thierry

> On Jun 26, 2020, at 6:16 PM, masahi wrote:
>
> +1
Re: [apache/incubator-tvm] [VOTE] Release Apache TVM (incubating) v0.6.1.rc0 (#5939)
+1

View it on GitHub: https://github.com/apache/incubator-tvm/issues/5939#issuecomment-650471485
Re: [apache/incubator-tvm] [VOTE] Release Apache TVM (incubating) v0.6.1.rc0 (#5939)
+1

View it on GitHub: https://github.com/apache/incubator-tvm/issues/5939#issuecomment-650476981
Re: [apache/incubator-tvm] [VOTE] Release Apache TVM (incubating) v0.6.1.rc0 (#5939)
+1

View it on GitHub: https://github.com/apache/incubator-tvm/issues/5939#issuecomment-650478109
Re: [apache/incubator-tvm] [VOTE] Release Apache TVM (incubating) v0.6.1.rc0 (#5939)
+1

View it on GitHub: https://github.com/apache/incubator-tvm/issues/5939#issuecomment-650484640
Re: [apache/incubator-tvm] [RFC][AutoTVM] Non-square ConfigSpace (#5809)
@tqchen @comaniac Thanks for the comments! The feedback on invalid configs is something I hadn't thought of. Actually, this is the representation I came up with for a non-grid config space: if the config space is very large and there are so many constraints that the shape of the space is very irregular, it would cost too much memory to store the whole config space at runtime. Therefore I'm thinking of storing a grid space in the current representation, together with its constraints, as the solution. I wonder whether it is possible to emit the error code in `next_batch`, or can the error code only be produced by a low-level check? The second plan sounds like an interesting idea. Would it be a new pass, as you suggest? My concern is that once Ansor comes out, the pass might not be of much use in the future. What's your opinion?

View it on GitHub: https://github.com/apache/incubator-tvm/issues/5809#issuecomment-650502937