[TVM Discuss] [Development] VTA First Conv Layer Optimize
Hi @hjiang, Sorry for the late response, I've had some other work to do. Thanks for the proposed solutions; I'll try these implementations with my model and keep you updated. Regards Augusto --- [Visit Topic](https://discuss.tvm.ai/t/vta-first-conv-layer-optimize/6766/6) to respond. You are receiving this because you enabled mailing list mode. To unsubscribe from these emails, [click here](https://discuss.tvm.ai/email/unsubscribe/e96edd783330fbb6d0b66e48ee4895771289c652c189e36bb0d4d4d8a5dd1cc2).
[TVM Discuss] [RFC] Canonicalizing AutoTVM Log Format
Some comments on the dtype: the dtype field in Tensor is actually quite flexible (it goes beyond the enumeration, since arbitrary vector lengths, bitwidths, and customized data types are also allowed). So perhaps a string, or a structured variant, makes sense. We can continue to use a string for simplicity and consistency with the Python side of the repr; alternatively, one could design a further composite encoding, but that would involve parsing and printing of the type string, which could be overkill here. --- [Visit Topic](https://discuss.tvm.ai/t/rfc-canonicalizing-autotvm-log-format/7038/15) to respond.
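To make the trade-off concrete, here is a small illustrative Python sketch of what parsing such dtype strings involves. This is not TVM's actual parser (that lives in the runtime's `DataType`/`DLDataType` handling); the regex and the `custom[...]` form are assumptions for illustration only:

```python
import re

def parse_dtype(s):
    """Split a TVM-style dtype string such as 'float32' or 'int8x16'
    into (type_code, bits, lanes). Illustrative only -- the real parsing
    is done by TVM's runtime, not this function."""
    m = re.fullmatch(r"(float|int|uint|bfloat|custom\[\w+\])(\d+)(?:x(\d+))?", s)
    if m is None:
        raise ValueError("unrecognized dtype string: " + s)
    code, bits, lanes = m.groups()
    return code, int(bits), int(lanes) if lanes else 1

assert parse_dtype("float32") == ("float", 32, 1)
assert parse_dtype("int8x16") == ("int", 8, 16)
```

Even this minimal version shows the printing/parsing surface a composite encoding would have to replicate, which is the "overkill" concern above.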
[TVM Discuss] [Development/RFC] [RFC] MISRA-C changes for RPC support
## Motivation

As part of the [Standalone µTVM Roadmap](https://discuss.tvm.ai/t/rfc-tvm-standalone-tvm-roadmap/6987), the TVM RPC server is being implemented on bare metal devices. The overall approach is to link MinRPCServer against the MISRA-C runtime plus additional compiled TVM functions. Several new features need to be added to the MISRA-C runtime in order to service all of the RPC requests:

1. `TVMFuncGetGlobal` and `TVMModGetFunction` are currently implemented as hardcoded `if-else` blocks. A scalable, programmatic solution is needed to interact with both generated TVMModules and PackedCFuncs from third-party code.
2. Since Modules are typically instantiated in the C++ runtime using `dlpack`, some strategy for Module instantiation is needed in the MISRA-C runtime.
3. Unit test coverage should be improved, and black-box tests need to be written, since the MISRA-C runtime tests will eventually be used as an acceptance test for µTVM devices. This RFC doesn't specifically address testing, but does address some improvements needed to enable tests to be written:
   1. There is no standalone build of the MISRA-C runtime currently checked in, so it's possible that the MISRA-C runtime could depend on C++ code.
   2. There isn't a way to write tests against the MISRA-C runtime without linking it to `libtvm.so`.
4. Some C++ features are being used, such as `exit()`, which may need to be extended in an embedded setting. Also, a logging solution needs to be devised to ease debugging.

## Changes to the MISRA-C Runtime

### FuncRegistry

Function lookup by string name is a common RPC task. Currently, the MISRA-C runtime implements function lookup using an `if-else` statement tree. This RFC introduces a new type, `TVMFuncRegistry`, which can be used for both global and module functions.

```c
/*!
 * \brief A data structure that facilitates function lookup by C-string name.
 */
typedef struct TVMFuncRegistry {
  /*!
   * \brief Names of registered functions, concatenated together and separated by \0.
   * An additional \0 is present at the end of the concatenated blob to mark the end.
   *
   * Byte 0 is the number of functions in `funcs`.
   */
  const char* names;

  /*! \brief Function pointers, in the same order as their names in `names`. */
  const TVMBackendPackedCFunc* funcs;
} TVMFuncRegistry;
```

The design constraints were:

1. Assume a small number of functions (i.e. < 30).
2. All lookup functions should work on a `const` instance of the registry, so it can be placed in flash.
3. Easy to generate in TVM codegen.

The `names` field is a `uint8_t` count followed by the C-string names of the functions concatenated together. A terminating `\0` marks the end of the `names` list, so when `count > 0`, `names` always ends in a double `\0\0`.

```
Byte:    0   1   2   3   4   5   6   7   8   9   a   b   c   d
       +---+---+---+---+---+---+---+---+---+---+---+---+---+---+
names: | N | F   u   n   c   0  \0 | F   u   n   c   1  \0 |\0 |
       +---+---+---+---+---+---+---+---+---+---+---+---+---+---+

Legend:
  N     - `count` field
  Func0 - 0th registered function name
  Func1 - 1st registered function name
```

Function lookup is done by linearly scanning the concatenated names until a match is found. Function handles are encoded as the 0-based index of the matching name in the FuncRegistry. From the index, the function pointer can be retrieved in constant time: the index is validated against the `count` field and the pointer is returned by indexing `funcs`.

### MutableFuncRegistry

Unlike the SystemLib Module, the global function namespace can change at runtime. `MutableFuncRegistry` is a RAM-based implementation of `FuncRegistry` that adds a `Set` function.

```c
/*!
 * \brief A TVMFuncRegistry that supports adding and changing the functions.
 */
typedef struct TVMMutableFuncRegistry {
  TVMFuncRegistry registry;

  /*! \brief Maximum number of functions in this registry. */
  size_t max_functions;
} TVMMutableFuncRegistry;
```

Here, `registry` is presumed to be mutable, so `Set` directly modifies `names` and `funcs`. The create function accepts a single block of memory and partitions it into `names` and the rest of the data structure. It computes a capacity based on an average function name length and stores that in `max_functions`.

### Modules

CRT Modules are implemented as structs whose first member is a `TVMModule`:

```c
/*!
 * \brief Module container of TVM.
 */
typedef struct TVMModule {
  /*! \brief The function registry associated with this module. */
  const TVMFuncRegistry* registry;
} TVMModule;
```

Modules can be registered with the CRT (though a public-facing function is yet TBD). Upon registration, a pointer is placed in a global modules array, and the module is assigned an id equal to its index in this array. Note that `TVMBackendRegisterSystemLibSymbol` will not be implemented in the CRT C implementation. SystemLibs gener
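To make the `names` encoding and the linear-scan lookup concrete, here is a small Python sketch (not part of the RFC; the real implementation is C) that builds the count-prefixed, `\0`-separated blob described above and resolves a name to its 0-based function index:

```python
def encode_names(names):
    """Build the `names` blob: a count byte, then \0-terminated names,
    then a final \0 terminator (so the blob ends in \0\0 when non-empty)."""
    assert len(names) < 256  # count fits in a uint8_t
    blob = bytes([len(names)])
    for n in names:
        blob += n.encode("ascii") + b"\0"
    return blob + b"\0"

def lookup(blob, name):
    """Return the 0-based function index for `name`, or None if absent.
    Mirrors the linear scan over the concatenated C strings."""
    count = blob[0]
    offset = 1
    for index in range(count):
        end = blob.index(b"\0", offset)
        if blob[offset:end] == name.encode("ascii"):
            return index
        offset = end + 1
    return None

blob = encode_names(["Func0", "Func1"])
assert lookup(blob, "Func1") == 1
assert lookup(blob, "Missing") is None
```

In the C runtime the returned index would then be bounds-checked against `count` and used to index `funcs` in constant time.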
[TVM Discuss] [RFC] Canonicalizing AutoTVM Log Format
I see. In my experience, it is worth making this a structured type, even if it seems painful at first. In the long run, having to maintain custom parsing logic for just one of your fields (where the others are all structured) ends up being a maintenance burden. I'm a strong advocate for using structured types as they were intended to be used. --- [Visit Topic](https://discuss.tvm.ai/t/rfc-canonicalizing-autotvm-log-format/7038/16) to respond.
[TVM Discuss] [RFC] Canonicalizing AutoTVM Log Format
In this case the parsing is already necessary and built in, because the numpy convention uses a string for the dtype. So we are trying to build compatibility for interoperating with something that already exists. The types on the C++ side are structured. --- [Visit Topic](https://discuss.tvm.ai/t/rfc-canonicalizing-autotvm-log-format/7038/17) to respond.
[TVM Discuss] [RFC] Canonicalizing AutoTVM Log Format
Gotcha. In that case I think it's important to document that the format of the field is the type string used by numpy. --- [Visit Topic](https://discuss.tvm.ai/t/rfc-canonicalizing-autotvm-log-format/7038/18) to respond.
[TVM Discuss] [Development/RFC] [RFC][BYOC] Data Calibration Flow
## Motivation

Although TVM provides a quantization flow for pre-quantized models, we find that some developers prefer to use their own quantization flow for their accelerators, since they may have specialized calibration and quantization flows other than TVM QNN. However, the current BYOC flow has limited support for this scenario. One current workaround involves two passes of compilation pipelines: in the first pass, we partition the graph and go through the graph runtime to get the calibration data; in the second pass, the calibration results are used along with the BYOC flow to generate the final quantized code for the accelerator.

## Proposal

In this RFC, we want to provide a clean and easy-to-use interface for developers to collect calibration data to feed into their calibration and quantization flows. With this interface, they can get the calibration data along with the subgraph information for the final code generation with only a single API.

### Programming Model

```python
mod, params = relay.testing.mobilenet.get_workload(...)

# passes for generating partitioned graphs
mod = transform.AnnotateTarget(["dnnl"])(mod)
mod = transform.MergeCompilerRegions()(mod)
mod = transform.PartitionGraph()(mod)

# proposed calibration flow and API
i_data = ...  # the input data to be calibrated
calib_data = analysis.calibrate_parition_graph(mod, i_data, params)

# pass the calibration data to the external codegen and build the program
with transform.PassContext(opt_level=3, config={'calib_data': calib_data}):
    relay.build(mod, ...)
```

We propose a new analysis API ``calibrate_parition_graph`` (any better names would be appreciated) that takes three inputs: the partitioned module, the input data to be calibrated, and the parameters. It returns the calibration data, which is a mapping between each subgraph name and all of its input and output values. Below we show a synthetic example.
The Relay graph after partitioning:

```text
def @dnnl0(%dnnl0_i0: Tensor[(3, 3), float32], %dnnl0_i1: Tensor[(3, 3), float32]) -> Tensor[(3, 3), float32] {
  add(%dnnl0_i0, %dnnl0_i1)
}

def @dnnl1(%dnnl1_i0: Tensor[(3, 3), float32], %dnnl1_i1: Tensor[(3, 3), float32]) -> Tensor[(3, 3), float32] {
  sub(%dnnl1_i0, %dnnl1_i1)
}

def @main(%data0: Tensor[(3, 3), float32], %data1: Tensor[(3, 3), float32], %data2: Tensor[(3, 3), float32]) -> Tensor[(3, 3), float32] {
  %0 = @dnnl0(%data0, %data1)
  @dnnl1(%0, %data2)
}
```

Then this will be the calibration data we get:

```
{"main":  {"inputs": [**data0**, **data1**, **data2**], "outputs": [**output**]},
 "dnnl0": {"inputs": [**data0**, **data1**], "outputs": [**%0**]},
 "dnnl1": {"inputs": [**%0**, **data2**], "outputs": [**output**]}}
```

Note that if we have multiple sets of data to be calibrated, the final result will be a list of lists. Finally, to use the calibration data during code generation, we send it to the ``PassContext``.

## Implementation Details

We implement two passes to get the calibration results. The first pass removes all backend-specific attributes and marks all intermediate tensors as final outputs; we then use the graph runtime to get the tensor values. The second pass builds the mapping between the subgraph names and the tensor values. Then, we perform some post-processing to get the final calibration data as shown above.

The POC branch is available [here](https://github.com/seanlatias/incubator-tvm/tree/calibrate).

cc @zhiics, @comaniac, @masahi, @matt-arm, @tqchen --- [Visit Topic](https://discuss.tvm.ai/t/rfc-byoc-data-calibration-flow/7099/1) to respond.
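As a sketch of how an external quantization flow might consume this calibration data, the snippet below computes a per-subgraph (min, max) range, which a backend could use to pick scale factors. The `calib_data` dictionary here is hypothetical example data following the structure shown above, not output from the actual API:

```python
import numpy as np

# Hypothetical calibration result following the structure shown above
# (a single calibration run; with multiple runs each entry would be a
# list of lists, as the RFC notes).
calib_data = {
    "dnnl0": {"inputs": [np.full((3, 3), -1.5), np.zeros((3, 3))],
              "outputs": [np.full((3, 3), 2.0)]},
}

def value_ranges(calib_data):
    """Compute a (min, max) range per subgraph over all recorded input
    and output tensors, as a quantization flow might do before codegen."""
    ranges = {}
    for subgraph, tensors in calib_data.items():
        values = np.concatenate(
            [t.ravel() for t in tensors["inputs"] + tensors["outputs"]])
        ranges[subgraph] = (float(values.min()), float(values.max()))
    return ranges

assert value_ranges(calib_data) == {"dnnl0": (-1.5, 2.0)}
```

The resulting ranges (or any similar statistics) are what would be passed to the external codegen through the `PassContext` config.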
[TVM Discuss] [Development/RFC] [RFC][BYOC] Data Calibration Flow
Also cc @JoeyChou @abergeron --- [Visit Topic](https://discuss.tvm.ai/t/rfc-byoc-data-calibration-flow/7099/2) to respond.
[TVM Discuss] [Development/RFC] [RFC][BYOC] Data Calibration Flow
cc @anijain2305 as well --- [Visit Topic](https://discuss.tvm.ai/t/rfc-byoc-data-calibration-flow/7099/3) to respond.
Re: Podling Tvm Report Reminder - July 2020
working on it TQ On Wed, Jun 24, 2020 at 8:38 PM wrote: > Dear podling, > > This email was sent by an automated system on behalf of the Apache > Incubator PMC. It is an initial reminder to give you plenty of time to > prepare your quarterly board report. > > The board meeting is scheduled for Wed, 15 July 2020. > The report for your podling will form a part of the Incubator PMC > report. The Incubator PMC requires your report to be submitted 2 weeks > before the board meeting, to allow sufficient time for review and > submission (Wed, July 01). > > Please submit your report with sufficient time to allow the Incubator > PMC, and subsequently board members to review and digest. Again, the > very latest you should submit your report is 2 weeks prior to the board > meeting. > > Candidate names should not be made public before people are actually > elected, so please do not include the names of potential committers or > PPMC members in your report. > > Thanks, > > The Apache Incubator PMC > > Submitting your Report > > -- > > Your report should contain the following: > > * Your project name > * A brief description of your project, which assumes no knowledge of > the project or necessarily of its field > * A list of the three most important issues to address in the move > towards graduation. > * Any issues that the Incubator PMC or ASF Board might wish/need to be > aware of > * How has the community developed since the last report > * How has the project developed since the last report. > * How does the podling rate their own maturity. > > This should be appended to the Incubator Wiki page at: > > https://cwiki.apache.org/confluence/display/INCUBATOR/July2020 > > Note: This is manually populated. You may need to wait a little before > this page is created from a template. > > Note: The format of the report has changed to use markdown. > > Mentors > --- > > Mentors should review reports for their project(s) and sign them off on > the Incubator wiki page. 
Signing off reports shows that you are > following the project - projects that are not signed may raise alarms > for the Incubator PMC. > > Incubator PMC > > - > To unsubscribe, e-mail: dev-unsubscr...@tvm.apache.org > For additional commands, e-mail: dev-h...@tvm.apache.org > >
[TVM Discuss] [Development/RFC] [RFC] MISRA-C changes for RPC support
cc @liangfu @tgall_foo --- [Visit Topic](https://discuss.tvm.ai/t/rfc-misra-c-changes-for-rpc-support/7098/2) to respond.
[TVM Discuss] [Development/RFC] [RFC] MISRA-C changes for RPC support
Hi @areusch, thanks for proposing RPC support for the MISRA-C runtime. Regarding the `TVMFuncRegistry` design: the `names` field is designed to be rather compact, but requires special handling of the list of strings. I would rather propose an alternative: we could use fixed-length strings, so that we could easily access any string (in constant time) without linearly scanning from the beginning, while still keeping a single block of strings. Regarding the API design, can we make it more consistent with the existing API design in the default TVM runtime? The consistency would help us maintain both runtimes with the existing TVM header files and APIs. --- [Visit Topic](https://discuss.tvm.ai/t/rfc-misra-c-changes-for-rpc-support/7098/3) to respond.
[TVM Discuss] [Development/RFC] [RFC] MISRA-C changes for RPC support
Hi @liangfu, thanks for taking a look at my proposal! I agree the `names` field is a little complex. Here are some alternatives:

N1. Use an array of `const char[MAX_FUNC_NAME_LENGTH][num_funcs]`. The positive is that you can traverse the list without scanning; the negative is that it could waste space, especially for short function names. If that wasted space exceeds one word per name, you may as well consider N2. Another negative is that `MAX_FUNC_NAME_LENGTH` may need to be adjusted frequently.

N2. Use an array of `const char*`, and encode each name as a separate string in flash. The positive is that you can traverse the whole list without scanning every name; the negative is adding one word per function name.

I'm wondering in which case you wouldn't need to traverse; I think the answer is if you sorted the names and implemented binary search. You could do that; I'd argue it's not that important yet (the number of functions is small, and function name lookup is not, I don't think, that time-sensitive) and is itself added complexity and code space. However, it is well understood, so the complexity doesn't bother me as much.

I'm open to each of these, but I would like to point out that the special handling done here is just combining strcmp with strlen. I agree it's still artisanal.

Could you be a bit more specific w.r.t. your API design comment?

Andrew --- [Visit Topic](https://discuss.tvm.ai/t/rfc-misra-c-changes-for-rpc-support/7098/4) to respond.
[TVM Discuss] [RFC] Canonicalizing AutoTVM Log Format
## Difference between the logs for Ansor and AutoTVM

There are two major differences between Ansor's log and AutoTVM's log:

1. The workload for Ansor is a subgraph defined by multiple `tvm.compute` calls, while the workload for AutoTVM is a single operator. To index the log quickly, Ansor stores a hash value of the subgraph as the workload key.
2. Ansor saves the whole serialized schedule as `config` (in JSON format), while AutoTVM only stores the parameters.

However, Ansor's new log format can still fit into @tqchen's design of top-level fields.

## Other thoughts

1. The current log file is an appendable text file, where one line corresponds to one log item, and I can edit it with a text editor. If we use a binary format, I want this property to be preserved.
2. If we make the log longer and more readable, there will be a lot of redundancy in the file. For example, for a single tuning job, the same long target string will appear in every line. Do we have methods to compress it?

--- [Visit Topic](https://discuss.tvm.ai/t/rfc-canonicalizing-autotvm-log-format/7038/19) to respond.
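One lightweight direction for the redundancy question (a sketch of the idea, not a concrete proposal, with made-up field names) is to factor fields shared by every record, such as the target string, into a header line, while keeping the one-JSON-object-per-line, editable text property:

```python
import json

log_lines = [
    json.dumps({"target": "llvm -mcpu=skylake-avx512", "workload": "w0", "latency": 1.2}),
    json.dumps({"target": "llvm -mcpu=skylake-avx512", "workload": "w1", "latency": 3.4}),
]

def compress(lines):
    """Emit a header line holding fields shared by every record, followed
    by the records with those fields stripped. The output is still
    line-oriented text that can be viewed and edited in a text editor."""
    records = [json.loads(line) for line in lines]
    shared = {k: v for k, v in records[0].items()
              if all(r.get(k) == v for r in records)}
    out = [json.dumps({"_header": shared})]
    for r in records:
        out.append(json.dumps({k: v for k, v in r.items() if k not in shared}))
    return out

for line in compress(log_lines):
    print(line)
```

The trade-off is that appending later requires records to agree with the header's shared fields, so this suits per-tuning-job files rather than one global log.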
[TVM Discuss] [Development/RFC] [RFC] Minor (bugfix) Release for v0.6
Here's a list of fixes we applied to the v0.6 branch. I will cut a tag this Friday.

* Fixed process termination routine in windows #4844
* [Runtime] Fix NDArray SaveDLTensor declaration and implementation signature different #4586
* [NODE][Serialization]fix serialization precision loss in float #4503
* [Relay][Frontend][TF] fix _parse_param bug #4711
* Fix bias_add gradient #4516
* Make sure to visit the arguments of inlined functions #4783
* Fix Python syntax error in start_rpc_server_to_tracker.py #4682
* [Bugfix] Fixed crash caused by reversing bitwise operations #4852
* [Fix][VM] Fix copy constructor #5237
* fix small bug about dense_grad #5695
* [Fix] Fix conv2d alter op for arm cpu #5532
* [Fix] Fix dense x86 schedule #4728
* [Relay][Fix] Fix alter op layout when calling a global var #4454
* [Relay][Pass] Fix lambda lift pass for recursive call #4432
* [BUGFIX] Fix search path for libtvm_topi.so #4467
* [Bugfix] Fix Python debugger segfaults with TVM built with LLVM #5685
* [RUNTIME] Fix compile errors of OpenCL FPGA backend #4492
* [BUGFIX][BACKPORT-0.6][ARITH] Fix FloorMod Simplifier #5509
* Some Windows and MSVC fixes #4569
* [Chisel][VTA] Fix multiple transfer issue in LoadUop module #4442
* [VTA] Fix an issue in updating uop_idx in the TensorGemm module #4694
* [VTA] Fixed a crash issue in TSIM driver #4527
* [VTA] Enable streamlined GEMM execution #4392
* [VTA][Chisel] End-to-end Inference with Chisel VTA #4574
* Added declare of aluBits for TensorAlu #4624
* [Quantization] Fix annotation for multiply op #4458
* LRN only supports 4D tensors, remove it from alter_op_layout #5520
* fix topi.nn.global_pool layout="NHWC" #4656
* [FFI][Windows] Fix hasattr by extracting Python error type from Windows error message #4780
* [Runtime] Export GraphRuntime in tvm_runtime.dll #5002
* Fix Base64OutStream portability issue #4668
* [AUTOTVM] Fix a bug in generating the search space #4779
* [Relay][VM] Fix compilation of If-Elses #5040
* [RELAY][FRONTEND][TENSORFLOW] Fix FuseBatchNorm output cast error if need_cast is True #4894
* [Bugfix] fskip of EliminateCommonSubexpr cannot always return false #4620
* [Fix] Add ConstantNode to IsAtomic #5457
* [Fix] Fix RemoveUnusedFunctions pass #4700
* [Realy][fix] Fix alpha_equal bug for attribute check #4897
* [BACKPORT-0.6][Bugfix][Arith] keep div_mode during floordiv simplify #5922
* [ARITH][BACKPORT-0.6] fix a min/max simplify bug #5761
* [0.6-BACKPORT] Improve robustness of the docs build #5583

--- [Visit Topic](https://discuss.tvm.ai/t/rfc-minor-bugfix-release-for-v0-6/6716/9) to respond.