Hi @manupa-arm, thanks for posting this! There's a lot to unpack here.

I think we can break the work here into two parts:

P1. Implementing the unified memory planner based on information in TIR

P2. Modifying the codegen/output to implement various compiler optimizations 
based on P1.

I think that the debate around P1 is likely to center around the "how," whereas 
the debate around P2 is likely to center around the "what." 

## Modeling the whole program in TIR

The AOT effort has taken a first step here by creating a top-level TIR 
function that describes the whole model. One open question related to this 
RFC is: how should we structure the compiler around this top-level program? 
In general, we have a couple of options:

S1. Place everything in TIR, and implement post-scheduling transforms as 
compiler passes. In the S1 world, any computed information e.g. memory 
placement for buffers would need to live in TIR. In this world, we should 
strive to avoid side-channel information carried outside of TIR.

S2. Keep with the piecewise representation, and build separate data structures 
to encapsulate compiler outputs from post-schedule passes e.g. memory planning.

I think @jroesch and @csullivan currently support S1 (see [PR 
7518](https://github.com/apache/tvm/pull/7518), which, as I understand it, is 
still in progress but frequently runs into merge conflicts). I also support 
S1 if it's feasible to do so under all executors. The drawback is that 
non-AOT executors will need to run these passes; the advantage is that it 
provides a clear framework under which we can consolidate post-scheduling 
whole-program modeling for both AOT and non-AOT use cases. Should we consider 
superseding the VM executor with AOT in the future, it also provides a more 
natural pathway. What's your opinion on this?

I bring this up because I think a lot of the questions raised here and 
elsewhere in the proposal can likely be settled once we decide on this general 
design pattern.

## Inline questions

A couple other questions:

>     static int32_t entrypoint(TVMInputs_my_model* inputs, 
>                               TVMOutputs_my_model* outputs,
>                               TVMContext* context){

Just to confirm--would TVMContext also be generated, e.g. `TVMContext_my_model`?

> Inputs :
> 
> * AoT TIR PrimFunc ( the control function describing the call graph to 
> operators)
> * All Operator Functions
> * the maximum size for each pool
> 
> We could use “pinned_memory” (see below) to 
> tag buffers with suggested priority order determined by the scheduler.
> 
> The idea is USMP will try to pool them using the preferred “pinned_memory” 
> and fallback whenever the size is exceeding the user provided max size for 
> each pool (if any)
> 
> Outputs :
> 
> * AoT TIR PrimFunc accepting pool buffers from the user.
> * All Operator functions accepting pool buffers.
>   * Each operator function should address using the correct offset in the 
> correct pool buffer

I'm not certain the memory planner should necessarily encode all vars as buffer 
offsets--doing so could limit e.g. dynamic use cases, which may either a) need 
to express offsets as runtime-evaluated expressions or b) need to entirely 
defer such allocations to runtime, should it be impossible to pre-define such 
expressions.

This gets at my separation of concerns above--it would be nice to either:

1. use the TIR-agnostic I/O format as a way to store the memory planner output 
and then inform further TIR modifications (e.g. either making everything buffer 
offsets when possible, passing those offsets in as positional arguments, or 
keeping TVMBAW for dynamic allocs)
2. represent that abstract output as e.g. TIR attributes and perform any of the 
aforementioned optimizations by examining TIR attributes
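
To make the static/dynamic distinction concrete, here is a minimal sketch of what an abstract planner output could look like if it did not force every buffer into a fixed offset. Everything here (`StaticOffset`, `RuntimeOffset`, `DeferredAlloc`, `Placement`, `ResolveOffset`) is a hypothetical name for illustration, not an existing TVM API, and it assumes C++17 for `std::variant`:

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <optional>
#include <string>
#include <variant>

// Hypothetical planner output for one buffer; none of these are TVM types.

// Case 0: the offset into a pool is known at compile time.
struct StaticOffset {
  std::string pool;
  int64_t offset_bytes;
};

// Case a): the offset is an expression evaluated at runtime,
// modeled here as a function of a single dynamic extent.
struct RuntimeOffset {
  std::string pool;
  std::function<int64_t(int64_t)> offset_of;
};

// Case b): no expression can be pre-defined, so allocation is deferred
// to a runtime allocator (the role TVMBackendAllocWorkspace plays today).
struct DeferredAlloc {};

using Placement = std::variant<StaticOffset, RuntimeOffset, DeferredAlloc>;

// Resolves a placement to a byte offset, or std::nullopt when the
// allocation must be deferred to runtime.
std::optional<int64_t> ResolveOffset(const Placement& p, int64_t dyn_extent) {
  assert(dyn_extent >= 0);
  if (const auto* s = std::get_if<StaticOffset>(&p)) return s->offset_bytes;
  if (const auto* r = std::get_if<RuntimeOffset>(&p)) return r->offset_of(dyn_extent);
  return std::nullopt;
}
```

A later pass could then lower `StaticOffset` to a constant, pass `RuntimeOffset` results in as arguments, and leave `DeferredAlloc` buffers on the existing dynamic allocation path. The same three cases could just as well be carried as TIR attributes under option 2.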

> The current proposal for the interface is as follows :
> 
> ```
> struct BufferInfo {
>     Integer uid;
>     Integer size_bytes;
>     Integer alignment;
>     Array<Integer> conflicts; // the conflicting uids of buffers
>     Array<Integer> pool_candidates;
>     Integer pool_id;
>     Integer pool_offset;
> }
> ```
> 
> void (*foo)(Array buffers, Map<Integer, Integer> pool_sizes)

In the tvmc command above, memory pools were identified by name. Any reason to 
translate to integers here?
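
If integers turn out to be convenient inside the planner, one option is to keep names at the interface and intern them internally. The sketch below is only an illustration of that idea; `PoolTable`, `Intern`, and `NameOf` are hypothetical names and not part of TVM or of the RFC:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical helper: interns user-visible pool names to dense integer ids
// so a planner can index by integer while reporting results by name.
class PoolTable {
 public:
  // Returns the id for `name`, assigning the next free id on first use.
  int64_t Intern(const std::string& name) {
    auto it = ids_.find(name);
    if (it != ids_.end()) return it->second;
    const int64_t id = static_cast<int64_t>(names_.size());
    ids_.emplace(name, id);
    names_.push_back(name);
    return id;
  }

  // Maps an id back to the pool name the user supplied.
  const std::string& NameOf(int64_t id) const {
    assert(id >= 0 && static_cast<std::size_t>(id) < names_.size());
    return names_[static_cast<std::size_t>(id)];
  }

 private:
  std::unordered_map<std::string, int64_t> ids_;
  std::vector<std::string> names_;
};
```

With something like this, `pool_candidates` and `pool_id` could stay integers internally while error messages and the final pool layout still refer to the names given on the command line.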

>  ## Special Considerations :

Let's discuss these after resolving the S1/S2 debate above.

cc @tqchen @junrushao1994 if you have comments on representing this in TIR




