grigorypas wrote:

> > Can you please elaborate what do you mean by it "changes the semantics of 
> > alwaysinline"? I am introducing a new attribute flatten_deep both on clang 
> > side and LLVM side. alwaysinline should still mean the same thing.
> 
> You said patch 2 will update the alwaysinliner pass. `alwaysinline` has 
> previously always inlined a function unless it was illegal to do so. You're 
> now maybe not inlining depending on the `flatten_deep` attribute, which seems 
> like a cost heuristic encoded in the IR to me.
> 
> > To clarify, our primary use case at Meta is to completely flatten functions 
> > by inlining the entire call tree. The max depth parameter is not intended 
> > as a core part of the user workflow, but rather as a safeguard to prevent 
> > issues if the call tree happens to be extremely deep.
> 
> So you want to completely flatten functions but not completely flatten 
> functions? What exactly is the use case of flattening these functions?

Thank you for the feedback! Let me clarify the design:

## `alwaysinline` Semantics Are Preserved

The `alwaysinline` semantics are **not** being changed. The original 
`alwaysinline` logic is applied first and takes precedence. The `flatten_deep` 
logic runs in the same pass but is applied at the end, after the standard 
`alwaysinline` processing. If a function has `alwaysinline`, it will be inlined 
according to the existing rules (unless illegal to do so), completely 
independent of any `flatten_deep` attributes.

You can see this in the suggested implementation here: 
[https://github.com/grigorypas/llvm-project/tree/full_flattening](https://github.com/grigorypas/llvm-project/tree/full_flattening)

## `flatten_deep` as a Natural Extension of `flatten`

`flatten_deep(N)` is a natural extension of the existing `flatten` attribute. 
While they differ in implementation, the motivation is similar:

- **`flatten`**: Inlines all immediate call sites (single level); implemented 
in the frontend by marking direct calls with `alwaysinline`
- **`flatten_deep(N)`**: Inlines recursively/transitively up to N levels deep; 
requires backend support to propagate through the call tree

Importantly, **full/deep flattening cannot be achieved today with existing 
attributes**: no current mechanism requests transitive inlining across the 
entire call tree.
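
To make the single-level versus transitive distinction concrete, here is a minimal sketch of a three-level call chain. The function names are made up for illustration, and the `flatten_deep` spelling in the comment is an assumption based on this discussion (it is not available in released compilers), so it is left uncompiled:

```cpp
#include <cassert>

// Leaf of a three-level call chain: top -> middle -> leaf.
int leaf(int x) { return x * 2; }

// Called directly from top(); itself calls leaf().
int middle(int x) { return leaf(x) + 1; }

// With Clang's `flatten`, the call to middle() in this body is marked for
// inlining. The call to leaf() inside middle() is not covered by the
// attribute and is left to the regular inliner.
__attribute__((flatten)) int top(int x) { return middle(x) + 3; }

// Under the proposal, a depth-limited variant would also cover leaf().
// Assumed spelling, kept in a comment because it is not a released attribute:
//   __attribute__((flatten_deep(2))) int top_deep(int x) { return middle(x) + 3; }
```

The attribute only affects code generation, not results, so `top(5)` evaluates to 14 either way. One way to see what was actually inlined is to inspect the IR from `clang++ -S -emit-llvm`.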

## Max Depth as a Safeguard

The max depth parameter is not a cost heuristic - it's a safety limit:

- **Primary use case**: Complete flattening of the call tree (large N)
- **Max depth parameter**: A safeguard to prevent compile-time explosions with 
unexpectedly deep call trees

This is similar to other compiler safety limits (e.g., `-fconstexpr-depth=N`) - 
we want to flatten the entire call tree in normal cases, but need a circuit 
breaker for pathological edge cases.
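
The circuit-breaker idea can be sketched with a toy model of a call graph. This is not the code from the patch: `CallGraph`, `FlattenResult`, `flattenInto`, `flattenDeep`, and `sampleGraph` are hypothetical names, and the model assumes that depth 1 means the root's direct call sites and that recursive cycles are left as calls.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Toy call graph: function name -> callees, one entry per call site.
using CallGraph = std::map<std::string, std::vector<std::string>>;

struct FlattenResult {
  int inlinedCallSites = 0;                // call sites replaced by callee bodies
  std::vector<std::string> residualCalls;  // callees still reached through a call
};

// Hypothetical helper: walks the call tree below `caller`, inlining call sites
// at nesting levels 1..maxDepth. `active` holds the functions on the current
// inline chain so that recursive cycles are left as calls.
void flattenInto(const CallGraph &graph, const std::string &caller, int level,
                 int maxDepth, std::set<std::string> &active,
                 FlattenResult &result) {
  auto it = graph.find(caller);
  if (it == graph.end())
    return;  // no recorded callees (leaf or external function)
  for (const std::string &callee : it->second) {
    // The circuit breaker: past maxDepth, or on a cycle, keep the call.
    if (level > maxDepth || active.count(callee) != 0) {
      result.residualCalls.push_back(callee);
      continue;
    }
    ++result.inlinedCallSites;
    active.insert(callee);
    flattenInto(graph, callee, level + 1, maxDepth, active, result);
    active.erase(callee);
  }
}

FlattenResult flattenDeep(const CallGraph &graph, const std::string &root,
                          int maxDepth) {
  assert(maxDepth >= 0 && "depth limit must be non-negative");
  FlattenResult result;
  std::set<std::string> active{root};
  flattenInto(graph, root, /*level=*/1, maxDepth, active, result);
  return result;
}

// Sample tree: top -> {a, b}, a -> c, c -> d, b -> b (self-recursive).
CallGraph sampleGraph() {
  return {{"top", {"a", "b"}}, {"a", {"c"}}, {"c", {"d"}}, {"b", {"b"}}};
}
```

In this model a large limit flattens everything except the self-recursive call in `b`, while a limit of 1 behaves like single-level flattening and leaves the calls to `c` and `b` in place. The limit only matters when the tree is deeper than expected, which is the sense in which it acts as a safeguard rather than a cost heuristic.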

## Use Case

This feature is useful for performance-critical code where eliminating call 
overhead across the entire call tree is beneficial, such as:

- Deeply nested hot paths in performance-sensitive applications
- **PGO scenarios with stale profiles**: When adding new functions to hot 
paths, `flatten_deep(N)` may help where default bottom-up inlining decisions 
rely on incomplete or stale profile data

Does this clarification address your concerns?

https://github.com/llvm/llvm-project/pull/165777
_______________________________________________
cfe-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
