Hi Christian,
On 9/9/2025 9:55 PM, Christian König wrote:
On 09.09.25 16:05, Peter Zijlstra wrote:
On Tue, Sep 09, 2025 at 02:04:30PM +0200, Christian König wrote:
Hi Arun,
On 09.09.25 11:56, Arunpravin Paneer Selvam wrote:
[SNIP]
+/**
+ * rbtree_for_each_entry_safe - iterate in-order over rb_root safe against
removal
+ *
+ * @pos: the 'type *' to use as a loop cursor
+ * @n: another 'type *' to use as temporary storage
+ * @root: 'rb_root *' of the rbtree
+ * @member: the name of the rb_node field within 'type'
+ */
+#define rbtree_for_each_entry_safe(pos, n, root, member) \
+ for ((pos) = rb_entry_safe(rb_first(root), typeof(*(pos)), member), \
+ (n) = (pos) ? rb_entry_safe(rb_next(&(pos)->member),
typeof(*(pos)), member) : NULL; \
+ (pos); \
+ (pos) = (n), \
+ (n) = (pos) ? rb_entry_safe(rb_next(&(pos)->member),
typeof(*(pos)), member) : NULL)
As far as I know exactly that operation does not work on an R/B tree.
See the _safe() variants of the for_each_ macros are usually used to iterate
over a container while being able to remove entries.
But because of the potential re-balance storing just the next entry is not
sufficient for an R/B tree to do that as far as I know.
Please explain how exactly you want to use this macro.
Thanks for the pointer, yes, this will not work on RB tree. We need a
reverse safe variant for use in the force_merge() function similar to the
list_for_each_entry_safe_reverse() macro in list.h. The reason is that
in force_merge(), we remove the block from the free tree before invoking
drm_buddy_free(), which merges and frees buddy blocks to form a larger
block.
So I don't much like these iterators; I've said so before. Either we
should introduce a properly threaded rb-tree (where the NULL child
pointers encode a linked list), or simply keep a list_head next to the
rb_node and use that.
I agree, something is clearly fishy here.
The rb_{next,prev}() things are O(ln n), in the worst case they do a
full traversal up the tree and a full traversal down the other branch.
Yeah from the logic that is exactly what is supposed to happen in the
__force_merge() function.
The question is rather why does that function exists in the first place? The
operation doesn't look logical to me.
For drm_buddy_reset_clear() and drm_buddy_fini() we should use
rbtree_postorder_for_each_entry_safe() instead.
And during normal allocation __force_merge() should never be used.
In normal allocation, the force_merge() function is used when no free
blocks of the requested order are available. In such cases,
smaller blocks must be merged on demand to satisfy the allocation.
Mainly, this does not involve traversing the entire tree to
merge all blocks, but only merging as needed. For example, if the
requested order is 6, and the minimum order is 5, drm_buddy_alloc_blocks()
will first attempt to allocate an order-6 block. If none are available,
it will try to allocate two order-5 blocks. If those are also
unavailable, it will
invoke force_merge() to merge lower order blocks (4, 3, 2, 1, 0) in
order to coalesce into a higher-order block of order 5.
In drm_buddy_fini(), force_merge() is called to ensure all blocks are
merged before tearing down the allocator. This guarantees that all
mm->roots are freed and not held by the driver at shutdown. If any
blocks remain allocated, drm_buddy_fini() will issue a warning.
In drm_buddy_reset_clear(), which is invoked at device suspend/resume,
it is an ideal place to call force_merge(). This ensures that all
possible blocks are merged before resetting the clear state, thereby
reducing fragmentation and improving allocation efficiency after resume.
I tried using this rbtree_postorder_for_each_entry_safe() macro in the
force_merge() and it works, but we also a need a reverse variant
since in normal allocation we dont want to disturb the lower addresses.
Thanks,
Arun.
That said; given 'next' will remain an existing node, only the 'pos'
node gets removed, rb_next() will still work correctly, even in the face
of rebalance.
Good to know!
Regards,
Christian.