On 26.02.2025 17:04, Roger Pau Monné wrote: > On Wed, Feb 26, 2025 at 03:36:33PM +0100, Jan Beulich wrote: >> On 26.02.2025 15:28, Roger Pau Monné wrote: >>> On Wed, Feb 26, 2025 at 03:08:33PM +0100, Jan Beulich wrote: >>>> On 26.02.2025 14:56, Roger Pau Monné wrote: >>>>> On Mon, Feb 24, 2025 at 01:27:24PM +0000, Alejandro Vallejo wrote: >>>>>> --- a/xen/common/page_alloc.c >>>>>> +++ b/xen/common/page_alloc.c >>>>>> @@ -490,13 +490,11 @@ static long outstanding_claims; /* total >>>>>> outstanding claims by all domains */ >>>>>> >>>>>> unsigned long domain_adjust_tot_pages(struct domain *d, long pages) >>>>>> { >>>>>> - long dom_before, dom_after, dom_claimed, sys_before, sys_after; >>>>>> - >>>>>> ASSERT(rspin_is_locked(&d->page_alloc_lock)); >>>>>> d->tot_pages += pages; >>>>>> >>>>>> /* >>>>>> - * can test d->claimed_pages race-free because it can only change >>>>>> + * can test d->outstanding_pages race-free because it can only >>>>>> change >>>>>> * if d->page_alloc_lock and heap_lock are both held, see also >>>>>> * domain_set_outstanding_pages below >>>>>> */ >>>>>> @@ -504,17 +502,16 @@ unsigned long domain_adjust_tot_pages(struct >>>>>> domain *d, long pages) >>>>>> goto out; >>>>> >>>>> I think you can probably short-circuit the logic below if pages == 0? >>>>> (and avoid taking the heap_lock) >>>> >>>> Are there callers passing in 0? >>> >>> Not sure, but if there are no callers expected we might add an ASSERT >>> to that effect then. >>> >>>>>> spin_lock(&heap_lock); >>>>>> - /* adjust domain outstanding pages; may not go negative */ >>>>>> - dom_before = d->outstanding_pages; >>>>>> - dom_after = dom_before - pages; >>>>>> - BUG_ON(dom_before < 0); >>>>>> - dom_claimed = dom_after < 0 ? 0 : dom_after; >>>>>> - d->outstanding_pages = dom_claimed; >>>>>> - /* flag accounting bug if system outstanding_claims would go >>>>>> negative */ >>>>>> - sys_before = outstanding_claims; >>>>>> - sys_after = sys_before - (dom_before - dom_claimed); >>>>>> - BUG_ON(sys_after < 0); >>>>>> - outstanding_claims = sys_after; >>>>>> + BUG_ON(outstanding_claims < d->outstanding_pages); >>>>>> + if ( pages > 0 && d->outstanding_pages < pages ) >>>>>> + { >>>>>> + /* `pages` exceeds the domain's outstanding count. Zero it out. >>>>>> */ >>>>>> + outstanding_claims -= d->outstanding_pages; >>>>>> + d->outstanding_pages = 0; >>>>>> + } else { >>>>>> + outstanding_claims -= pages; >>>>>> + d->outstanding_pages -= pages; >>>>> >>>>> I wonder if it's intentional for a pages < 0 value to modify >>>>> outstanding_claims and d->outstanding_pages, I think those values >>>>> should only be set from domain_set_outstanding_pages(). >>>>> domain_adjust_tot_pages() should only decrease the value, but never >>>>> increase either outstanding_claims or d->outstanding_pages. >>>>> >>>>> At best the behavior is inconsistent, because once >>>>> d->outstanding_pages reaches 0 there will be no further modification >>>>> from domain_adjust_tot_pages(). >>>> >>>> Right, at that point the claim has run out. While freeing pages with an >>>> active claim means that the claim gets bigger (which naturally needs >>>> reflecting in the global). >>> >>> domain_adjust_tot_pages() is not exclusively called when freeing >>> pages, see steal_page() for example. >> >> Or also when pages were allocated. steal_page() ... >> >>> When called from steal_page() it's wrong to increase the claim, as >>> it assumes that the page removed from d->tot_pages is freed, but >>> that's not the case. The domain might end up in a situation where >>> the claim is bigger than the available amount of memory. >> >> ... is a case that likely wasn't considered when the feature was added. >> >> I never really liked this; I'd be quite happy to see it ripped out, as >> long as we'd be reasonably certain it isn't in active use by people. > > What do you mean with 'it' in the above sentence, the whole claim > stuff?
Yes. > Or just getting rid of allowing the claim to increase as a > result of pages being removed from a domain? No. Jan