Andrea Arcangeli wrote:
So I never found a relation between the reported symptom of VM kernel
threads going weird and KVM's optimal handling of kmap ptes.
The problem is this code:
static int scan_active_list(struct zone_struct *zone, int age,
			    struct list_head *list)
{
	struct list_head *page_lru, *next;
	struct page *page;
	int over_rsslimit;

	/* Take the lock while messing with the list... */
	lru_lock(zone);
	list_for_each_safe(page_lru, next, list) {
		page = list_entry(page_lru, struct page, lru);
		pte_chain_lock(page);
		if (page_referenced(page, &over_rsslimit) && !over_rsslimit)
			age_page_up_nolock(page, age);
		pte_chain_unlock(page);
	}
	lru_unlock(zone);
	return 0;
}
If the pages in the list are in the same order as in the ptes (which is
very likely), then we have the following access pattern:
- set up kmap to point at pte
- test_and_clear_bit(pte)
- kunmap
From KVM's point of view this looks like:
- several accesses to set up the kmap
  - if these accesses trigger flooding, we will have to tear down the
    shadow for this page, only to set it up again soon
- an access to the pte (emulated)
  - if this access _doesn't_ trigger flooding, we will have 512 unneeded
    emulations. The pte is worthless anyway since the accessed bit is
    clear (so we can't set up a shadow pte for it)
  - this bug was fixed
- an access to tear down the kmap
[btw, am I reading this right? Is the entire list scanned each time?
If you have 1G of active HIGHMEM, that's a quarter of a million pages,
which would take at least a second no matter what we do. VMware can
probably special-case kmaps, but we can't.]
--
Do not meddle in the internals of kernels, for they are subtle and quick to
panic.