On Fri, Apr 13, 2018 at 17:29:20 -1000, Richard Henderson wrote:
> On 04/05/2018 04:13 PM, Emilio G. Cota wrote:
(snip)
> > +struct page_collection {
> > +    GTree *tree;
> > +    struct page_entry *max;
> > +};
> 
> I don't understand the motivation for this data structure. Substituting one
> tree for another does not, on the face of it, seem to be a win.
> 
> Given that your locking order is based on the physical address, I can
> understand that the sequential virtual addresses that these routines are given
> is not appropriate. But surely you should know how many pages are involved,
> and therefore be able to allocate a flat array to hold the PageDesc.
> 
> > +/*
> > + * Lock a range of pages ([@start,@end[) as well as the pages of all
> > + * intersecting TBs.
> > + * Locking order: acquire locks in ascending order of page index.
> > + */
> 
> I don't think I understand this either. From whence do you wind up with a
> range of physical addresses?
For instance in tb_invalidate_phys_page_range. We need to invalidate all
TBs associated with a range of phys addresses. I am not sure how an array
would make things easier, since we need to lock the pages in the given
range, as well as the pages that overlap with the TBs in said range (since
we'll invalidate the TBs). For example, if we have to invalidate all TBs
in the range A-E, it is possible that a TB in page C will overlap with
page K (not in the original range), so we'll have to lock page K as well.
All of this needs to be done in order, that is, A-E,K. If we had an array,
we'd have to resize the array anytime we had an out-of-range page, and
then do a binary search in the array to check whether we already locked
that page. At this point we'd be reinventing a binary tree, so it seems
simpler to just use a tree.

> > +struct page_collection *
> > +page_collection_lock(tb_page_addr_t start, tb_page_addr_t end)
> > ...
> 
> > +    /*
> > +     * Add the TB to the page list.
> > +     * To avoid deadlock, acquire first the lock of the lower-addressed page.
> > +     */
> > +    p = page_find_alloc(phys_pc >> TARGET_PAGE_BITS, 1);
> > +    if (likely(phys_page2 == -1)) {
> >          tb->page_addr[1] = -1;
> > +        page_lock(p);
> > +        tb_page_add(p, tb, 0, phys_pc & TARGET_PAGE_MASK);
> > +    } else {
> > +        p2 = page_find_alloc(phys_page2 >> TARGET_PAGE_BITS, 1);
> > +        if (phys_pc < phys_page2) {
> > +            page_lock(p);
> > +            page_lock(p2);
> > +        } else {
> > +            page_lock(p2);
> > +            page_lock(p);
> > +        }
> 
> Extract this as a helper for use here and page_lock_tb?

Done. Alex already suggested this when reviewing v1; I should have done it
then instead of resisting. Fixup appended.

> >  /*
> >   * Invalidate all TBs which intersect with the target physical address range
> > + * [start;end[. NOTE: start and end must refer to the *same* physical page.
> > + * 'is_cpu_write_access' should be true if called from a real cpu write
> > + * access: the virtual CPU will exit the current TB if code is modified inside
> > + * this TB.
> > + *
> > + * Called with tb_lock/mmap_lock held for user-mode emulation
> > + * Called with tb_lock held for system-mode emulation
> > + */
> > +void tb_invalidate_phys_page_range(tb_page_addr_t start, tb_page_addr_t end,
> > +                                   int is_cpu_write_access)
> 
> FWIW, we should probably notice and optimize end = start + 1, which appears to
> have the largest number of users for e.g. watchpoints.

This is also the case when booting linux (~99% of cases). Once we agree on
the correctness of the whole thing we can look into making the common case
faster, if necessary.
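
To make the tree-vs-array point above a bit more concrete, here is a rough,
untested sketch of what the tree buys us: a set of pages keyed by page index,
with a cheap "have we already added this page?" check plus in-order traversal
when it is time to take the locks. This is only an illustration; the names
(page_entry, page_set_add, ...) and the collect-then-lock sequencing here are
mine, not the patch's, which does the locking as it goes:

/* Illustrative sketch only -- not the actual QEMU code. */
#include <glib.h>

struct page_entry {
    unsigned long index;   /* page index; stand-in for phys_addr >> TARGET_PAGE_BITS */
    /* in the real thing: a PageDesc pointer, lock state, ... */
};

/* sort by page index, i.e. by lock acquisition order */
static gint page_entry_cmp(gconstpointer a, gconstpointer b, gpointer udata)
{
    const struct page_entry *p1 = a;
    const struct page_entry *p2 = b;

    if (p1->index < p2->index) {
        return -1;
    }
    return p1->index > p2->index;
}

static GTree *page_set_new(void)
{
    /* key == value, so only free the entry once (as the value) */
    return g_tree_new_full(page_entry_cmp, NULL, NULL, g_free);
}

/* Track a page; adding a page we have already seen is a no-op. */
static void page_set_add(GTree *set, unsigned long index)
{
    struct page_entry key = { .index = index };
    struct page_entry *pe;

    if (g_tree_lookup(set, &key)) {
        return;
    }
    pe = g_new0(struct page_entry, 1);
    pe->index = index;
    g_tree_insert(set, pe, pe);
}

static gboolean lock_one_page(gpointer key, gpointer value, gpointer udata)
{
    struct page_entry *pe = value;

    /* page_lock() of pe's page would go here; g_tree_foreach visits keys in order */
    (void)pe;
    return FALSE;              /* FALSE == keep iterating */
}

/* Acquire all tracked page locks in ascending order of page index. */
static void page_set_lock_all(GTree *set)
{
    g_tree_foreach(set, lock_one_page, NULL);
}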
Thanks,

		Emilio

---
diff --git a/accel/tcg/translate-all.c b/accel/tcg/translate-all.c
index f6ff087..9b21c1a 100644
--- a/accel/tcg/translate-all.c
+++ b/accel/tcg/translate-all.c
@@ -549,6 +549,9 @@ static inline PageDesc *page_find(tb_page_addr_t index)
     return page_find_alloc(index, 0);
 }
 
+static void page_lock_pair(PageDesc **ret_p1, tb_page_addr_t phys1,
+                           PageDesc **ret_p2, tb_page_addr_t phys2, int alloc);
+
 /* In user-mode page locks aren't used; mmap_lock is enough */
 #ifdef CONFIG_USER_ONLY
 
@@ -682,17 +685,7 @@ static inline void page_unlock(PageDesc *pd)
 /* lock the page(s) of a TB in the correct acquisition order */
 static inline void page_lock_tb(const TranslationBlock *tb)
 {
-    if (likely(tb->page_addr[1] == -1)) {
-        page_lock(page_find(tb->page_addr[0] >> TARGET_PAGE_BITS));
-        return;
-    }
-    if (tb->page_addr[0] < tb->page_addr[1]) {
-        page_lock(page_find(tb->page_addr[0] >> TARGET_PAGE_BITS));
-        page_lock(page_find(tb->page_addr[1] >> TARGET_PAGE_BITS));
-    } else {
-        page_lock(page_find(tb->page_addr[1] >> TARGET_PAGE_BITS));
-        page_lock(page_find(tb->page_addr[0] >> TARGET_PAGE_BITS));
-    }
+    page_lock_pair(NULL, tb->page_addr[0], NULL, tb->page_addr[1], 0);
 }
 
 static inline void page_unlock_tb(const TranslationBlock *tb)
@@ -871,6 +864,33 @@ void page_collection_unlock(struct page_collection *set)
 
 #endif /* !CONFIG_USER_ONLY */
 
+static void page_lock_pair(PageDesc **ret_p1, tb_page_addr_t phys1,
+                           PageDesc **ret_p2, tb_page_addr_t phys2, int alloc)
+{
+    PageDesc *p1, *p2;
+
+    g_assert(phys1 != -1 && phys1 != phys2);
+    p1 = page_find_alloc(phys1 >> TARGET_PAGE_BITS, alloc);
+    if (ret_p1) {
+        *ret_p1 = p1;
+    }
+    if (likely(phys2 == -1)) {
+        page_lock(p1);
+        return;
+    }
+    p2 = page_find_alloc(phys2 >> TARGET_PAGE_BITS, alloc);
+    if (ret_p2) {
+        *ret_p2 = p2;
+    }
+    if (phys1 < phys2) {
+        page_lock(p1);
+        page_lock(p2);
+    } else {
+        page_lock(p2);
+        page_lock(p1);
+    }
+}
+
 #if defined(CONFIG_USER_ONLY)
 /* Currently it is not recommended to allocate big chunks of data in user
    mode. It will change when a dedicated libc will be used. */
@@ -1600,22 +1620,12 @@ tb_link_page(TranslationBlock *tb, tb_page_addr_t phys_pc,
      * Note that inserting into the hash table first isn't an option, since
      * we can only insert TBs that are fully initialized.
      */
-    p = page_find_alloc(phys_pc >> TARGET_PAGE_BITS, 1);
-    if (likely(phys_page2 == -1)) {
-        tb->page_addr[1] = -1;
-        page_lock(p);
-        tb_page_add(p, tb, 0, phys_pc & TARGET_PAGE_MASK);
-    } else {
-        p2 = page_find_alloc(phys_page2 >> TARGET_PAGE_BITS, 1);
-        if (phys_pc < phys_page2) {
-            page_lock(p);
-            page_lock(p2);
-        } else {
-            page_lock(p2);
-            page_lock(p);
-        }
-        tb_page_add(p, tb, 0, phys_pc & TARGET_PAGE_MASK);
+    page_lock_pair(&p, phys_pc, &p2, phys_page2, 1);
+    tb_page_add(p, tb, 0, phys_pc & TARGET_PAGE_MASK);
+    if (p2) {
         tb_page_add(p2, tb, 1, phys_page2);
+    } else {
+        tb->page_addr[1] = -1;
     }
 
     /* add in the hash table */