On Sun, Jan 22, 2017 at 04:51:18PM +0800, Peter Xu wrote: > On Sun, Jan 22, 2017 at 03:56:10PM +0800, Jason Wang wrote: > > [...] > > > >+/** > > >+ * vtd_page_walk_level - walk over specific level for IOVA range > > >+ * > > >+ * @addr: base GPA addr to start the walk > > >+ * @start: IOVA range start address > > >+ * @end: IOVA range end address (start <= addr < end) > > >+ * @hook_fn: hook func to be called when detected page > > >+ * @private: private data to be passed into hook func > > >+ * @read: whether parent level has read permission > > >+ * @write: whether parent level has write permission > > >+ * @skipped: accumulated skipped ranges > > > > What's the usage for this parameter? Looks like it was never used in this > > series. > > This was for debugging purpose before, and I kept it in case one day > it can be used again, considering that will not affect much on the > overall performance. > > > > > >+ * @notify_unmap: whether we should notify invalid entries > > >+ */ > > >+static int vtd_page_walk_level(dma_addr_t addr, uint64_t start, > > >+ uint64_t end, vtd_page_walk_hook hook_fn, > > >+ void *private, uint32_t level, > > >+ bool read, bool write, uint64_t *skipped, > > >+ bool notify_unmap) > > >+{ > > >+ bool read_cur, write_cur, entry_valid; > > >+ uint32_t offset; > > >+ uint64_t slpte; > > >+ uint64_t subpage_size, subpage_mask; > > >+ IOMMUTLBEntry entry; > > >+ uint64_t iova = start; > > >+ uint64_t iova_next; > > >+ uint64_t skipped_local = 0; > > >+ int ret = 0; > > >+ > > >+ trace_vtd_page_walk_level(addr, level, start, end); > > >+ > > >+ subpage_size = 1ULL << vtd_slpt_level_shift(level); > > >+ subpage_mask = vtd_slpt_level_page_mask(level); > > >+ > > >+ while (iova < end) { > > >+ iova_next = (iova & subpage_mask) + subpage_size; > > >+ > > >+ offset = vtd_iova_level_offset(iova, level); > > >+ slpte = vtd_get_slpte(addr, offset); > > >+ > > >+ /* > > >+ * When one of the following case happens, we assume the whole > > >+ * range is invalid: > > >+ * > > >+ * 1. read block failed > > > > Don't get the meaning (and don't see any code relate to this comment). > > I took above vtd_get_slpte() a "read", so I was trying to mean that we > failed to read the SLPTE due to some reason, we assume the range is > invalid. > > > > > >+ * 3. reserved area non-zero > > >+ * 2. both read & write flag are not set > > > > Should be 1,2,3? And the above comment is conflict with the code at least > > when notify_unmap is true. > > Yes, okay I don't know why 3 jumped. :( > > And yes, I should mention that "both read & write flag not set" only > suites for page tables here. > > > > > >+ */ > > >+ > > >+ if (slpte == (uint64_t)-1) { > > > > If this is true, vtd_slpte_nonzero_rsvd(slpte) should be true too I think? > > Yes, but we are doing two checks here: > > - checking against -1 to make sure slpte read success > - checking against nonzero reserved fields to make sure it follows spec > > Imho we should not skip the first check here, only if one day removing > this may really matter (e.g., for performance reason? I cannot think > of one yet). > > > > > >+ trace_vtd_page_walk_skip_read(iova, iova_next); > > >+ skipped_local++; > > >+ goto next; > > >+ } > > >+ > > >+ if (vtd_slpte_nonzero_rsvd(slpte, level)) { > > >+ trace_vtd_page_walk_skip_reserve(iova, iova_next); > > >+ skipped_local++; > > >+ goto next; > > >+ } > > >+ > > >+ /* Permissions are stacked with parents' */ > > >+ read_cur = read && (slpte & VTD_SL_R); > > >+ write_cur = write && (slpte & VTD_SL_W); > > >+ > > >+ /* > > >+ * As long as we have either read/write permission, this is > > >+ * a valid entry. The rule works for both page or page tables. > > >+ */ > > >+ entry_valid = read_cur | write_cur; > > >+ > > >+ if (vtd_is_last_slpte(slpte, level)) { > > >+ entry.target_as = &address_space_memory; > > >+ entry.iova = iova & subpage_mask; > > >+ /* > > >+ * This might be meaningless addr if (!read_cur && > > >+ * !write_cur), but after all this field will be > > >+ * meaningless in that case, so let's share the code to > > >+ * generate the IOTLBs no matter it's an MAP or UNMAP > > >+ */ > > >+ entry.translated_addr = vtd_get_slpte_addr(slpte); > > >+ entry.addr_mask = ~subpage_mask; > > >+ entry.perm = IOMMU_ACCESS_FLAG(read_cur, write_cur); > > >+ if (!entry_valid && !notify_unmap) { > > >+ trace_vtd_page_walk_skip_perm(iova, iova_next); > > >+ skipped_local++; > > >+ goto next; > > >+ } > > > > Under which case should we care about unmap here (better with a comment I > > think)? > > When PSIs are for invalidation, rather than newly mapped entries. In > that case, notify_unmap will be true, and here we need to notify > IOMMU_NONE to do the cache flush or unmap. > > (this page walk is not only for replaying, it is for cache flushing as > well) > > Do you have suggestion on the comments?
Besides this one, I tried to fix the comments in this function as below, hope this is better (I removed 1-3 thing since I think that's clearer from below code): diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c index e958f53..f3fe8c4 100644 --- a/hw/i386/intel_iommu.c +++ b/hw/i386/intel_iommu.c @@ -735,15 +735,6 @@ static int vtd_page_walk_level(dma_addr_t addr, uint64_t start, offset = vtd_iova_level_offset(iova, level); slpte = vtd_get_slpte(addr, offset); - /* - * When one of the following case happens, we assume the whole - * range is invalid: - * - * 1. read block failed - * 3. reserved area non-zero - * 2. both read & write flag are not set - */ - if (slpte == (uint64_t)-1) { trace_vtd_page_walk_skip_read(iova, iova_next); skipped_local++; @@ -761,20 +752,16 @@ static int vtd_page_walk_level(dma_addr_t addr, uint64_t start, write_cur = write && (slpte & VTD_SL_W); /* - * As long as we have either read/write permission, this is - * a valid entry. The rule works for both page or page tables. + * As long as we have either read/write permission, this is a + * valid entry. The rule works for both page entries and page + * table entries. */ entry_valid = read_cur | write_cur; if (vtd_is_last_slpte(slpte, level)) { entry.target_as = &address_space_memory; entry.iova = iova & subpage_mask; - /* - * This might be meaningless addr if (!read_cur && - * !write_cur), but after all this field will be - * meaningless in that case, so let's share the code to - * generate the IOTLBs no matter it's an MAP or UNMAP - */ + /* NOTE: this is only meaningful if entry_valid == true */ entry.translated_addr = vtd_get_slpte_addr(slpte); entry.addr_mask = ~subpage_mask; entry.perm = IOMMU_ACCESS_FLAG(read_cur, write_cur); Thanks, -- peterx