Paolo Bonzini <pbonz...@redhat.com> writes:

> On 26/07/2016 14:09, Alex Bennée wrote:
>>
>> As the eventual operation is the setting of a flag I'm wondering if we
>> can simply use atomic primitives to ensure we don't corrupt the lookup
>> address when setting the TLB_NOTDIRTY flag?
>
> In theory tlb_reset_dirty and tlb_set_dirty1 can use atomic_* on
> tlb_entry->addr_write, but careful because:
>
> - you need to order reads and writes to tlb_entry->addr_write and
>   tlb_entry->addend properly
>
> - addr_write cannot be written atomically for 32-bit host/64-bit
>   target.  Probably you can use something like
>
>     union {
>         target_ulong addr_write;
>     #if TARGET_LONG_BITS == 32
>         struct { uint32_t lo_and_flags; } addr_write_w;
>     #elif defined HOST_WORDS_BIGENDIAN
>         struct { uint32_t hi, lo_and_flags; } addr_write_w;
>     #else
>         struct { uint32_t lo_and_flags, hi; } addr_write_w;
>     #endif
>     };
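To check I follow, the two sides would then look roughly like this. A
sketch only: the helper names are made up, the union is assumed to sit
anonymously inside CPUTLBEntry in cputlb.c, and atomic_read,
atomic_cmpxchg and smp_wmb are the existing qemu/atomic.h primitives:

    /* Foreign thread: only touch the flag-carrying low word.  Losing
     * a race with a concurrent refill at worst costs an unnecessary
     * trip through notdirty_mem_write, so a simple retry loop will do.
     * (tlb_entry_set_notdirty is a made-up name for illustration.)
     */
    static void tlb_entry_set_notdirty(CPUTLBEntry *te)
    {
        uint32_t old, new;

        do {
            old = atomic_read(&te->addr_write_w.lo_and_flags);
            new = old | TLB_NOTDIRTY;
        } while (atomic_cmpxchg(&te->addr_write_w.lo_and_flags,
                                old, new) != old);
    }

    /* Owning vCPU, refilling an entry: publish addend (and any other
     * fields) before addr_write, so a concurrent
     * tlb_entry_set_notdirty never flags a half-filled entry.  The
     * lookup side would need a matching smp_rmb, and on a 32-bit host
     * the final store itself needs care, per the union above.
     */
    static void tlb_entry_fill(CPUTLBEntry *te, target_ulong address,
                               uintptr_t addend)
    {
        te->addend = addend;
        smp_wmb();
        te->addr_write = address;
    }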
This will work, but I wonder if it is time to call it a day for
32-on-64 support? All of this can be worked around, but I wonder if it
is worth the effort if no one actually uses this combination. I might
prepare a patch for the next dev cycle to promote discussion.

>
> IIRC "foreign" accesses only set TLB_NOTDIRTY, so they can use a
> cmpxchg on lo_and_flags (worst case you end up with an unnecessary
> call to notdirty_mem_write).

Yeah, I used a cmpxchg in the RFC patch. AFAICT you wouldn't see a
change to addend that didn't involve addr_write changing, so as long as
addr_write was always the last thing written, things should be fine.

>
> - When removing TLB_NOTDIRTY from a TLB entry
>   (notdirty_mem_write/tlb_unprotect_code), as well as filling in a TLB
>   entry without TLB_NOTDIRTY (tlb_set_page_with_attrs) you need to
>   protect from a concurrent tb_alloc_page and hence take the tb_lock.

The TLB entry is also only ever filled (via tlb_set_page_with_attrs)
by the vCPU in question, so I just ensured the final addr_write was
calculated and written in one go at the end, after all the other
fields had been updated.

>
> In particular:
>
> - in notdirty_mem_write, care must be put in the ordering of
>   tb_invalidate_phys_page_fast (which itself calls tlb_unprotect_code
>   and takes the tb_lock in tb_invalidate_phys_page_range) and
>   tlb_set_dirty.  At least it seems to me that the call to
>   tb_invalidate_phys_page_fast should be after the write, but that's
>   not all.  Perhaps merge this part of notdirty_mem_write:
>
>     /* Set both VGA and migration bits for simplicity and to remove
>      * the notdirty callback faster.
>      */
>     cpu_physical_memory_set_dirty_range(ram_addr, size,
>                                         DIRTY_CLIENTS_NOCODE);
>     /* we remove the notdirty callback only if the code has been
>        flushed */
>     if (!cpu_physical_memory_is_clean(ram_addr)) {
>         tlb_set_dirty(current_cpu, current_cpu->mem_io_vaddr);
>     }
>
>   into tlb_unprotect_code?!?  Or perhaps do tlb_set_dirty _first_, and
>   then add back the callback if cpu_physical_memory_is_clean(ram_addr)
>   is true.  I haven't put much thought into it.

I'll have a closer look at these bits.

>
> - tlb_set_page_with_attrs is also hard-ish to get right, but perhaps
>   the same idea of adding the callback last would work:
>
>     /* First set addr_write so that concurrent tlb_reset_dirty_range
>      * finds a match.
>      */
>     te->addr_write = address;
>     if (memory_region_is_ram(section->mr)) {
>         if (cpu_physical_memory_is_clean(
>                 memory_region_get_ram_addr(section->mr) + xlat)) {
>             te->addr_write = address | TLB_NOTDIRTY;
>         }
>     }

Why not just write addr_write completely at the end?

>
> Paolo
>
>> Of course the TLB structure itself covers a number of values but
>> AFAICT erroneously setting TLB_NOTDIRTY on an entry that gets updated
>> to a new address wouldn't cause a problem except triggering an
>> additional slow-path write.  If we are careful about the filling of
>> the TLB entries can we be sure we are always safe?

--
Alex Bennée