On 07/08/2015 19:03, Alvise Rigo wrote:
> +static inline int cpu_physical_memory_excl_atleast_one_clean(ram_addr_t addr)
> +{
> +    unsigned long *bitmap = ram_list.dirty_memory[DIRTY_MEMORY_EXCLUSIVE];
> +    unsigned long next, end;
> +
> +    if (likely(smp_cpus <= BITS_PER_LONG)) {

This only works if smp_cpus divides BITS_PER_LONG, i.e. BITS_PER_LONG %
smp_cpus == 0; otherwise the smp_cpus bits for an address can straddle a
word boundary, and the single-word shift below only sees part of them.

> +        unsigned long mask = (1 << smp_cpus) - 1;
> +
> +        return
> +            (mask & (bitmap[BIT_WORD(EXCL_BITMAP_GET_OFFSET(addr))] >>
> +            (EXCL_BITMAP_GET_OFFSET(addr) & (BITS_PER_LONG-1)))) != mask;
> +    }
> +
> +    end = BIT_WORD(EXCL_BITMAP_GET_OFFSET(addr)) + smp_cpus;
> +    next = find_next_zero_bit(bitmap, end,
> +                              BIT_WORD(EXCL_BITMAP_GET_OFFSET(addr)));
> +
> +    return next < end;
> +static inline int cpu_physical_memory_excl_is_dirty(ram_addr_t addr,
> +                                                    unsigned long cpu)
> +{
> +    unsigned long *bitmap = ram_list.dirty_memory[DIRTY_MEMORY_EXCLUSIVE];
> +    unsigned long end, next;
> +    uint32_t add;
> +
> +    assert(cpu <= smp_cpus);
> +
> +    if (likely(smp_cpus <= BITS_PER_LONG)) {
> +        cpu = (cpu == smp_cpus) ? (1 << cpu) - 1 : (1 << cpu);
> +
> +        return cpu & (bitmap[BIT_WORD(EXCL_BITMAP_GET_OFFSET(addr))] >>
> +            (EXCL_BITMAP_GET_OFFSET(addr) & (BITS_PER_LONG-1)));
> +    }
> +
> +    add = (cpu == smp_cpus) ? 0 : 1;

Why not have a separate function for the cpu == smp_cpus case?

I don't think real hardware has ll/sc per CPU.  Can we have the bitmap as:

- 0 if one or more CPUs have the address set to exclusive, _and_ no CPU
  has done a concurrent access

- 1 if no CPUs have the address set to exclusive, _or_ one CPU has done
  a concurrent access.

Then (see the sketch at the end of this mail):

- ll sets the bit to 0, and requests a flush if it was 1

- when setting a TLB entry, set it to TLB_EXCL if the bitmap has 0

- in the TLB_EXCL slow path, set the bit to 1 and, for conditional
  stores, succeed if the bit was 0

- when removing an exclusive entry, set the bit to 1

Paolo
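
To make the scheme above concrete, here is a minimal standalone sketch of
the four operations.  None of these names come from the patch or from QEMU
proper; C11 atomics stand in for QEMU's bitmap helpers, and one int per page
replaces a packed dirty bitmap:

    /* One entry per guest page:
     * 1 = no CPU holds the page exclusive (or someone wrote to it),
     * 0 = at least one CPU holds it exclusive and nobody wrote since. */
    #include <stdatomic.h>
    #include <stdbool.h>
    #include <stdint.h>

    #define EXCL_PAGES 1024

    static atomic_int excl_bitmap[EXCL_PAGES];

    static void excl_init(void)
    {
        for (int i = 0; i < EXCL_PAGES; i++) {
            atomic_store(&excl_bitmap[i], 1);   /* initially nothing is exclusive */
        }
    }

    /* ll: claim the page; report whether the other TLBs still need a flush
     * (i.e. the bit was still 1). */
    static bool excl_start(uint64_t page)
    {
        return atomic_exchange(&excl_bitmap[page], 0) != 0;
    }

    /* TLB fill: give the new entry TLB_EXCL treatment only while the bit is 0. */
    static bool excl_page_is_exclusive(uint64_t page)
    {
        return atomic_load(&excl_bitmap[page]) == 0;
    }

    /* TLB_EXCL slow path: every store sets the bit; a conditional store
     * succeeds only if the bit was still 0 when it got there. */
    static bool excl_store(uint64_t page, bool store_conditional)
    {
        int old = atomic_exchange(&excl_bitmap[page], 1);
        return store_conditional ? old == 0 : true;
    }

    /* Removing an exclusive entry: the page is no longer exclusive. */
    static void excl_end(uint64_t page)
    {
        atomic_store(&excl_bitmap[page], 1);
    }

Since the bit no longer records which CPU made the reservation, a store by
any CPU invalidates every outstanding reservation on the page; that is more
conservative than the per-CPU bitmap in the patch, but ll/sc is allowed to
fail spuriously, so it stays correct.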