On Wed, Sep 30, 2015 at 5:34 AM, Richard Henderson <r...@twiddle.net> wrote:
> On 09/24/2015 06:32 PM, Alvise Rigo wrote:
>>
>> +    if (unlikely(!(te->addr_write & TLB_MMIO) && (te->addr_write & TLB_EXCL))) {
>> +        /* We are removing an exclusive entry, set the page to dirty. This
>> +         * is not be necessary if the vCPU has performed both SC and LL. */
>> +        hwaddr hw_addr = (env->iotlb[mmu_idx][index].addr & TARGET_PAGE_MASK) +
>> +                                              (te->addr_write & TARGET_PAGE_MASK);
>> +        cpu_physical_memory_set_excl_dirty(hw_addr, cpu->cpu_index);
>> +    }
>
> Um... this seems dangerous.
>
> (1) I don't see why EXCL support should differ whether MMIO is set or not.
> Either we support exclusive accesses on memory-mapped io like we do on ram
> (in which case this is wrong) or we don't (in which case this is
> unnecessary).

I was not sure whether we had to support MMIO memory as well. In theory
there shouldn't be any issues with including memory-mapped I/O; I will
consider it for the next version.
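If exclusive accesses end up being supported uniformly on RAM and
memory-mapped I/O, I suppose the hunk above would simply drop the TLB_MMIO
test, along these lines (an untested sketch, not what the patch currently
does):

    /* Only TLB_EXCL decides whether we are evicting an exclusive entry,
     * regardless of whether the page is also MMIO. */
    if (unlikely(te->addr_write & TLB_EXCL)) {
        hwaddr hw_addr = (env->iotlb[mmu_idx][index].addr & TARGET_PAGE_MASK)
                         + (te->addr_write & TARGET_PAGE_MASK);
        cpu_physical_memory_set_excl_dirty(hw_addr, cpu->cpu_index);
    }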
> (2) Doesn't this prevent a target from accessing a page during a ll/sc
> sequence that aliases within our trivial hash?  Such a case on arm might be
>
>     mov r3, #0x100000
>     ldrex r0, [r2]
>     ldr r1, [r2, r3]
>     add r0, r0, r1
>     strex r0, [r2]
>

I'm not sure I got it. When the CPU issues the ldrex, the page will be set
as "clean" (meaning that all the CPUs will then follow the slow path for
that page) and the exclusive range ([r2, r2+4] in this case) is stored in
the CPU state. The forced slow path is used to check whether a normal store
hits the exclusive range of any CPU; normal loads are not affected. I don't
see any problem in the code above, what am I missing?

> AFAIK, Alpha is the only target we have which specifies that any normal
> memory access during a ll+sc sequence may fail the sc.

I will dig into it, because I remember that the Alpha architecture behaves
like ARM in the handling of LDxL/STxC instructions.

> (3) I'm finding the "clean/dirty" words less helpful than they could be,
> especially since "clean" implies "some cpu has an excl lock on the page",
> which is reverse of what seems natural but understandable given the
> implementation.  Perhaps we could rename these helpers?

>> @@ -376,6 +392,28 @@ tb_page_addr_t get_page_addr_code(CPUArchState *env1, target_ulong addr)
>>      return qemu_ram_addr_from_host_nofail(p);
>>  }
>>
>> +/* Atomic insn translation TLB support. */
>> +#define EXCLUSIVE_RESET_ADDR ULLONG_MAX
>> +/* For every vCPU compare the exclusive address and reset it in case of a
>> + * match. Since only one vCPU is running at once, no lock has to be held to
>> + * guard this operation. */
>> +static inline void lookup_and_reset_cpus_ll_addr(hwaddr addr, hwaddr size)
>> +{
>> +    CPUState *cpu;
>> +    CPUArchState *acpu;
>> +
>> +    CPU_FOREACH(cpu) {
>> +        acpu = (CPUArchState *)cpu->env_ptr;
>> +
>> +        if (acpu->excl_protected_range.begin != EXCLUSIVE_RESET_ADDR &&
>> +            ranges_overlap(acpu->excl_protected_range.begin,
>> +                acpu->excl_protected_range.end - acpu->excl_protected_range.begin,
>> +                addr, size)) {

> Watch the indentation here... it ought to line up with the previous argument
> on the line above, just after the (.  This may require you split the
> subtract across the line too but that's ok.

OK, I will fix it.
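Something like this, I suppose (the same check, just re-indented, with the
subtraction split across lines):

    if (acpu->excl_protected_range.begin != EXCLUSIVE_RESET_ADDR &&
        ranges_overlap(acpu->excl_protected_range.begin,
                       acpu->excl_protected_range.end -
                       acpu->excl_protected_range.begin,
                       addr, size)) {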
>>  void dump_exec_info(FILE *f, fprintf_function cpu_fprintf);
>>  void dump_opcount_info(FILE *f, fprintf_function cpu_fprintf);
>> diff --git a/include/exec/cpu-defs.h b/include/exec/cpu-defs.h
>> index 98b9cff..a67f295 100644
>> --- a/include/exec/cpu-defs.h
>> +++ b/include/exec/cpu-defs.h
>> @@ -27,6 +27,7 @@
>>  #include <inttypes.h>
>>  #include "qemu/osdep.h"
>>  #include "qemu/queue.h"
>> +#include "qemu/range.h"
>>  #include "tcg-target.h"
>>  #ifndef CONFIG_USER_ONLY
>>  #include "exec/hwaddr.h"
>> @@ -150,5 +151,16 @@ typedef struct CPUIOTLBEntry {
>>  #define CPU_COMMON                                                      \
>>      /* soft mmu support */                                              \
>>      CPU_COMMON_TLB                                                      \
>>                                                                          \
>> +    /* Used by the atomic insn translation backend. */                  \
>> +    int ll_sc_context;                                                  \
>> +    /* vCPU current exclusive addresses range.
>> +     * The address is set to EXCLUSIVE_RESET_ADDR if the vCPU is not
>> +     * in the middle of a LL/SC. */                                     \
>> +    struct Range excl_protected_range;                                  \
>> +    /* Used to carry the SC result but also to flag a normal (legacy)
>> +     * store access made by a stcond (see softmmu_template.h). */       \
>> +    int excl_succeeded;                                                 \

> This seems to be required by softmmu_template.h?  In which case this must be
> in CPU_COMMON_TLB.
>
> Please expand on this "legacy store access" comment here.  I don't
> understand.

ACK.

>> +
>>
>>  #endif
>> diff --git a/softmmu_template.h b/softmmu_template.h
>> index d42d89d..e4431e8 100644
>> --- a/softmmu_template.h
>> +++ b/softmmu_template.h
>> @@ -409,19 +409,53 @@ void helper_le_st_name(CPUArchState *env, target_ulong addr, DATA_TYPE val,
>>          tlb_addr = env->tlb_table[mmu_idx][index].addr_write;
>>      }
>>
>> -    /* Handle an IO access.  */
>> +    /* Handle an IO access or exclusive access.  */
>>      if (unlikely(tlb_addr & ~TARGET_PAGE_MASK)) {
>> -        CPUIOTLBEntry *iotlbentry;
>> -        if ((addr & (DATA_SIZE - 1)) != 0) {
>> -            goto do_unaligned_access;
>> +        CPUIOTLBEntry *iotlbentry = &env->iotlb[mmu_idx][index];
>> +
>> +        if ((tlb_addr & ~TARGET_PAGE_MASK) == TLB_EXCL) {
>> +            /* The slow-path has been forced since we are writing to
>> +             * exclusive-protected memory. */
>> +            hwaddr hw_addr = (iotlbentry->addr & TARGET_PAGE_MASK) + addr;
>> +
>> +            /* The function lookup_and_reset_cpus_ll_addr could have reset the
>> +             * exclusive address. Fail the SC in this case.
>> +             * N.B.: Here excl_succeeded == 0 means that helper_le_st_name has
>> +             * not been called by a softmmu_llsc_template.h. */
>> +            if(env->excl_succeeded) {
>> +                if (env->excl_protected_range.begin != hw_addr) {
>> +                    /* The vCPU is SC-ing to an unprotected address. */
>> +                    env->excl_protected_range.begin = EXCLUSIVE_RESET_ADDR;
>> +                    env->excl_succeeded = 0;
>> +
>> +                    return;

> Huh?  How does a normal store ever fail.  Surely you aren't using the normal
> store path to support true store_conditional?

As written in the comment, we are not doing a normal store, but the store
of an SC operation.

>> +                }
>> +
>> +                cpu_physical_memory_set_excl_dirty(hw_addr, ENV_GET_CPU(env)->cpu_index);
>> +            }
>> +
>> +            haddr = addr + env->tlb_table[mmu_idx][index].addend;
>> +        #if DATA_SIZE == 1
>> +            glue(glue(st, SUFFIX), _p)((uint8_t *)haddr, val);
>> +        #else
>> +            glue(glue(st, SUFFIX), _le_p)((uint8_t *)haddr, val);
>> +        #endif

> You're assuming that TLB_EXCL can never have any other TLB_ bits set.  We

Yes, I was assuming this...

> definitely have to support TLB_EXCL+TLB_NOTDIRTY, and probably +TLB_MMIO too
> (iirc that's how watchpoints are implemented).

...but indeed I was wrong: we need to support these combinations as well. I
will address these points in the next version.

Thank you,
alvise

> r~
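For what it's worth, one possible shape for the combined-flags handling in
helper_le_st_name (an untested sketch, not part of this patch: TLB_EXCL is
tested as a single bit and then stripped, so that TLB_NOTDIRTY and TLB_MMIO,
e.g. for watchpoints, still reach their existing paths):

    if (unlikely(tlb_addr & ~TARGET_PAGE_MASK)) {
        if (tlb_addr & TLB_EXCL) {
            /* Exclusive-protected page: do the SC/dirty bookkeeping as in
             * the hunk above, then clear TLB_EXCL so that any remaining
             * special bits are still honoured below. */
            tlb_addr &= ~TLB_EXCL;
        }
        if (tlb_addr & ~TARGET_PAGE_MASK) {
            /* Existing TLB_NOTDIRTY / TLB_MMIO (IO access) handling. */
        }
    }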