On Tue, 24 Sep 2019 at 18:22, Richard Henderson
<richard.hender...@linaro.org> wrote:
>
> On 9/10/19 2:56 AM, Beata Michalska wrote:
> > @@ -2229,7 +2229,8 @@ static inline uint64_t cpreg_to_kvm_id(uint32_t cpregid)
> >  #define ARM_CP_NZCV              (ARM_CP_SPECIAL | 0x0300)
> >  #define ARM_CP_CURRENTEL         (ARM_CP_SPECIAL | 0x0400)
> >  #define ARM_CP_DC_ZVA            (ARM_CP_SPECIAL | 0x0500)
> > -#define ARM_LAST_SPECIAL         ARM_CP_DC_ZVA
> > +#define ARM_CP_DC_CVAP           (ARM_CP_SPECIAL | 0x0600)
> > +#define ARM_LAST_SPECIAL         ARM_CP_DC_CVAP
>
> I don't see that this operation needs to be handled via "special". It's a
> function call upon write, as for many other system registers.
>

Too inspired by ZVA I guess. Will make the appropriate changes in the
next version.
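Presumably something along these lines (rough sketch only; dccvap_writefn
being a regular write handler that takes over what the helper does now):

    { .name = "DC_CVAP", .state = ARM_CP_STATE_AA64,
      .opc0 = 1, .opc1 = 3, .crn = 7, .crm = 12, .opc2 = 1,
      .access = PL0_W, .accessfn = aa64_cacheop_access,
      .writefn = dccvap_writefn },

so the ARM_CP_DC_CVAP special type (and the extra dispatch it implies) can
be dropped entirely.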
> > +static inline bool isar_feature_aa64_dcpop(const ARMISARegisters *id)
> > +{
> > +    return FIELD_EX64(id->id_aa64isar1, ID_AA64ISAR1, DPB) != 0;
> > +}
> > +
> > +static inline bool isar_feature_aa64_dcpodp(const ARMISARegisters *id)
> > +{
> > +    return (FIELD_EX64(id->id_aa64isar1, ID_AA64ISAR1, DPB) >> 1) != 0;
> > +}
>
> The correct check is FIELD_EX(...) >= 2. This is a 4-bit field, as with
> all of the others.
>

Noted.

> > +static CPAccessResult aa64_cacheop_persist_access(CPUARMState *env,
> > +                                                  const ARMCPRegInfo *ri,
> > +                                                  bool isread)
> > +{
> > +    ARMCPU *cpu = env_archcpu(env);
> > +    /*
> > +     * Access is UNDEF if lacking implementation for either DC CVAP or DC CVADP
> > +     * DC CVAP  -> CRm: 0xc
> > +     * DC CVADP -> CRm: 0xd
> > +     */
> > +    return (ri->crm == 0xc && !cpu_isar_feature(aa64_dcpop, cpu)) ||
> > +           (ri->crm == 0xd && !cpu_isar_feature(aa64_dcpodp, cpu))
> > +           ? CP_ACCESS_TRAP_UNCATEGORIZED
> > +           : aa64_cacheop_access(env, ri, isread);
> > +}
> ...
> > +    { .name = "DC_CVAP", .state = ARM_CP_STATE_AA64,
> > +      .opc0 = 1, .opc1 = 3, .crn = 7, .crm = 12, .opc2 = 1,
> > +      .access = PL0_W, .type = ARM_CP_DC_CVAP,
> > +      .accessfn = aa64_cacheop_persist_access },
> > +    { .name = "DC_CVADP", .state = ARM_CP_STATE_AA64,
> > +      .opc0 = 1, .opc1 = 3, .crn = 7, .crm = 13, .opc2 = 1,
> > +      .access = PL0_W, .type = ARM_CP_DC_CVAP,
> > +      .accessfn = aa64_cacheop_persist_access },
>
> While this access function works, it's better to simply not register these
> at all when they're not supported. Compare the registration of rndr_reginfo.
>
> As I described above, I think this can use a normal write function. In
> which case this would use .type = ARM_CP_NO_RAW | ARM_CP_SUPPRESS_TB_END.
>
> (I believe that ARM_CP_IO is not required, but I'm not 100% sure. Certainly
> there does not seem to be anything in dc_cvap() that affects state which can
> be queried from another virtual cpu, so there does not appear to be any
> reason to grab the global i/o lock. The host kernel should be just fine with
> concurrent msync syscalls, or whatever it is that libpmem uses.)
>

OK, will move that to conditional registration and double-check the type.
Thanks for the suggestion.

> > +void HELPER(dc_cvap)(CPUARMState *env, uint64_t vaddr_in)
> > +{
> > +#ifndef CONFIG_USER_ONLY
> > +    ARMCPU *cpu = env_archcpu(env);
> > +    /* CTR_EL0 System register -> DminLine, bits [19:16] */
> > +    uint64_t dline_size = 4 << ((cpu->ctr >> 16) & 0xF);
> > +    uint64_t vaddr = vaddr_in & ~(dline_size - 1);
> > +    void *haddr;
> > +    int mem_idx = cpu_mmu_index(env, false);
> > +
> > +    /* This won't be crossing page boundaries */
> > +    haddr = probe_read(env, vaddr, dline_size, mem_idx, GETPC());
> > +    if (haddr) {
> > +
> > +        ram_addr_t offset;
> > +        MemoryRegion *mr;
> > +
> > +        /*
> > +         * RCU critical section + ref counting,
> > +         * so that MR won't disappear behind the scene
> > +         */
> > +        rcu_read_lock();
> > +        mr = memory_region_from_host(haddr, &offset);
> > +        if (mr) {
> > +            memory_region_ref(mr);
> > +        }
> > +        rcu_read_unlock();
> > +
> > +        if (mr) {
> > +            memory_region_do_writeback(mr, offset, dline_size);
> > +            memory_region_unref(mr);
> > +        }
> > +    }
> > +#endif
>
> We hold the rcu lock whenever a TB is executing. I don't believe there's
> any point in increasing the lock count here. Similarly with memory_region
> refcounts -- they cannot vanish while we're executing a TB.
>
> Thus I believe that about half of this function can fold away.

So I was chasing the wrong locking here... Indeed, if the RCU lock is
already held, I can safely drop the locking here.
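To sum up the plan for v2 then. The DC CVADP check gets the 4-bit field
treatment:

static inline bool isar_feature_aa64_dcpodp(const ARMISARegisters *id)
{
    return FIELD_EX64(id->id_aa64isar1, ID_AA64ISAR1, DPB) >= 2;
}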
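Registration moves to the rndr_reginfo style (sketch; dcpop_reg and
dcpodp_reg being standalone ARMCPRegInfo entries for DC CVAP and DC CVADP):

    if (cpu_isar_feature(aa64_dcpop, cpu)) {
        define_one_arm_cp_reg(cpu, &dcpop_reg);
        if (cpu_isar_feature(aa64_dcpodp, cpu)) {
            define_one_arm_cp_reg(cpu, &dcpodp_reg);
        }
    }

which should also let aa64_cacheop_persist_access go away -- unregistered
encodings UNDEF on their own, and the plain aa64_cacheop_access still covers
the EL0 trap case.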
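And with the locking and ref counting folded away, the helper body can move
straight into the write function, roughly (untested; system emulation case
only, the CONFIG_USER_ONLY side left out here):

static void dccvap_writefn(CPUARMState *env, const ARMCPRegInfo *opaque,
                           uint64_t value)
{
    ARMCPU *cpu = env_archcpu(env);
    /* CTR_EL0 System register -> DminLine, bits [19:16] */
    uint64_t dline_size = 4 << ((cpu->ctr >> 16) & 0xF);
    uint64_t vaddr = value & ~(dline_size - 1);
    void *haddr;
    int mem_idx = cpu_mmu_index(env, false);

    /* This won't be crossing page boundaries */
    haddr = probe_read(env, vaddr, dline_size, mem_idx, GETPC());
    if (haddr) {
        ram_addr_t offset;
        MemoryRegion *mr;

        /* RCU lock is already held while the TB executes */
        mr = memory_region_from_host(haddr, &offset);
        if (mr) {
            memory_region_do_writeback(mr, offset, dline_size);
        }
    }
}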
> r~

Thank you for the review,

BR
Beata