On Tue, Jun 30, 2020 at 8:09 PM Aneesh Kumar K.V
<aneesh.ku...@linux.ibm.com> wrote:
>
> On 7/1/20 1:15 AM, Dan Williams wrote:
> > On Tue, Jun 30, 2020 at 2:21 AM Aneesh Kumar K.V
> > <aneesh.ku...@linux.ibm.com> wrote:
> > [..]
> >>>> The bio argument isn't for range-based flushing, it is for flush
> >>>> operations that need to complete asynchronously.
> >>>
> >>> How does the block layer determine that the pmem device needs
> >>> asynchronous flushing?
> >>
> >> set_bit(ND_REGION_ASYNC, &ndr_desc.flags);
> >>
> >> and dax_synchronous(dev)
> >
> > Yes, but I think it is overkill to have an indirect function call just
> > for a single instruction.
> >
> > How about something like this instead, to share a common pmem_wmb()
> > across x86 and powerpc.
> >
> > diff --git a/drivers/nvdimm/region_devs.c b/drivers/nvdimm/region_devs.c
> > index 20ff30c2ab93..b14009060c83 100644
> > --- a/drivers/nvdimm/region_devs.c
> > +++ b/drivers/nvdimm/region_devs.c
> > @@ -1180,6 +1180,13 @@ int nvdimm_flush(struct nd_region *nd_region, struct bio *bio)
> >  {
> >          int rc = 0;
> >
> > +       /*
> > +        * pmem_wmb() is needed to 'sfence' all previous writes such
> > +        * that they are architecturally visible for the platform buffer
> > +        * flush.
> > +        */
> > +       pmem_wmb();
> > +
> >          if (!nd_region->flush)
> >                  rc = generic_nvdimm_flush(nd_region);
> >          else {
> > @@ -1206,17 +1213,14 @@ int generic_nvdimm_flush(struct nd_region *nd_region)
> >          idx = this_cpu_add_return(flush_idx, hash_32(current->pid + idx, 8));
> >
> >          /*
> > -        * The first wmb() is needed to 'sfence' all previous writes
> > -        * such that they are architecturally visible for the platform
> > -        * buffer flush. Note that we've already arranged for pmem
> > -        * writes to avoid the cache via memcpy_flushcache(). The final
> > -        * wmb() ensures ordering for the NVDIMM flush write.
> > +        * Note that we've already arranged for pmem writes to avoid the
> > +        * cache via memcpy_flushcache(). The final wmb() ensures
> > +        * ordering for the NVDIMM flush write.
> >          */
> > -       wmb();
>
> The series already converts this to pmem_wmb().
>
> >          for (i = 0; i < nd_region->ndr_mappings; i++)
> >                  if (ndrd_get_flush_wpq(ndrd, i, 0))
> >                          writeq(1, ndrd_get_flush_wpq(ndrd, i, idx));
> > -       wmb();
> > +       pmem_wmb();
>
> Should this be pmem_wmb()? This is ordering the above writeq(), right?
Correct, this can just be wmb().

> >
> >          return 0;
> >  }
> >
>
> This still results in two pmem_wmb() on platforms that don't have
> flush_wpq. I was trying to avoid that by adding an nd_region->flush
> callback.

How about skipping, or exiting early out of, generic_nvdimm_flush()
when ndrd->flush_wpq is NULL? That still saves an indirect branch at
the cost of another conditional, but that should still be worth it.