On Sat, Aug 04, 2018 at 05:14:31AM -0700, John Baldwin wrote:
> On 8/4/18 1:08 AM, Konstantin Belousov wrote:
> > On Fri, Aug 03, 2018 at 08:04:06PM +0000, Justin Hibbits wrote:
> >> Author: jhibbits
> >> Date: Fri Aug 3 20:04:06 2018
> >> New Revision: 337273
> >> URL: https://svnweb.freebsd.org/changeset/base/337273
> >>
> >> Log:
> >>   nvme(4): Add bus_dmamap_sync() at the end of the request path
> >>
> >>   Summary:
> >>   Some architectures, in this case powerpc64, need explicit synchronization
> >>   barriers vs device accesses.
> >>
> >>   Prior to this change, when running 'make buildworld -j72' on an 18-core
> >>   (72-thread) POWER9, I would see controller resets often.  With this change,
> >>   I don't see these reset messages, though another tester still does, for yet
> >>   to be determined reasons, so this may not be a complete fix.  Additionally,
> >>   I see a ~5-10% speed up in buildworld times, likely due to not needing to
> >>   reset the controller.
> >>
> >>   Reviewed By: jimharris
> >>   Differential Revision: https://reviews.freebsd.org/D16570
> >>
> >> Modified:
> >>   head/sys/dev/nvme/nvme_qpair.c
> >>
> >> Modified: head/sys/dev/nvme/nvme_qpair.c
> >> ==============================================================================
> >> --- head/sys/dev/nvme/nvme_qpair.c	Fri Aug 3 19:24:04 2018	(r337272)
> >> +++ head/sys/dev/nvme/nvme_qpair.c	Fri Aug 3 20:04:06 2018	(r337273)
> >> @@ -401,9 +401,13 @@ nvme_qpair_complete_tracker(struct nvme_qpair *qpair,
> >>  		req->retries++;
> >>  		nvme_qpair_submit_tracker(qpair, tr);
> >>  	} else {
> >> -		if (req->type != NVME_REQUEST_NULL)
> >> +		if (req->type != NVME_REQUEST_NULL) {
> >> +			bus_dmamap_sync(qpair->dma_tag_payload,
> >> +			    tr->payload_dma_map,
> >> +			    BUS_DMASYNC_POSTREAD | BUS_DMASYNC_POSTWRITE);
> >>  			bus_dmamap_unload(qpair->dma_tag_payload,
> >>  			    tr->payload_dma_map);
> >> +		}
> >>
> >>  		nvme_free_request(req);
> >>  		tr->req = NULL;
> >> @@ -487,6 +491,8 @@ nvme_qpair_process_completions(struct nvme_qpair *qpai
> >>  		 */
> >>  		return (false);
> >>
> >> +	bus_dmamap_sync(qpair->dma_tag, qpair->queuemem_map,
> >> +	    BUS_DMASYNC_POSTREAD | BUS_DMASYNC_POSTWRITE);
> >>  	while (1) {
> >>  		cpl = qpair->cpl[qpair->cq_head];
> >>
> >> @@ -828,7 +834,16 @@ nvme_qpair_submit_tracker(struct nvme_qpair *qpair, st
> >>  	if (++qpair->sq_tail == qpair->num_entries)
> >>  		qpair->sq_tail = 0;
> >>
> >> +	bus_dmamap_sync(qpair->dma_tag, qpair->queuemem_map,
> >> +	    BUS_DMASYNC_PREREAD | BUS_DMASYNC_PREWRITE);
> >> +#ifndef __powerpc__
> >> +	/*
> >> +	 * powerpc's bus_dmamap_sync() already includes a heavyweight sync, but
> >> +	 * no other archs do.
> >> +	 */
> >>  	wmb();
> >> +#endif
> > What is the purpose of this call?  It is useless without a paired read
> > barrier.  So where is the reciprocal rmb()?
>
> For DMA, the rmb is in the device controller.  However, architectures
> that need this kind of ordering should do it in their bus_dmamap_sync op,
> and this explicit one needs to be removed.  (Alpha had a wmb in its
> bus_dmamap_sync op for this reason.)

Yes, if something special is needed, it should happen in platform-specific
busdma code.
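To make that concrete, a minimal sketch of what "do it in the busdma code"
could look like; the function name is made up and the barrier choice is only
illustrative, not the actual powerpc implementation:

#include <sys/param.h>
#include <sys/systm.h>
#include <sys/bus.h>
#include <machine/bus.h>

/*
 * Hypothetical platform sync method (sketch only).  For cache-coherent
 * memory the PRE* ops reduce to publishing the CPU's descriptor writes
 * before the driver's subsequent MMIO doorbell write, so the barrier lives
 * here rather than as a raw wmb() in every driver.
 */
static void
example_bus_dmamap_sync(bus_dma_tag_t dmat, bus_dmamap_t map,
    bus_dmasync_op_t op)
{

	if (op & (BUS_DMASYNC_PREREAD | BUS_DMASYNC_PREWRITE)) {
		/* powerpc would issue its heavyweight "sync" here. */
		__asm __volatile("sync" : : : "memory");
	}
}

With something like this in place, nvme(4)'s submit path only has to call
bus_dmamap_sync() before writing the doorbell and never needs to know which
barrier the platform wants.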
Also, if wmb() is needed, then it is not the documented semantics of wmb()
that matters, but a specific side effect of one of the instructions in its
implementation.  As I noted, on x86 it is not needed and is detrimental to
performance.
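For illustration, a rough sketch of that point; example_wmb() is a made-up
name and the per-arch expansions are approximations, not the macros verbatim
from the tree:

/*
 * wmb() nominally orders CPU store vs. CPU store.  Whether it also makes a
 * DMA descriptor visible to the device is a side effect of the particular
 * instruction each architecture happens to use.
 */
#if defined(__amd64__)
/*
 * Stores to the coherent queue memory are already ordered ahead of the
 * subsequent uncacheable doorbell write, so an sfence here only adds
 * latency.
 */
#define	example_wmb()	__asm __volatile("sfence" : : : "memory")
#elif defined(__powerpc__)
/*
 * The heavyweight sync that powerpc's bus_dmamap_sync() already issues;
 * this instruction, not wmb()'s nominal store-store ordering, is what the
 * driver was actually relying on.
 */
#define	example_wmb()	__asm __volatile("sync" : : : "memory")
#endif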