On Tue, Jun 25, 2019 at 01:54:21PM -0600, Logan Gunthorpe wrote:
> Well whether it's dma_addr_t, phys_addr_t, pfn_t the result isn't all
> that different. You still need roughly the same 'if' hooks for any
> backed memory that isn't in the linear mapping and you can't get a
> kernel mapping for directly.
> 
> It wouldn't be too hard to do a similar patch set that uses something
> like phys_addr_t instead and have a request and queue flag for support
> of non-mappable memory. But you'll end up with very similar 'if' hooks
> and we'd have to clean up all bio-using drivers that access the struct
> pages directly.

We'll need to clean that mess up anyway, and I've been chugging
along doing some of that.  A lot still assume no highmem, so we need
to convert them over to something that kmaps anyway.  If we get
the abstraction right that will actually help converting over to
a better reprsentation.

> Though, we'd also still have the problem of how to recognize when the
> address points to P2PDMA and needs to be translated to the bus offset.
> The map-first inversion was what helped here because the driver
> submitting the requests had all the information. Though it could be
> another request flag and indicating non-mappable memory could be a flag
> group like REQ_NOMERGE_FLAGS -- REQ_NOMAP_FLAGS.

The assumes the request all has the same memory, which is a simplifing
assuption.  My idea was that if had our new bio_vec like this:

struct bio_vec {
        phys_addr_t             paddr; // 64-bit on 64-bit systems
        unsigned long           len;
};

we have a hole behind len where we could store flag.  Preferably
optionally based on a P2P or other magic memory types config
option so that 32-bit systems with 32-bit phys_addr_t actually
benefit from the smaller and better packing structure.

> If you think any of the above ideas sound workable I'd be happy to try
> to code up another prototype.

Іt sounds workable.  To some of the first steps are cleanups independent
of how the bio_vec is eventually going to look like.  That is making
the DMA-API internals work on the phys_addr_t, which also unifies the
map_resource implementation with map_page.  I plan to do that relatively
soon.  The next is sorting out access to bios data by virtual address.
All these need nice kmapping helper that avoid too much open coding.
I was going to look into that next, mostly to kill the block layer
bounce buffering code.  Similar things will also be needed at the
scatterlist level I think.  After that we need to more audits of
how bv_page is still used.  something like a bv_phys() helper that
does "page_to_phys(bv->bv_page) + bv->bv_offset" might come in handy
for example.

Reply via email to