>> Yes, that's why I used 'significant'. One good thing is that given resources
>> it can easily be done in parallel with other development, and will give
>> additional
>> insight of some form.
>
>Yup, well if someone wants to start working on an emulated RDMA device
>that actually simulates prop
> My first reflex when reading this thread was to think that this whole domain
> lends it self excellently to testing via Qemu. Could it be that doing this in
> the opposite direction might be a safer approach in the long run even though
> (significant) more work up-front?
While the idea of QEM
On 25/04/17 12:30 AM, Knut Omang wrote:
> Yes, that's why I used 'significant'. One good thing is that given resources
> it can easily be done in parallel with other development, and will give
> additional
> insight of some form.
Yup, well if someone wants to start working on an emulated RDMA
On Mon, 2017-04-24 at 10:14 -0600, Logan Gunthorpe wrote:
>
> On 24/04/17 01:36 AM, Knut Omang wrote:
> > My first reflex when reading this thread was to think that this whole domain
> > lends it self excellently to testing via Qemu. Could it be that doing this
> > in
> > the opposite direction
On 24/04/17 01:36 AM, Knut Omang wrote:
> My first reflex when reading this thread was to think that this whole domain
> lends it self excellently to testing via Qemu. Could it be that doing this in
> the opposite direction might be a safer approach in the long run even though
> (significant) m
On Mon, 2017-04-17 at 08:31 +1000, Benjamin Herrenschmidt wrote:
> On Sun, 2017-04-16 at 10:34 -0600, Logan Gunthorpe wrote:
> >
> > On 16/04/17 09:53 AM, Dan Williams wrote:
> > > ZONE_DEVICE allows you to redirect via get_dev_pagemap() to retrieve
> > > context about the physical address in ques
On Thu, Apr 20, 2017 at 4:07 PM, Stephen Bates wrote:
>>> Yes, this makes sense I think we really just want to distinguish host
>>> memory or not in terms of the dev_pagemap type.
>>
>>> I would like to see mutually exclusive flags for host memory (or not) and
>>> persistence (or not).
>>>
>>
>>
>> Yes, this makes sense I think we really just want to distinguish host
>> memory or not in terms of the dev_pagemap type.
>
>> I would like to see mutually exclusive flags for host memory (or not) and
>> persistence (or not).
>>
>
> Why persistence? It has zero meaning to the mm.
I like the ide
On Thu, Apr 20, 2017 at 1:43 PM, Stephen Bates wrote:
>
>> Yes, this makes sense I think we really just want to distinguish host
>> memory or not in terms of the dev_pagemap type.
>
> I would like to see mutually exclusive flags for host memory (or not) and
> persistence (or not).
>
Why persist
> Yes, this makes sense I think we really just want to distinguish host
> memory or not in terms of the dev_pagemap type.
I would like to see mutually exclusive flags for host memory (or not) and
persistence (or not).
Stephen
On Wed, Apr 19, 2017 at 3:55 PM, Logan Gunthorpe wrote:
>
>
> On 19/04/17 02:48 PM, Jason Gunthorpe wrote:
>> On Wed, Apr 19, 2017 at 01:41:49PM -0600, Logan Gunthorpe wrote:
>>
But.. it could point to a GPU and the GPU struct device could have a
proxy dma_ops like Dan pointed out.
>>>
>
On 19/04/17 02:48 PM, Jason Gunthorpe wrote:
> On Wed, Apr 19, 2017 at 01:41:49PM -0600, Logan Gunthorpe wrote:
>
>>> But.. it could point to a GPU and the GPU struct device could have a
>>> proxy dma_ops like Dan pointed out.
>>
>> Seems a bit awkward to me that in order for the intended use ca
On Wed, Apr 19, 2017 at 01:41:49PM -0600, Logan Gunthorpe wrote:
> > But.. it could point to a GPU and the GPU struct device could have a
> > proxy dma_ops like Dan pointed out.
>
> Seems a bit awkward to me that in order for the intended use case, you
> have to proxy the dma_ops. I'd probably st
On 19/04/17 01:31 PM, Jason Gunthorpe wrote:
> Try it with VT-D turned on. It shouldn't work or there is a notable
> security hole in your platform..
Ah, ok.
>>> const struct dma_map_ops *comp_ops = get_dma_ops(completer);
>>> const struct dma_map_ops *init_ops = get_dma_ops(initiator
On Wed, Apr 19, 2017 at 01:02:49PM -0600, Logan Gunthorpe wrote:
>
>
> On 19/04/17 12:32 PM, Jason Gunthorpe wrote:
> > On Wed, Apr 19, 2017 at 12:01:39PM -0600, Logan Gunthorpe wrote:
> > Not entirely, it would have to call through the whole process
> > including the arch_p2p_cross_segment()..
>
On 19/04/17 12:32 PM, Jason Gunthorpe wrote:
> On Wed, Apr 19, 2017 at 12:01:39PM -0600, Logan Gunthorpe wrote:
> Not entirely, it would have to call through the whole process
> including the arch_p2p_cross_segment()..
Hmm, yes. Though it's still not clear what, if anything,
arch_p2p_cross_segme
On Wed, Apr 19, 2017 at 11:41 AM, Logan Gunthorpe wrote:
>
>
> On 19/04/17 12:30 PM, Dan Williams wrote:
>> Letting others users do the container_of() arrangement means that
>> struct page_map needs to become public and move into struct
>> dev_pagemap directly.
>
> Ah, yes, I got a bit turned arou
On 19/04/17 12:30 PM, Dan Williams wrote:
> Letting others users do the container_of() arrangement means that
> struct page_map needs to become public and move into struct
> dev_pagemap directly.
Ah, yes, I got a bit turned around by that and failed to notice that
page_map and dev_pagemap are di
On Wed, Apr 19, 2017 at 12:01:39PM -0600, Logan Gunthorpe wrote:
> I'm just spit balling here but if HMM wanted to use unaddressable memory
> as a DMA target, it could set that function to create a window ine gpu
> memory, then call the pci_p2p_same_segment and return the result as the
> dma addre
On Wed, Apr 19, 2017 at 11:19 AM, Logan Gunthorpe wrote:
>
>
> On 19/04/17 12:11 PM, Logan Gunthorpe wrote:
>>
>>
>> On 19/04/17 11:41 AM, Dan Williams wrote:
>>> No, not quite ;-). I still don't think we should require the non-HMM
>>> to pass NULL for all the HMM arguments. What I like about Loga
On 19/04/17 12:11 PM, Logan Gunthorpe wrote:
>
>
> On 19/04/17 11:41 AM, Dan Williams wrote:
>> No, not quite ;-). I still don't think we should require the non-HMM
>> to pass NULL for all the HMM arguments. What I like about Logan's
>> proposal is to have a separate create and register steps d
On 19/04/17 11:41 AM, Dan Williams wrote:
> No, not quite ;-). I still don't think we should require the non-HMM
> to pass NULL for all the HMM arguments. What I like about Logan's
> proposal is to have a separate create and register steps dev_pagemap.
> That way call paths that don't care about
On 19/04/17 11:14 AM, Jason Gunthorpe wrote:
> I don't see a use for the dma_map function pointer at this point..
Yes, it is kind of like designing for the future. I just find it a
little odd calling the pci functions in the iommu.
> It doesn't make alot of sense for the completor of the DMA t
On Wed, Apr 19, 2017 at 10:32 AM, Jerome Glisse wrote:
> On Wed, Apr 19, 2017 at 10:01:23AM -0700, Dan Williams wrote:
>> On Wed, Apr 19, 2017 at 9:48 AM, Logan Gunthorpe wrote:
>> >
>> >
>> > On 19/04/17 09:55 AM, Jason Gunthorpe wrote:
>> >> I was thinking only this one would be supported with
On Wed, Apr 19, 2017 at 10:01:23AM -0700, Dan Williams wrote:
> On Wed, Apr 19, 2017 at 9:48 AM, Logan Gunthorpe wrote:
> >
> >
> > On 19/04/17 09:55 AM, Jason Gunthorpe wrote:
> >> I was thinking only this one would be supported with a core code
> >> helper..
> >
> > Pivoting slightly: I was look
On Wed, Apr 19, 2017 at 10:48:51AM -0600, Logan Gunthorpe wrote:
> The pci_enable_p2p_bar function would then just need to call
> devm_memremap_pages with the dma_map callback set to a function that
> does the segment check and the offset calculation.
I don't see a use for the dma_map function poi
On Wed, Apr 19, 2017 at 9:48 AM, Logan Gunthorpe wrote:
>
>
> On 19/04/17 09:55 AM, Jason Gunthorpe wrote:
>> I was thinking only this one would be supported with a core code
>> helper..
>
> Pivoting slightly: I was looking at how HMM uses ZONE_DEVICE. They add a
> type flag to the dev_pagemap str
On 19/04/17 09:55 AM, Jason Gunthorpe wrote:
> I was thinking only this one would be supported with a core code
> helper..
Pivoting slightly: I was looking at how HMM uses ZONE_DEVICE. They add a
type flag to the dev_pagemap structure which would be very useful to us.
We could add another MEMORY
On Wed, Apr 19, 2017 at 11:20:06AM +1000, Benjamin Herrenschmidt wrote:
> That helper wouldn't perform the actual iommu mapping. It would simply
> return something along the lines of:
>
> - "use that alternate bus address and don't map in the iommu"
I was thinking only this one would be support
On Tue, 2017-04-18 at 16:24 -0600, Jason Gunthorpe wrote:
> Basically, all this list processing is a huge overhead compared to
> just putting a helper call in the existing sg iteration loop of the
> actual op. Particularly if the actual op is a no-op like no-mmu x86
> would use.
Yes, I'm leaning
On Tue, 2017-04-18 at 17:21 -0600, Jason Gunthorpe wrote:
> Splitting the sgl is different from iommu batching.
>
> As an example, an O_DIRECT write of 1 MB with a single 4K P2P page in
> the middle.
>
> The optimum behavior is to allocate a 1MB-4K iommu range and fill it
> with the CPU memory. T
On Tue, 2017-04-18 at 15:22 -0600, Jason Gunthorpe wrote:
> On Tue, Apr 18, 2017 at 02:11:33PM -0700, Dan Williams wrote:
> > > I think this opens an even bigger can of worms..
> >
> > No, I don't think it does. You'd only shim when the target page is
> > backed by a device, not host memory, and y
On Tue, 2017-04-18 at 15:03 -0600, Jason Gunthorpe wrote:
> I don't follow, when does get_dma_ops() return a p2p aware provider?
> It has no way to know if the DMA is going to involve p2p, get_dma_ops
> is called with the device initiating the DMA.
>
> So you'd always return the P2P shim on a syst
On Tue, 2017-04-18 at 14:48 -0600, Logan Gunthorpe wrote:
> > ...and that dma_map goes through get_dma_ops(), so I don't see the conflict?
>
> The main conflict is in dma_map_sg which only does get_dma_ops once but
> the sg may contain memory of different types.
We can handle that in our "overrid
On Tue, 2017-04-18 at 12:00 -0600, Jason Gunthorpe wrote:
> - All platforms can succeed if the PCI devices are under the same
> 'segment', but where segments begin is somewhat platform specific
> knowledge. (this is 'same switch' idea Logan has talked about)
We also need to be careful whether
On Tue, 2017-04-18 at 10:27 -0700, Dan Williams wrote:
> > FWIW, RDMA probably wouldn't want to use a p2mem device either, we
> > already have APIs that map BAR memory to user space, and would like to
> > keep using them. A 'enable P2P for bar' helper function sounds better
> > to me.
>
> ...and I
On Tue, Apr 18, 2017 at 03:51:27PM -0700, Dan Williams wrote:
> > This really seems like much less trouble than trying to wrapper all
> > the arch's dma ops, and doesn't have the wonky restrictions.
>
> I don't think the root bus iommu drivers have any business knowing or
> caring about dma happen
On 18/04/17 04:24 PM, Jason Gunthorpe wrote:
> Try and write a stacked map_sg function like you describe and you will
> see how horrible it quickly becomes.
Yes, unfortunately, I have to agree with this statement completely.
> Since dma mapping is a performance path we must be careful not to
>
On Tue, Apr 18, 2017 at 3:56 PM, Logan Gunthorpe wrote:
>
>
> On 18/04/17 04:50 PM, Dan Williams wrote:
>> On Tue, Apr 18, 2017 at 3:48 PM, Logan Gunthorpe wrote:
>>>
>>>
>>> On 18/04/17 04:28 PM, Dan Williams wrote:
Unlike the pci bus address offset case which I think is fundamental to
On 18/04/17 04:50 PM, Dan Williams wrote:
> On Tue, Apr 18, 2017 at 3:48 PM, Logan Gunthorpe wrote:
>>
>>
>> On 18/04/17 04:28 PM, Dan Williams wrote:
>>> Unlike the pci bus address offset case which I think is fundamental to
>>> support since shipping archs do this today, I think it is ok to sa
On Tue, Apr 18, 2017 at 3:46 PM, Benjamin Herrenschmidt
wrote:
> On Tue, 2017-04-18 at 10:27 -0700, Dan Williams wrote:
>> > FWIW, RDMA probably wouldn't want to use a p2mem device either, we
>> > already have APIs that map BAR memory to user space, and would like to
>> > keep using them. A 'enabl
On Tue, Apr 18, 2017 at 3:42 PM, Jason Gunthorpe
wrote:
> On Tue, Apr 18, 2017 at 03:28:17PM -0700, Dan Williams wrote:
>
>> Unlike the pci bus address offset case which I think is fundamental to
>> support since shipping archs do this toda
>
> But we can support this by modifying those arch's uni
On Tue, Apr 18, 2017 at 3:48 PM, Logan Gunthorpe wrote:
>
>
> On 18/04/17 04:28 PM, Dan Williams wrote:
>> Unlike the pci bus address offset case which I think is fundamental to
>> support since shipping archs do this today, I think it is ok to say
>> p2p is restricted to a single sgl that gets to
On 18/04/17 04:28 PM, Dan Williams wrote:
> Unlike the pci bus address offset case which I think is fundamental to
> support since shipping archs do this today, I think it is ok to say
> p2p is restricted to a single sgl that gets to talk to host memory or
> a single device. That said, what's wro
On Tue, Apr 18, 2017 at 03:28:17PM -0700, Dan Williams wrote:
> Unlike the pci bus address offset case which I think is fundamental to
> support since shipping archs do this toda
But we can support this by modifying those arch's unique dma_ops
directly.
Eg as I explained, my p2p_same_segment_map
On Tue, Apr 18, 2017 at 3:15 PM, Logan Gunthorpe wrote:
>
>
> On 18/04/17 03:36 PM, Dan Williams wrote:
>> On Tue, Apr 18, 2017 at 2:22 PM, Jason Gunthorpe
>> wrote:
>>> On Tue, Apr 18, 2017 at 02:11:33PM -0700, Dan Williams wrote:
> I think this opens an even bigger can of worms..
On Tue, Apr 18, 2017 at 03:31:58PM -0600, Logan Gunthorpe wrote:
> 1) It means that sg_has_p2p has to walk the entire sg and check every
> page. Then map_sg_p2p/map_sg has to walk it again and repeat the check
> then do some operation per page. If anyone is concerned about the
> dma_map performanc
On 18/04/17 03:36 PM, Dan Williams wrote:
> On Tue, Apr 18, 2017 at 2:22 PM, Jason Gunthorpe
> wrote:
>> On Tue, Apr 18, 2017 at 02:11:33PM -0700, Dan Williams wrote:
I think this opens an even bigger can of worms..
>>>
>>> No, I don't think it does. You'd only shim when the target page is
On Tue, Apr 18, 2017 at 2:22 PM, Jason Gunthorpe
wrote:
> On Tue, Apr 18, 2017 at 02:11:33PM -0700, Dan Williams wrote:
>> > I think this opens an even bigger can of worms..
>>
>> No, I don't think it does. You'd only shim when the target page is
>> backed by a device, not host memory, and you can
On 18/04/17 03:03 PM, Jason Gunthorpe wrote:
> What about something more incremental like this instead:
> - dma_ops will set map_sg_p2p == map_sg when they are updated to
> support p2p, otherwise DMA on P2P pages will fail for those ops.
> - When all ops support p2p we remove the if and ops->ma
On Tue, Apr 18, 2017 at 02:11:33PM -0700, Dan Williams wrote:
> > I think this opens an even bigger can of worms..
>
> No, I don't think it does. You'd only shim when the target page is
> backed by a device, not host memory, and you can figure this out by a
> is_zone_device_page()-style lookup.
T
On Tue, Apr 18, 2017 at 2:03 PM, Jason Gunthorpe
wrote:
> On Tue, Apr 18, 2017 at 12:48:35PM -0700, Dan Williams wrote:
>
>> > Yes, I noticed this problem too and that makes sense. It just means
>> > every dma_ops will probably need to be modified to either support p2p
>> > pages or fail on them.
On Tue, Apr 18, 2017 at 12:48:35PM -0700, Dan Williams wrote:
> > Yes, I noticed this problem too and that makes sense. It just means
> > every dma_ops will probably need to be modified to either support p2p
> > pages or fail on them. Though, the only real difficulty there is that it
> > will be a
On 18/04/17 02:31 PM, Dan Williams wrote:
> On Tue, Apr 18, 2017 at 1:29 PM, Jerome Glisse wrote:
>>> On Tue, Apr 18, 2017 at 12:35 PM, Logan Gunthorpe
>>> wrote:
On 18/04/17 01:01 PM, Jason Gunthorpe wrote:
> Ultimately every dma_ops will need special code to support P2P wit
On Tue, Apr 18, 2017 at 1:29 PM, Jerome Glisse wrote:
>> On Tue, Apr 18, 2017 at 12:35 PM, Logan Gunthorpe
>> wrote:
>> >
>> >
>> > On 18/04/17 01:01 PM, Jason Gunthorpe wrote:
>> >> Ultimately every dma_ops will need special code to support P2P with
>> >> the special hardware that ops is control
> On Tue, Apr 18, 2017 at 12:35 PM, Logan Gunthorpe
> wrote:
> >
> >
> > On 18/04/17 01:01 PM, Jason Gunthorpe wrote:
> >> Ultimately every dma_ops will need special code to support P2P with
> >> the special hardware that ops is controlling, so it makes some sense
> >> to start by pushing the chec
On 18/04/17 01:48 PM, Jason Gunthorpe wrote:
> I think this is why progress on this keeps getting stuck - every
> solution is a lot of work.
Yup! There's also a ton of work just to get the iomem safety issues
addressed. Let alone the dma mapping issues.
> You could try to do a dummy mapping / c
On Tue, Apr 18, 2017 at 01:35:32PM -0600, Logan Gunthorpe wrote:
> > Ultimately every dma_ops will need special code to support P2P with
> > the special hardware that ops is controlling, so it makes some sense
> > to start by pushing the check down there in the first place. This
> > advice is part
On Tue, Apr 18, 2017 at 12:35 PM, Logan Gunthorpe wrote:
>
>
> On 18/04/17 01:01 PM, Jason Gunthorpe wrote:
>> Ultimately every dma_ops will need special code to support P2P with
>> the special hardware that ops is controlling, so it makes some sense
>> to start by pushing the check down there in
On 18/04/17 01:01 PM, Jason Gunthorpe wrote:
> Ultimately every dma_ops will need special code to support P2P with
> the special hardware that ops is controlling, so it makes some sense
> to start by pushing the check down there in the first place. This
> advice is partially motivated by how dma_
On Tue, Apr 18, 2017 at 12:30:59PM -0600, Logan Gunthorpe wrote:
> > - The dma ops provider must be able to tell if source memory is bar
> >mapped and recover the pci device backing the mapping.
>
> Do you mean to say that every dma-ops provider needs to be taught about
> p2p backed pages? I
On Tue, Apr 18, 2017 at 11:00 AM, Jason Gunthorpe
wrote:
> On Tue, Apr 18, 2017 at 10:27:47AM -0700, Dan Williams wrote:
>> > FWIW, RDMA probably wouldn't want to use a p2mem device either, we
>> > already have APIs that map BAR memory to user space, and would like to
>> > keep using them. A 'enab
On 18/04/17 10:45 AM, Jason Gunthorpe wrote:
> From Ben's comments, I would think that the 'first class' support that
> is needed here is simply a function to return the 'struct device'
> backing a CPU address range.
Yes, and Dan's get_dev_pagemap suggestion gets us 90% of the way there.
It's ju
On Tue, Apr 18, 2017 at 10:27:47AM -0700, Dan Williams wrote:
> > FWIW, RDMA probably wouldn't want to use a p2mem device either, we
> > already have APIs that map BAR memory to user space, and would like to
> > keep using them. A 'enable P2P for bar' helper function sounds better
> > to me.
>
> .
On Tue, Apr 18, 2017 at 9:45 AM, Jason Gunthorpe
wrote:
> On Mon, Apr 17, 2017 at 08:23:16AM +1000, Benjamin Herrenschmidt wrote:
>
>> Thanks :-) There's a reason why I'm insisting on this. We have constant
>> requests for this today. We have hacks in the GPU drivers to do it for
>> GPUs behind a
On Mon, Apr 17, 2017 at 08:23:16AM +1000, Benjamin Herrenschmidt wrote:
> Thanks :-) There's a reason why I'm insisting on this. We have constant
> requests for this today. We have hacks in the GPU drivers to do it for
> GPUs behind a switch, but those are just that, ad-hoc hacks in the
> drivers
On Mon, 2017-04-17 at 23:43 -0600, Logan Gunthorpe wrote:
>
> On 17/04/17 03:11 PM, Benjamin Herrenschmidt wrote:
> > Is it ? Again, you create a "concept" the user may have no idea about,
> > "p2pmem memory". So now any kind of memory buffer on a device can could
> > be use for p2p but also poten
On 17/04/17 12:04 PM, Jerome Glisse wrote:
> I disagree here. I would rather see Peer-to-Peer mapping as a form
> of helper so that device driver can opt-in for multiple mecanisms
> concurrently. Like HMM and p2p.
I'm not against moving some of the common stuff into a library. It
sounds like the
On 17/04/17 03:11 PM, Benjamin Herrenschmidt wrote:
> Is it ? Again, you create a "concept" the user may have no idea about,
> "p2pmem memory". So now any kind of memory buffer on a device can could
> be use for p2p but also potentially a bunch of other things becomes
> special and called "p2pmem
On 17/04/17 11:04 AM, Dan Williams wrote:
>> Yes, in this scheme, it needs an additional p2pmem child. Why is that an
>> issue? It certainly makes it a lot easier for the user to understand the
>> p2pmem memory in the system (through the sysfs tree) and reason about
>> the topology and when to us
On Mon, 2017-04-17 at 10:52 -0600, Logan Gunthorpe wrote:
>
> On 17/04/17 01:20 AM, Benjamin Herrenschmidt wrote:
> > But is it ? For example take a GPU, does it, in your scheme, need an
> > additional "p2pmem" child ? Why can't the GPU driver just use some
> > helper to instantiate the necessary
On Mon, Apr 17, 2017 at 10:52:29AM -0600, Logan Gunthorpe wrote:
>
>
> On 17/04/17 01:20 AM, Benjamin Herrenschmidt wrote:
> > But is it ? For example take a GPU, does it, in your scheme, need an
> > additional "p2pmem" child ? Why can't the GPU driver just use some
> > helper to instantiate the
On Mon, Apr 17, 2017 at 9:52 AM, Logan Gunthorpe wrote:
>
>
> On 17/04/17 01:20 AM, Benjamin Herrenschmidt wrote:
>> But is it ? For example take a GPU, does it, in your scheme, need an
>> additional "p2pmem" child ? Why can't the GPU driver just use some
>> helper to instantiate the necessary str
On 17/04/17 01:20 AM, Benjamin Herrenschmidt wrote:
> But is it ? For example take a GPU, does it, in your scheme, need an
> additional "p2pmem" child ? Why can't the GPU driver just use some
> helper to instantiate the necessary struct pages ? What does having an
> actual "struct device" child b
On Sun, 2017-04-16 at 23:13 -0600, Logan Gunthorpe wrote:
>
> >
> > I'm still not 100% why do you need a "p2mem device" mind you ...
>
> Well, you don't "need" it but it is a design choice that I think makes a
> lot of sense for the following reasons:
>
> 1) p2pmem is in fact a device on the pc
On 16/04/17 04:32 PM, Benjamin Herrenschmidt wrote:
>> I'll consider this. Given the fact I can use your existing
>> get_dev_pagemap infrastructure to look up the p2pmem device this
>> probably isn't as hard as I thought it would be anyway (we probably
>> don't even need a page flag). We'd just h
On Sun, 2017-04-16 at 10:34 -0600, Logan Gunthorpe wrote:
>
> On 16/04/17 09:53 AM, Dan Williams wrote:
> > ZONE_DEVICE allows you to redirect via get_dev_pagemap() to retrieve
> > context about the physical address in question. I'm thinking you can
> > hang bus address translation data off of tha
On Sun, 2017-04-16 at 10:47 -0600, Logan Gunthorpe wrote:
> > I think you need to give other archs a chance to support this with a
> > design that considers the offset case as a first class citizen rather
> > than an afterthought.
>
> I'll consider this. Given the fact I can use your existing
> ge
On Sun, 2017-04-16 at 08:53 -0700, Dan Williams wrote:
> > Just thinking out loud ... I don't have a firm idea or a design. But
> > peer to peer is definitely a problem we need to tackle generically, the
> > demand for it keeps coming up.
>
> ZONE_DEVICE allows you to redirect via get_dev_pagemap(
On Sun, 2017-04-16 at 08:44 -0700, Dan Williams wrote:
> The difference is that there was nothing fundamental in the core
> design of pmem + DAX that prevented other archs from growing pmem
> support.
Indeed. In fact we have work in progress support for pmem on power
using experimental HW.
> THP
On 16/04/17 09:44 AM, Dan Williams wrote:
> I think we very much want the dma mapping layer to be in the way.
> It's the only sane semantic we have to communicate this translation.
Yes, I wasn't proposing bypassing that layer, per say. I just meant that
the layer would, in the end, have to retur
On 16/04/17 09:53 AM, Dan Williams wrote:
> ZONE_DEVICE allows you to redirect via get_dev_pagemap() to retrieve
> context about the physical address in question. I'm thinking you can
> hang bus address translation data off of that structure. This seems
> vaguely similar to what HMM is doing.
Th
On Sat, Apr 15, 2017 at 8:01 PM, Benjamin Herrenschmidt
wrote:
> On Sat, 2017-04-15 at 15:09 -0700, Dan Williams wrote:
>> I'm wondering, since this is limited to support behind a single
>> switch, if you could have a software-iommu hanging off that switch
>> device object that knows how to catch
On Sat, Apr 15, 2017 at 10:36 PM, Logan Gunthorpe wrote:
>
>
> On 15/04/17 04:17 PM, Benjamin Herrenschmidt wrote:
>> You can't. If the iommu is on, everything is remapped. Or do you mean
>> to have dma_map_* not do a remapping ?
>
> Well, yes, you'd have to change the code so that iomem pages do
On 15/04/17 04:17 PM, Benjamin Herrenschmidt wrote:
> You can't. If the iommu is on, everything is remapped. Or do you mean
> to have dma_map_* not do a remapping ?
Well, yes, you'd have to change the code so that iomem pages do not get
remapped and the raw BAR address is passed to the DMA engin
On 15/04/17 09:01 PM, Benjamin Herrenschmidt wrote:
> Are ZONE_DEVICE pages identifiable based on the struct page alone ? (a
> flag ?)
Well you can't use ZONE_DEVICE as an indicator. They may be regular RAM,
(eg. pmem). It would need a separate flag indicating it is backed by iomem.
Logan
On Sat, 2017-04-15 at 15:09 -0700, Dan Williams wrote:
> I'm wondering, since this is limited to support behind a single
> switch, if you could have a software-iommu hanging off that switch
> device object that knows how to catch and translate the non-zero
> offset bus address case. We have somethi
On Sat, 2017-04-15 at 11:41 -0600, Logan Gunthorpe wrote:
> Thanks, Benjamin, for the summary of some of the issues.
>
> On 14/04/17 04:07 PM, Benjamin Herrenschmidt wrote
> > So I assume the p2p code provides a way to address that too via special
> > dma_ops ? Or wrappers ?
>
> Not at this time.
On Sat, Apr 15, 2017 at 10:41 AM, Logan Gunthorpe wrote:
> Thanks, Benjamin, for the summary of some of the issues.
>
> On 14/04/17 04:07 PM, Benjamin Herrenschmidt wrote
>> So I assume the p2p code provides a way to address that too via special
>> dma_ops ? Or wrappers ?
>
> Not at this time. We
Thanks, Benjamin, for the summary of some of the issues.
On 14/04/17 04:07 PM, Benjamin Herrenschmidt wrote
> So I assume the p2p code provides a way to address that too via special
> dma_ops ? Or wrappers ?
Not at this time. We will probably need a way to ensure the iommus do
not attempt to rema
On Fri, 2017-04-14 at 14:04 -0500, Bjorn Helgaas wrote:
> I'm a little hesitant about excluding offset support, so I'd like to
> hear more about this.
>
> Is the issue related to PCI BARs that are not completely addressable
> by the CPU? If so, that sounds like a first-class issue that should
> b
On Fri, Apr 14, 2017 at 11:30:14AM -0600, Logan Gunthorpe wrote:
> On 14/04/17 05:37 AM, Benjamin Herrenschmidt wrote:
> > I object to designing a subsystem that by design cannot work on whole
> > categories of architectures out there.
>
> Hardly. That's extreme. We'd design a subsystem that works
On 14/04/17 05:37 AM, Benjamin Herrenschmidt wrote:
> I object to designing a subsystem that by design cannot work on whole
> categories of architectures out there.
Hardly. That's extreme. We'd design a subsystem that works for the easy
cases and needs more work to support the offset cases. It w
On Thu, 2017-04-13 at 22:40 -0600, Logan Gunthorpe wrote:
>
> On 13/04/17 10:16 PM, Jason Gunthorpe wrote:
> > I'd suggest just detecting if there is any translation in bus
> > addresses anywhere and just hard disabling P2P on such systems.
>
> That's a fantastic suggestion. It simplifies things
On Fri, 2017-04-14 at 21:37 +1000, Benjamin Herrenschmidt wrote:
> On Thu, 2017-04-13 at 22:40 -0600, Logan Gunthorpe wrote:
> >
> > On 13/04/17 10:16 PM, Jason Gunthorpe wrote:
> > > I'd suggest just detecting if there is any translation in bus
> > > addresses anywhere and just hard disabling P2P
On Thu, 2017-04-13 at 22:16 -0600, Jason Gunthorpe wrote:
> > Any caller of pci_add_resource_offset() uses CPU addresses different from
> > the PCI bus addresses (unless the offset is zero, of course). All ACPI
> > platforms also support this translation (see "translation_offset"), though
> > in m
On 13/04/17 10:16 PM, Jason Gunthorpe wrote:
> I'd suggest just detecting if there is any translation in bus
> addresses anywhere and just hard disabling P2P on such systems.
That's a fantastic suggestion. It simplifies things significantly.
Unless there are any significant objections I think I
On Thu, Apr 13, 2017 at 06:26:31PM -0500, Bjorn Helgaas wrote:
> > Ah, thanks for the tip! On my system, this translation returns the same
> > address so it was not necessary. And, yes, that means this would have to
> > find its way into the dma mapping routine somehow. This means we'll
> > eventua
On Thu, Apr 13, 2017 at 03:22:06PM -0600, Logan Gunthorpe wrote:
>
>
> On 12/04/17 03:55 PM, Benjamin Herrenschmidt wrote:
> > Look at pcibios_resource_to_bus() and pcibios_bus_to_resource(). They
> > will perform the conversion between the struct resource content (CPU
> > physical address) and t
On Thu, 2017-04-13 at 15:22 -0600, Logan Gunthorpe wrote:
>
> On 12/04/17 03:55 PM, Benjamin Herrenschmidt wrote:
> > Look at pcibios_resource_to_bus() and pcibios_bus_to_resource(). They
> > will perform the conversion between the struct resource content (CPU
> > physical address) and the actual
1 - 100 of 133 matches
Mail list logo