Re: [perf-discuss] NUMA ptools and ISM segments

Jonathan Chew Tue, 04 Oct 2005 15:44:18 -0700

Marc Rocas wrote On 09/27/05 21:27,:
> 
> On 9/26/05, *jonathan chew* <[EMAIL PROTECTED]
> <mailto:[EMAIL PROTECTED]>> wrote:
> 
> 
>     There may be a slightly better way to allocate memory from the lgroup
>     containing the least significant physical memory using
>     lgrp_affinity_set(3LGRP), meminfo(2), and madvise(MADV_ACCESS_LWP).
> 
>     Unfortunately, none of these ways are guaranteed to always allocate
>     memory in the least significant 4 gigabytes of physical memory.  For
>     example, if the node with the least significant physical memory contains
>     more than 4 gigabytes of RAM, physical memory may come from this node,
>     but be above the least significant 4 gigabytes.  :-(
> 
> 
> Fortunately, we control the HW our systems are deployed on and can
> enforce a 4GB RAM limit if required.


Ok.  That definitely makes things easier.
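
For what it's worth, here is a minimal sketch of how an application could
check after the fact where a page actually landed, using meminfo(2) with
MEMINFO_VPHYSICAL. The helper names pa_range_below_4g() and page_below_4g()
are made up for illustration, the meminfo() call is only compiled on
Solaris, and the page has to be touched first so that it has a physical
address at all. It only reports placement; it does not influence it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#ifdef __sun
#include <sys/types.h>
#include <sys/mman.h>
#endif

#define FOUR_GIG (1ULL << 32)

/*
 * Hypothetical helper: returns 1 if the physical range [pa, pa + len)
 * lies entirely below 4 gigabytes, 0 otherwise (including len == 0).
 */
int pa_range_below_4g(uint64_t pa, uint64_t len)
{
    if (len == 0 || pa >= FOUR_GIG)
        return 0;
    return len <= FOUR_GIG - pa;
}

/*
 * Hypothetical helper: returns 1 if the page backing va is below
 * 4 gigabytes, 0 if it is not, and -1 if that cannot be determined
 * (page not resident, meminfo() failed, or not built on Solaris).
 * Pages are naturally aligned, so checking one byte covers the page.
 */
int page_below_4g(const void *va)
{
    assert(va != NULL);
#ifdef __sun
    uint64_t inaddr[1];
    uint_t   req[1] = { MEMINFO_VPHYSICAL };
    uint64_t out[1];
    uint_t   valid[1];

    inaddr[0] = (uint64_t)(uintptr_t)va;
    if (meminfo(inaddr, 1, req, 1, out, valid) != 0)
        return -1;
    /* bit 0: the address is valid; bit 1: the first request is valid */
    if ((valid[0] & 1) == 0 || (valid[0] & 2) == 0)
        return -1;
    return pa_range_below_4g(out[0], 1);
#else
    return -1;
#endif
}
```

With a 4GB RAM limit enforced on the hardware, this check should be
redundant, but it may still be useful as a sanity test during bring-up.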


>     I spoke to one of our I/O guys to see whether there is a prescribed way
>     to allocate physical memory in the least significant 4 gig for DMA from
>     userland.  Solaris doesn't provide an existing way.  The philosophy is
>     that users shouldn't have to care and that the I/O buffer will be copied
>     to/from the least significant physical memory inside the kernel if the
>     device can only reach that far to DMA.  I think that you may be able to
>     write a driver that can allocate physical memory in the desired range
>     and allow access to it with mmap(2) or ioctl(2).
> 
> 
> We already have such a driver but have not found a way to force the use
> of 2M pages!  Is there a new DDI interface to request a large page size?

I'm not sure, but can try to find out.

What does the driver use to allocate the memory (e.g., ddi_dma_mem_alloc(9F))?


>     I'd like to understand more about your situation to get a better idea of
>     the constraints and whether there is a viable solution or the need
>     for one.
> 
>     What is your device?  Can you afford to let the kernel copy the user's
>     buffer into another buffer which is below 4 gig to DMA to your device?
>     If not, would it make sense to write your own driver?
> 
> 
> Not really.  We buffer data up and need to have it DMA in real-time to
> our device which further processes it and passes on the processed data to
> another machine. The window of time once we have committed to delivering
> the data is strictly enforced and failure to do so effectively shuts
> down the other system. Way back in SunOS 5.4, we went as far as writing
> a pseudo driver to SOFTLOCK memory as we found that it was not enough to
> mlock() memory since we took page faults on the corresponding TTEs. As I
> noted previously in the beginning of this thread, we use our own version
> of physio() that assumes properly wired down memory and thus differs
> from the stock version in that it does not bother with the locking logic
> at all.


Ok.  I see.


> By writing our own device driver, do you mean one to export 4GB PA range
> to user-land or our own segment driver?

I mean whether your driver can allocate the DMA buffer in the right
place and provide access to it through mmap(2) or ioctl(2).  It sounds
like you have a driver like that already, but want the buffer to be on
large pages in the user application.

Is that right?
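
To make sure we are talking about the same thing, this is roughly the
user side I have in mind. It is only a sketch: map_dma_buffer() is a
made-up name, the device path is whatever node your driver creates, and
it assumes the driver exports the buffer at offset 0 through its mmap
entry point. Nothing here requests a page size; the mapping gets
whatever page size the driver and the kernel set up.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper: map len bytes of a driver-exported buffer into
 * the caller's address space.  Returns the mapping, or NULL on failure.
 * Assumes the driver exposes the buffer at offset 0 of its device node.
 */
void *map_dma_buffer(const char *path, size_t len)
{
    void *p;
    int fd;

    assert(path != NULL && len > 0);

    fd = open(path, O_RDWR);
    if (fd < 0)
        return NULL;

    /* MAP_SHARED so that stores reach the buffer the device sees */
    p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);

    /* The mapping stays valid after the descriptor is closed. */
    (void) close(fd);

    return (p == MAP_FAILED) ? NULL : p;
}
```

The caller releases the mapping with munmap(2) using the same length.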



Jonathan
