application needs fast access to physical memory

2010-11-17 Thread steven . lin


My application needs a fast way to access a specific physical DDR memory
region. The application runs on an MPC8548 PowerPC, which has an MMU. I've
tried two approaches that are typical for Linux: mmap(), and a kernel
module that implements read()/write() into this region. Performance is very
slow for both, a couple of orders of magnitude slower than, for example,
copying a large buffer from one place in the application's virtual memory
to another.

Steve Lin
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev
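
The baseline named in the message above (copying a large buffer within the
application's own virtual memory) can be measured with a small harness like
the following. This is only a sketch: the function names, the MB/s unit, and
the choice of CLOCK_MONOTONIC are assumptions made for illustration, not
anything from the original post, and the empty asm statement is a GCC/Clang
extension used as a compiler barrier.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Copy len bytes from src to dst `passes` times and return the average
 * throughput in MB/s (1 MB = 1e6 bytes), or -1.0 if timing failed. */
double copy_throughput_mb_s(void *dst, const void *src,
                            size_t len, unsigned passes)
{
    struct timespec t0, t1;
    double secs;

    assert(dst != NULL && src != NULL && len > 0 && passes > 0);

    if (clock_gettime(CLOCK_MONOTONIC, &t0) != 0)
        return -1.0;
    for (unsigned i = 0; i < passes; i++) {
        memcpy(dst, src, len);
        /* Compiler barrier (GCC/Clang extension) so that repeated
         * copies are not optimized away. */
        __asm__ __volatile__("" : : "r"(dst) : "memory");
    }
    if (clock_gettime(CLOCK_MONOTONIC, &t1) != 0)
        return -1.0;

    secs = (double)(t1.tv_sec - t0.tv_sec) +
           (double)(t1.tv_nsec - t0.tv_nsec) / 1e9;
    if (secs <= 0.0)
        return -1.0;    /* interval too short for the clock resolution */
    return ((double)len * passes / 1e6) / secs;
}

/* Baseline: copy between two ordinary heap buffers of len bytes. */
double baseline_copy_mb_s(size_t len, unsigned passes)
{
    unsigned char *src = malloc(len);
    unsigned char *dst = malloc(len);
    double rate = -1.0;

    if (src != NULL && dst != NULL) {
        /* Touch both buffers first so page faults are not timed. */
        memset(src, 0xA5, len);
        memset(dst, 0, len);
        rate = copy_throughput_mb_s(dst, src, len, passes);
    }
    free(src);
    free(dst);
    return rate;
}
```

Calling copy_throughput_mb_s() with the pointer returned by mmap() as dst,
and comparing the result against baseline_copy_mb_s() for the same length,
would put a number on the "couple of orders of magnitude" difference.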

Re: application needs fast access to physical memory

2010-11-18 Thread steven . lin
Thanks for the replies.

In the Linux Device Drivers book regarding mmap(), it states:

   Mapping a device means associating a range of user-space addresses to
   device memory. Whenever the program reads or writes in the assigned
   address range, it is actually accessing the device. In the X server
   example, using mmap allows quick and easy access to the video card’s
   memory. For a performance-critical application like this, direct
   access makes a large difference.

For whatever reason, mmap() is definitely not quick and does not appear to
be a direct access to device memory. After the application completes a
large write into physical memory (via the pointer returned from mmap()),
the application performs an ioctl() to query whether the data actually
arrived into the memory region. It seems to take some time before the
associated kernel module actually "sees" the data in the physical memory
region.

There are a few things I should say about this memory region. There's a total
of 512 MB of physical memory. U-Boot passes "mem=256M" as a kernel
parameter to tell Linux to only directly manage the lower 256 MB. The
special region of physical memory that the application is trying to access
is the upper 256 MB of memory not directly managed by Linux. The mmap()
call from the application is:
   *memptr = (void *) mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED,
                           _fdTerAlloc, (off_t) 0x1000);

On the kernel module side, the function handling the mmap() file operation
is:
   static int ter_alloc_mmap(struct file *pFile, struct vm_area_struct *vma)
   {
       if (remap_pfn_range(vma, vma->vm_start, vma->vm_pgoff,
                           vma->vm_end - vma->vm_start, vma->vm_page_prot))
           return -EAGAIN;

       vma->vm_ops = &ter_alloc_remap_vm_ops;
       ter_alloc_vma_open(vma);
       return 0;
   }

-Steve Lin
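
A note on the offset arithmetic in the two snippets above: the kernel
records the mmap() byte offset in vma->vm_pgoff in units of pages, and the
handler passes that value to remap_pfn_range() as the starting page frame
number. The sketch below reproduces the conversion in plain user-space C.
The 4 KiB page size, the helper names, and the idea that the reserved
region starts at physical address 0x10000000 (256 MB, assuming DDR is based
at physical address 0) are illustrative assumptions, not details taken from
the post.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 4 KiB pages, i.e. a page shift of 12. */
#define SKETCH_PAGE_SHIFT 12u
#define SKETCH_PAGE_SIZE  ((uint64_t)1 << SKETCH_PAGE_SHIFT)

/* Page offset that would be recorded in vma->vm_pgoff for a given
 * mmap() byte offset. mmap() rejects offsets that are not page-aligned,
 * so the sketch asserts alignment. */
uint64_t offset_to_pgoff(uint64_t byte_offset)
{
    assert((byte_offset & (SKETCH_PAGE_SIZE - 1)) == 0);
    return byte_offset >> SKETCH_PAGE_SHIFT;
}

/* Physical address at which the mapping starts when a handler passes
 * vm_pgoff unchanged as the page frame number. */
uint64_t pgoff_to_phys(uint64_t pgoff)
{
    return pgoff << SKETCH_PAGE_SHIFT;
}
```

Under these assumptions, the offset 0x1000 from the mmap() call becomes
page frame 1, which corresponds to physical address 0x1000. A mapping that
starts at the 256 MB boundary would need page frame 0x10000, either
supplied by the caller as offset 0x10000000 or added by the driver. If the
handler shown is a simplified excerpt, the real driver may already apply
such a base.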




   
From: David Gibson
To: Michael Ellerman
Date: 11/18/2010 06:54 AM
Cc: steven@teradyne.com, steven_...@notes.teradyne.com,
    linuxppc-dev@lists.ozlabs.org
Subject: Re: application needs fast access to physical memory

On Thu, Nov 18, 2010 at 11:24:22PM +1100, Michael Ellerman wrote:
> On Wed, 2010-11-17 at 16:03 -0600, steven@teradyne.com wrote:
> > My application needs a fast way to access a specific physical DDR
> > memory region. The application runs on an MPC8548 PowerPC which has an
> > MMU. I've tried two approaches that are typical for Linux, mmap() and
> > using a kernel module that implements read()/write() into this region
> > and I'm finding that performance is very slow for both. It's a couple
> > orders of magnitude slower than, for example, copying a large buffer
> > from one place in the application's virtual memory to another place in
> > the application's virtual memory.
>
> The mmap() version should basically run at "full speed", at least once
> you've faulted the address range in.
>
> This specific DDR region isn't specifically slow is it ? :)

The other theory that springs to mind: is whatever method you're using
to access the region enabling caching?

--
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson

Re: application needs fast access to physical memory

2010-11-18 Thread steven . lin
Hello Scott,

Do you know whether this patch is necessary if I were to use
alloc_bootmem() (to set aside a region of contiguous physical memory)
instead of the kernel parameter "mem=256M"?

-Steve Lin





   
From: Scott Wood
Date: 11/18/2010 01:35 PM
Cc: David Gibson, Michael Ellerman
Subject: Re: application needs fast access to physical memory

On Thu, 18 Nov 2010 10:55:21 -0600
 wrote:

> Thanks for the replies.
>
> In the Linux Device Drivers book regarding mmap(), it states:
>
>    Mapping a device means associating a range of user-space addresses to
>    device memory. Whenever the program reads or writes in the assigned
>    address range, it is actually accessing the device. In the X server
>    example, using mmap allows quick and easy access to the video card’s
>    memory. For a performance-critical application like this, direct
>    access makes a large difference.
>
> For whatever reason, mmap() is definitely not quick and does not appear
> to be a direct access to device memory. After the application completes a
> large write into physical memory (via the pointer returned from mmap()),
> the application performs an ioctl() to query whether the data actually
> arrived into the memory region. It seems to take some time before the
> associated kernel module actually "sees" the data in the physical memory
> region.
>
> There's a few things I should say about this memory region. There's a
> total of 512 MB of physical memory. U-Boot passes "mem=256M" as a kernel
> parameter to tell Linux to only directly manage the lower 256 MB. The
> special region of physical memory that the application is trying to
> access is the upper 256 MB of memory not directly managed by Linux. The
> mmap() call from the application is:
>    *memptr = (void *) mmap( NULL, size, PROT_READ | PROT_WRITE,
>    MAP_SHARED, _fdTerAlloc, (off_t) 0x1000);

Try this patch:
http://patchwork.ozlabs.org/patch/68246/

-Scott
