On Wed, Dec 28, 2011 at 11:10:41PM -0500, Geoff Steckel wrote:
> On 12/28/2011 09:08 PM, gwes wrote:
> > I hope someone can shed some light on this.
> >
> > I'm running 5.0-current on an AMD64 with 4GB of physical memory.
> >
> > Reading large chunks (64K or multiples) from /dev/rsd0c using
> > the AMD chipset SATA controller and a modern 1G drive:
> >
> > time dd if=/dev/rsd3c of=/dev/null bs=128k count=10000
> > 10000+0 records in
> > 10000+0 records out
> > 1310720000 bytes transferred in 11.348 secs (115498831 bytes/sec)
> > 0.0u 2.0s 0:11.34 17.9% 0+0k 0+1io 0pf+0w
> >
> > Profiling the kernel shows that copyout() is being called from
> > physio() via uvm_vsunlock_device() for every MAXPHYS byte chunk.

Am I correct in understanding that you use mmap on a device?

> > On first inspection, physio calls uvm_vslock_device(...,&map)
> > which checks to see if all pages in the request satisfy
> > PADDR_IS_DMA_REACHABLE(). If so, it returns NULL in map.
> > After strategy() returns, map is sent to uvm_vsunlock_device,
> > which calls copyout() if map != NULL.
> >
> > There's a comment on uvm_vslock_device saying it always returns
> > something in *retp, but the code seems to indicate otherwise.
> > PADDR_IS_DMA_REACHABLE checks against dma_constraint, which
> > is 0..0xffffffff which should allow all memory < 4G to be used
> > for DMA.
> >
> > What have I missed?
> > I believe that the copyout() shouldn't happen.
> > I'm trying to run multiple 140MB/sec drives simultaneously and
> > the copyout() is a killer - it's eating more of the
> > system memory and CPU bandwidth than I'd like.
>
> Responding to myself - on further examination, if the system has
> only 2G, DMA is done to user pages as expected without copy{in,out}.
> Once past the evil 3G boundary, page physical addresses are >= 0x100000000,
> and the code hardwires a limit in several places.
> 
> The ahci (at least) driver chokes and hangs if fed a physical address
> over 4G.
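
To check that I read your description correctly, the decision you describe
amounts to something like the sketch below. This is a userland model, not
the kernel code: struct addr_range, dma_range and needs_bounce() are
hypothetical names of mine, and the only fact taken from your mail is the
0..0xffffffff constraint.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's dma_constraint (0..0xffffffff). */
struct addr_range {
	uint64_t low;
	uint64_t high;
};

static const struct addr_range dma_range = { 0x0, 0xffffffffULL };

/* Model of the reachability test: is this physical address inside the range? */
static bool
paddr_is_dma_reachable(uint64_t pa)
{
	return pa >= dma_range.low && pa <= dma_range.high;
}

/*
 * Returns true if at least one page of the request lies outside the
 * DMA range, i.e. the case where a bounce buffer and a copy are needed.
 * Returns false if every page can be handed to the device directly.
 */
bool
needs_bounce(const uint64_t *page_pa, size_t npages)
{
	for (size_t i = 0; i < npages; i++)
		if (!paddr_is_dma_reachable(page_pa[i]))
			return true;
	return false;
}
```

With 2G of memory every page address falls inside the range, so the model
never asks for a bounce. A single page at 0x100000000 or above is enough to
force the copy for the whole request.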

What exactly should happen depends a bit on how you access the memory
or, to be more exact, on when it gets allocated.
If the memory is a buffer
        void *buf = malloc(...);
then the memory is first allocated by userspace.

Because the kernel cannot guess from an allocation whether it will be used
for IO, it assumes that it won't be (which is true for most allocations).
In that case, the page will be from high memory above 4GB.
Of course, in that case you'd be using read/write, which will always do
copyin/copyout.
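
A minimal sketch of that first case, roughly what your dd run does: the
buffer comes from malloc and is only later handed to read(2). The helper
name read_blocks() and its parameters are made up for illustration.

```c
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Read up to `count` blocks of `bs` bytes from `path` into a malloc'd
 * buffer and throw the data away, much like
 *     dd if=path of=/dev/null bs=... count=...
 * Returns the number of bytes read, or -1 on error.
 */
long long
read_blocks(const char *path, size_t bs, size_t count)
{
	long long total = 0;
	void *buf;
	int fd;

	if ((fd = open(path, O_RDONLY)) == -1)
		return -1;

	/* Plain userspace allocation: nothing tells the kernel this is for IO. */
	if ((buf = malloc(bs)) == NULL) {
		close(fd);
		return -1;
	}

	for (size_t i = 0; i < count; i++) {
		ssize_t n = read(fd, buf, bs);

		if (n == -1) {
			total = -1;
			break;
		}
		if (n == 0)		/* end of file */
			break;
		total += n;
	}

	free(buf);
	close(fd);
	return total;
}
```

Something like read_blocks("/dev/rsd3c", 128 * 1024, 10000) would then
mirror the dd command from your mail.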

However, if you use mmap instead, the IO path will be the first to do
the allocation. In that case, there's certainly something to be said for
having the page allocated in dma-reachable memory (although I would
worry about having that happen a lot, since dma-reachable memory is
limited and our pager does not deal with the differences yet).
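
For comparison, a sketch of the mmap case, where the pages are first
populated when they are touched rather than by an earlier malloc. The
helper touch_mapped() is hypothetical, and I use an ordinary file path
here; whether a raw disk device can be mapped this way is an assumption
I have not checked.

```c
#include <sys/mman.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Map `path` read-only and read one byte from every page, so that each
 * page is faulted in through the mapping.
 * Returns the number of pages touched, 0 for an empty file, -1 on error.
 */
long long
touch_mapped(const char *path)
{
	struct stat st;
	long long pages = 0;
	long pgsz = sysconf(_SC_PAGESIZE);
	volatile unsigned char sink = 0;
	unsigned char *p;
	int fd;

	if (pgsz <= 0)
		return -1;
	if ((fd = open(path, O_RDONLY)) == -1)
		return -1;
	if (fstat(fd, &st) == -1) {
		close(fd);
		return -1;
	}
	if (st.st_size == 0) {		/* mmap rejects a zero length */
		close(fd);
		return 0;
	}

	p = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
	close(fd);			/* the mapping stays valid after close */
	if (p == MAP_FAILED)
		return -1;

	/* The first access to each page is what triggers the page-in. */
	for (off_t off = 0; off < st.st_size; off += pgsz) {
		sink = p[off];
		pages++;
	}
	(void)sink;

	munmap(p, (size_t)st.st_size);
	return pages;
}
```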

> I'd really like to help someone fix this or if given more clues on where
> to look, I'll dig some more. The work I'm trying to do will probably
> survive quite well in 2G.

I'd really like a look at the profiling data. Can you mail it to me?
-- 
Ariane
