On 12/28/2011 09:08 PM, gwes wrote:
I hope someone can shed some light on this.
I'm running 5.0-current on an AMD64 with 4GB of physical memory.
Reading large chunks (64K or multiples) from /dev/rsd0c using
the AMD chipset SATA controller and a modern 1G drive:
time dd if=/dev/rsd3c of=/dev/null bs=128k count=10000
10000+0 records in
10000+0 records out
1310720000 bytes transferred in 11.348 secs (115498831 bytes/sec)
0.0u 2.0s 0:11.34 17.9% 0+0k 0+1io 0pf+0w
Profiling the kernel shows that copyout() is being called from
physio() via uvm_vsunlock_device() for every MAXPHYS-byte chunk.
On first inspection, physio calls uvm_vslock_device(...,&map)
which checks to see if all pages in the request satisfy
PADDR_IS_DMA_REACHABLE(). If so, it returns NULL in map.
After strategy() returns, map is sent to uvm_vsunlock_device,
which calls copyout() if map != NULL.
There's a comment on uvm_vslock_device saying it always returns
something in *retp, but the code seems to indicate otherwise.
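
The lock/unlock sequence described above can be sketched as a small
userland model. This is not the kernel code: every name ending in
_model is hypothetical, the per-page physical addresses are supplied by
the caller instead of being looked up, and the inclusive upper bound of
0xffffffff is an assumption taken from the range quoted in this post.
It only illustrates why a non-NULL map leads to one extra copy per
request.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Assumption: highest DMA-reachable physical address (0..0xffffffff). */
#define DMA_HIGH_MODEL 0xffffffffULL

/* Counts simulated copyout() calls so the extra copy is observable. */
int copyout_calls_model = 0;

/* Hypothetical stand-in for a wired user buffer and its page addresses. */
struct user_buf_model {
	void *data;                  /* user-visible buffer */
	size_t len;                  /* length in bytes */
	const uint64_t *page_paddrs; /* physical address of each page */
	size_t npages;
};

/*
 * Model of the lock step: returns the address the device should write to.
 * *retp is NULL when every page is reachable (direct DMA); otherwise it
 * holds a bounce buffer (NULL if the allocation itself fails).
 */
void *
vslock_model(const struct user_buf_model *ub, void **retp)
{
	assert(ub != NULL && retp != NULL);
	*retp = NULL;
	for (size_t i = 0; i < ub->npages; i++) {
		if (ub->page_paddrs[i] > DMA_HIGH_MODEL) {
			*retp = malloc(ub->len);
			return *retp;
		}
	}
	return ub->data;
}

/* Model of the unlock step: copy back only if a bounce buffer was used. */
void
vsunlock_model(struct user_buf_model *ub, void *map)
{
	if (map == NULL)
		return;                      /* direct DMA, nothing to copy */
	memcpy(ub->data, map, ub->len);  /* stands in for copyout() */
	copyout_calls_model++;
	free(map);
}
```

In this model a buffer whose pages all sit below 4G goes through both
calls without touching the counter, while a single page at 0x100000000
or above forces the bounce buffer and one copy per request.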
PADDR_IS_DMA_REACHABLE checks against dma_constraint, which
is 0..0xffffffff, which should allow all memory below 4G to be used
for DMA.
What have I missed?
I believe that the copyout() shouldn't happen.
I'm trying to run multiple 140MB/sec drives simultaneously and
the copyout() is a killer - it's eating more of the
system memory and CPU bandwidth than I'd like.
thanks
Geoff Steckel
Responding to myself - on further examination, if the system has
only 2G, DMA is done to user pages as expected without copy{in,out}.
Once past the evil 3G boundary, some pages have physical addresses of
0x100000000 or higher, and the code hardwires a limit in several places.
The ahci driver (at least) chokes and hangs if fed a physical address
over 4G.
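
The difference between 2G and 4G of RAM can be illustrated with a
little arithmetic. The sketch below assumes that RAM which would
otherwise sit between 3G and 4G is relocated to start at 0x100000000;
that layout is an assumption based on the 3G boundary mentioned above,
and real machines differ in where the boundary falls. All names are
hypothetical and nothing here is taken from the kernel source.

```c
#include <stdbool.h>
#include <stdint.h>

#define GIB_MODEL        (1ULL << 30)
#define HOLE_START_MODEL (3 * GIB_MODEL) /* assumed boundary below 4G */
#define FOUR_GIB_MODEL   (4 * GIB_MODEL) /* 0x100000000 */
#define DMA_HIGH_MODEL   0xffffffffULL   /* top of the 0..0xffffffff range */

/* Map an offset into installed RAM to its assumed physical address. */
uint64_t
ram_offset_to_paddr_model(uint64_t off)
{
	if (off < HOLE_START_MODEL)
		return off;                               /* identity below 3G */
	return FOUR_GIB_MODEL + (off - HOLE_START_MODEL); /* relocated above 4G */
}

/* Simplified reachability test against the assumed constraint. */
bool
paddr_reachable_model(uint64_t pa)
{
	return pa <= DMA_HIGH_MODEL;
}

/* Bytes of RAM that end up outside the DMA-reachable range. */
uint64_t
unreachable_bytes_model(uint64_t ram_size)
{
	return ram_size > HOLE_START_MODEL ? ram_size - HOLE_START_MODEL : 0;
}
```

Under these assumptions a 2G machine has no unreachable memory, while a
4G machine has 1G of RAM at or above 0x100000000, so any user buffer
that lands there fails the reachability test.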
I'd really like to help someone fix this, or, given more clues on where
to look, I'll dig some more. The work I'm trying to do will probably
survive quite well in 2G.