On 02/27/2015 06:54 AM, Steven D'Aprano wrote:
> Dave Angel wrote:
>> On 02/27/2015 12:58 AM, Steven D'Aprano wrote:
>>> Dave Angel wrote:
>>>> (Although I believe Seymour Cray was quoted as saying that virtual
>>>> memory is a crock, because "you can't fake what you ain't got.")
>>> If I recall correctly, disk access is about 10000 times slower than RAM,
>>> so virtual memory is *at least* that much slower than real memory.
>> It's so much more complicated than that, that I hardly know where to
>> start.
>> [snip technical details]
> As interesting as they were, none of those details will make swap faster,
> hence my comment that virtual memory is *at least* 10000 times slower than
> RAM.
The term "virtual memory" is used for many aspects of the modern memory
architecture. But I presume you're using it in the sense of "running in
a swapfile" as opposed to running in physical RAM.
Yes, a page fault takes on the order of 10,000 times as long as an
access to a location in L1 cache, though I suspect the gap is a lot
smaller if the swapfile is on an SSD. And only the first byte is that slow.
But once the fault is resolved, the nearby bytes are in physical memory,
and some of them are in L3, L2, and L1. So you're not running in the
swapfile any more. And even when you run off the end of the page,
fetching the sequentially adjacent page from a hard disk is much faster
than seeking to a random one.
And if the disk has well-designed buffering, faster still. The OS tries
pretty hard to keep the swapfile unfragmented.
The trick is to minimize the number of page faults, especially to random
locations. If you're getting lots of them, it's called thrashing.
There are tools to help with that. To minimize page faults on code,
linking with a good working-set-tuner can help, though I don't hear of
people bothering these days. To minimize page faults on data, choosing
one's algorithm carefully can help. For example, in scanning through a
typical matrix, row order might be adjacent locations, while column
order might be scattered.
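For example, here's a toy Python sketch of the row-order vs. column-order
difference. With a plain list-of-lists the locality effect is mild, but for
a large C-order array (a NumPy array, say) the access pattern can dominate
the running time:

```python
def make_matrix(n):
    # n x n matrix stored row by row, so each row is one contiguous list.
    return [[i * n + j for j in range(n)] for i in range(n)]

def sum_row_order(m):
    # Inner loop walks adjacent elements of one row: sequential access.
    total = 0
    for row in m:
        for value in row:
            total += value
    return total

def sum_column_order(m):
    # Inner loop jumps from row to row for each column: scattered access,
    # which on a big enough matrix means cache misses and page faults.
    total = 0
    for j in range(len(m[0])):
        for i in range(len(m)):
            total += m[i][j]
    return total

m = make_matrix(100)
# Same answer either way; only the memory-access pattern differs.
assert sum_row_order(m) == sum_column_order(m)
```

Both functions compute the identical sum; the only difference is whether
consecutive accesses land in the same page or hop all over the address space.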
Not really much different from reading a text file. If you can arrange
to process it a line at a time, rather than reading the whole file into
memory, you generally minimize your round-trips to disk. And if you
need to randomly access it, it's quite likely more efficient to memory
map it, in which case it temporarily becomes part of the swapfile system.
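A minimal sketch of that last point, assuming Python 3 (the file contents
here are made up for illustration):

```python
import mmap
import os
import tempfile

def mmap_demo():
    # Create a small throwaway file to map.
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"line one\nline two\nline three\n")
        os.close(fd)
        with open(path, "rb") as f:
            # Length 0 maps the whole file. Nothing is read up front;
            # the OS faults pages in on demand, just like swap.
            with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
                # Random access by offset, no explicit seek/read calls.
                return mm[5:8], mm.find(b"three")
    finally:
        os.remove(path)

word, offset = mmap_demo()
```

Slicing the map copies the bytes out, so the result stays valid after the
mapping is closed; only the pages actually touched ever occupy RAM.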
--
DaveA
--
https://mail.python.org/mailman/listinfo/python-list