On Mon, May 14, 2007 at 01:47:35PM +0300, Shachar Shemesh wrote:

> PAE is but an extension to the virtual memory technique, but using
> unaddressable memory instead of the disk. The machine has 64GB of
> physical memory, but can only actually address 4GB at a time. Pages
> of physical memory are swapped in and out of the addressable
> PHYSICAL range by means of using the PAE, and then, using the MMU,
> into virtual space.  So each of the 64GB physical memory is given a
> 4GB physical address (not concurrently, of course), and then given a
> 4GB virtual address for the sake of the actual running processes.

Hmm? That doesn't sound correct. All PAE does is make it possible to
have 36-bit physical addresses in the PTEs, so that your physical
addressability is up to 64GB. You *can* address all 64GB of physical
memory "at the same time". In other words, PAE lets you map 4GB of
virtual -> 64GB of physical.

> Except we have a problem. Each time we need to switch between user
> space and kernel space, we need to have the kernel ready and
> available to us.  This must be the case so we can actually handle
> whatever it is that triggered the move (hardware interrupt, software
> interrupt or trap). The way we do that is by keeping the entire
> memory allocated to the kernel (code + data) mapped to the top area
> of the virtual memory addresses, no matter where we are in the
> system. Whether we are in kernel space, or each and every running
> user space process, we always keep the kernel at the same
> addresses. Of course, if we are in user space we mark the addresses
> as non-readable, non-writeable, but that's ok, because we can tell
> the MMU that a certain page is only read/writeable if the CPU is in
> Ring 0, and the CPU automatically enters ring 0 in case of an
> interrupt (of any kind). Problem solved.

This is misleading. We can have the kernel "available for us" just
fine even if it is not mapped in the user's address space. The reason
it is mapped (on x86-32 only!) in every process's address space is to
cut down on context switch costs, since we aren't really switching
address spaces (which would necessitate a TLB flush).

> Now here's the problem. Sometimes, when there is too much memory in
> the machine (using PAE), it may turn out that 1GB is not enough to
> keep track of what virtual address for which process belongs to
> which physical address. Merely managing the physical memory requires
> an overhead, and with too much overhead, 1GB is not enough.

Again, this is misleading. It's only a problem with the way it's
implemented in *linux* on *x86-32*, using mem_map and allocating page
tables from low-mem (which we don't do any more if you have
CONFIG_HIGHPTE enabled). Alternative implementations are definitely
possible.

> There are two possible solutions to this problem. The first is to
> increase the amount of memory allocated to the kernel. We could, for
> example, switch from allocating 1GB to the kernel in a 3/1 split to
> allocating 2GB to the kernel in a 2/2 split (like Windows). This,
> however, leads to the following absurd: the more physical memory
> you have, the less memory each user space program can use!

"The less *virtual* memory each user space program can use in a single
address space" is what you meant to say. It's trivial to fork() and
thus get a second address space to play with. Additionally, you could
use something like shared page tables to solve (or at least mitigate)
the same problem.

> #include <stdio.h>
> #include <stdlib.h>
> #include <sys/mman.h>
> 
> int main(int argc, char *argv[] )
> {
>    void *address=(void *)0xc0000000; /* start of the top 1GB */
>    if( argc>1 ) {
>       /* Ask for a specific address */
>       address=(void *)strtoul(argv[1], NULL, 0);
>    }
> 
>    if( address==0 ) {
>       fprintf(stderr,
>          "Must specify legal address as parameter, "
>          "or give no parameter at all\n"
>          "Use 0x prefix for hexadecimal addresses\n");
> 
>       return 1;
>    }
> 
>    printf("Trying to allocate 1 byte starting at address %p\n", address);
>    void *alloced=mmap( address, 1, PROT_READ|PROT_WRITE,
>                        MAP_ANONYMOUS|MAP_PRIVATE, -1, 0);
>    if( alloced==MAP_FAILED ) {
>       perror("Failed to map memory");
> 
>       return 1;

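MAP_FIXED will make this simpler: with it, mmap() either maps at
exactly the requested (page-aligned) address or fails, instead of
treating the address as a hint.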

I applaud your taking the time to write such a detailed
explanation.

Cheers,
Muli
