Terry J. Reedy added the comment:

Windows memory-maps multi-gigabyte files just fine as long as one uses the 
proper build (64-bit), which we provide.

Given that mmap produces a finite-length sequence object, as documented, 
slicing is working as it should. Slicing beyond the length returns an empty 
sequence. This is no different from 'abc'[4:6]==''.
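
A minimal sketch of that slicing behavior, assuming a small temporary file 
rather than a multi-gigabyte one. The helper name slice_past_end is 
hypothetical and only for illustration.

```python
import mmap
import tempfile


def slice_past_end(data=b"abc"):
    """Map a small temporary file and slice beyond its end."""
    with tempfile.TemporaryFile() as f:
        f.write(data)
        f.flush()
        # length=0 maps the whole file as it currently exists.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            start = len(data) + 1
            return len(m), m[start:start + 2]


length, tail = slice_past_end()
print(length, tail)  # 3 b''
```

As with str and bytes, the out-of-range slice is clamped to the sequence 
length instead of raising.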

Running Python with finite memory has many memory-associated limitations. They 
are mostly undocumented as the exact details may depend on hardware, OS, 
implementation, version, and build. One practical limitation is that mmap with 
a 32-bit build cannot completely map multi-gigabyte files.
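
One way to see which kind of build is running, as a rough sketch. The 
function name pointer_bits is hypothetical, and this reports only the pointer 
width, not the actual mapping limit of the system.

```python
import struct
import sys


def pointer_bits():
    """Return the pointer width of the running interpreter in bits."""
    return struct.calcsize("P") * 8


bits = pointer_bits()
# sys.maxsize is the largest Py_ssize_t, so it also reflects the build width.
is_64bit = sys.maxsize > 2**32
print(bits, is_64bit)
```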

The current doc says:
"class mmap.mmap(fileno, length, tagname=None, access=ACCESS_DEFAULT[, offset]) 
(Windows version) Maps length bytes from the file specified by the file handle 
fileno, and creates a mmap object. If length is larger than the current size of 
the file, the file is extended to contain length bytes. If length is 0, the 
maximum length of the map is the current size of the file, except that if the 
file is empty Windows raises an exception (you cannot create an empty mapping 
on Windows)."

It does not say what happens if the requested length is larger than the max 
possible on a particular system. In particular, there is no mention of 
exception raising. So failure to raise is not a bug for tracker purposes.

The two possibilities for what to do in such situations are best effort and 
bailout. The current choice (at least on Windows, and whether by us, Microsoft, 
or the original mmap authors, I don't know) is best effort. I think that is 
fine, but should be documented. Users who care can compare the mmap object 
length with the file length or needed length and raise or do whatever if the 
mmap length is too short.
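
A sketch of such a check, assuming the caller wants the whole file mapped 
read-only. The helper mmap_checked and its needed parameter are hypothetical 
names, not part of the mmap module.

```python
import mmap
import os
import tempfile


def mmap_checked(fileno, needed=None):
    """Map the whole file read-only; raise if the map is too short."""
    file_size = os.fstat(fileno).st_size
    expected = file_size if needed is None else needed
    m = mmap.mmap(fileno, 0, access=mmap.ACCESS_READ)
    if len(m) < expected:
        actual = len(m)
        m.close()
        raise OSError(
            f"mmap covers {actual} bytes, expected at least {expected}"
        )
    return m


with tempfile.TemporaryFile() as f:
    f.write(b"x" * 10)
    f.flush()
    with mmap_checked(f.fileno()) as m:
        print(len(m))  # 10
```

Raising OSError is one choice among several. Code that can work with a partial 
map could instead fall back to ordinary file reads.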

So I think we should change this to a doc issue and add something like "If the 
requested length is larger than the limit for the current system, then that 
limit is used as the length."
or
"The length of the returned mmap object has a limit that depends on the details 
of the running system."

Or the header should say that there is a system limit, and two of the sentences 
above should be revised. In the first, change 'length bytes' to 'min(length, 
system limit) bytes'. (I am presuming this is true also when length is not 
given as 0.) In the last sentence, change 'current size' to 'min(current size, 
system limit)'.

The Unix version doc should also clarify behavior.
---

If we were to change mmap() (but only in a future release), then users who want 
the current behavior would have to discover, hard-code, and explicitly but 
conditionally pass the limit for each system their code might ever run on. I do 
not know that that is sensibly possible. I would not be surprised if the limit 
for a given 32-bit build varies for different Windows versions and setups.

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue16743>
_______________________________________
_______________________________________________
Python-bugs-list mailing list