[issue16743] mmap accepts files > 1 GB, but processes only 1 GB

2012-12-26 Thread Richard Oudkerk
Richard Oudkerk added the comment: > Isn't 2 GiB + 1 bytes mmap file enough for testing? Yes. But creating multigigabyte files is very slow on Windows. On Linux/FreeBSD test_mmap takes a fraction of a second, whereas on Windows it takes over 2 minutes. (Presumably Linux/FreeBSD is automatic

[issue16743] mmap accepts files > 1 GB, but processes only 1 GB

2012-12-26 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: LGTM. Isn't 2 GiB + 1 bytes mmap file enough for testing?

[issue16743] mmap accepts files > 1 GB, but processes only 1 GB

2012-12-26 Thread Richard Oudkerk
Richard Oudkerk added the comment: New patch with same check for Unix. -- Added file: http://bugs.python.org/file28446/mmap.patch

[issue16743] mmap accepts files > 1 GB, but processes only 1 GB

2012-12-26 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: Agreed. Please add the same check to the Unix implementation (instead of the unsafe overflow trick).

[issue16743] mmap accepts files > 1 GB, but processes only 1 GB

2012-12-26 Thread Richard Oudkerk
Richard Oudkerk added the comment: > This change is not backward compatible. Now user can mmap a larger file > and safely access lower 2 GiB. With the patch it will fail. They should specify length=2GiB-1 if that is what they want. With length=0 you can only access the lower 2GiB if "file_size
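
The suggestion to pass an explicit length can be sketched as follows. This is a minimal sketch, not code from the patch: the helper names `clamped_length` and `map_lower_part` are made up for illustration, and it assumes that `sys.maxsize` is the right upper bound for a mapping on the running build.

```python
import mmap
import os
import sys

def clamped_length(file_size, offset=0, limit=sys.maxsize):
    """Return the number of bytes to map, capped at `limit` (hypothetical helper)."""
    if offset < 0 or offset > file_size:
        raise ValueError("offset is outside the file")
    return min(file_size - offset, limit)

def map_lower_part(path):
    """Map the start of `path` read-only with an explicit length instead of length=0.

    Assumption: the file is not empty; mmap rejects empty files.
    """
    with open(path, "rb") as fobj:
        size = os.fstat(fobj.fileno()).st_size
        return mmap.mmap(fobj.fileno(), clamped_length(size),
                         access=mmap.ACCESS_READ)
```

On a 32-bit build the requested length may still be more than the address space can supply, so the `mmap.mmap` call can fail even when the length passes this check.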

[issue16743] mmap accepts files > 1 GB, but processes only 1 GB

2012-12-26 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: This change is not backward compatible. Now user can mmap a larger file and safely access lower 2 GiB. With the patch it will fail. The Unix implementation uses an unsafe integer overflow idiom which causes undefined behavior (Mark, you have the floor).

[issue16743] mmap accepts files > 1 GB, but processes only 1 GB

2012-12-26 Thread Richard Oudkerk
Richard Oudkerk added the comment: On 32 bit Unix mmap() will raise ValueError("mmap length is too large") in Marc's example. This is correct since Python's sequence protocol does not support indexes larger than sys.maxsize. But on 32 bit Windows, if length == 0 then the size check always pas

[issue16743] mmap accepts files > 1 GB, but processes only 1 GB

2012-12-24 Thread Richard Oudkerk
Richard Oudkerk added the comment: This bit looks wrong to me:

    if (offset - size > PY_SSIZE_T_MAX)
        /* Map area too large to fit in memory */
        m_obj->size = (Py_ssize_t) -1;

Should it not be "size - offset" instead of "offset - size"? (offset and size
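
The difference between the two expressions can be modelled in Python. This is only a sketch: Python integers are signed and unbounded, whereas the result of the C expression depends on the types of `offset` and `size`, which this excerpt does not show. The function names and the 32-bit limit constant are assumptions made for illustration.

```python
PY_SSIZE_T_MAX_32 = 2**31 - 1  # assumed value of PY_SSIZE_T_MAX on a 32-bit build

def check_as_written(size, offset, limit=PY_SSIZE_T_MAX_32):
    """The comparison as quoted: offset - size."""
    return offset - size > limit

def check_as_suggested(size, offset, limit=PY_SSIZE_T_MAX_32):
    """The comparison suggested in the comment: size - offset."""
    return size - offset > limit

five_gib = 5 * 1024**3
print(check_as_written(five_gib, 0))    # False: 0 - size is negative
print(check_as_suggested(five_gib, 0))  # True: 5 GiB exceeds the limit
```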

[issue16743] mmap accepts files > 1 GB, but processes only 1 GB

2012-12-24 Thread Richard Oudkerk
Richard Oudkerk added the comment: I suspect that the size of the 5GB file is originally a 64 bit quantity, but gets cast unsafely to a 32 bit size_t to give 1GB. This is causing the miscalculations. There is no way to map all of a 5GB file in a 32 bit process -- 4GB is the maximum -- so any
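
The suspected truncation can be checked with plain arithmetic. This is a minimal sketch, assuming the file size is reduced to its low 32 bits when stored in a 32-bit `size_t`; the helper name `truncate_to_32_bits` is made up for illustration.

```python
GIB = 1024 ** 3

def truncate_to_32_bits(size):
    """Keep only the low 32 bits, as an unchecked cast to a 32-bit unsigned type would."""
    return size & 0xFFFFFFFF

# 5 GiB = 4 GiB + 1 GiB, so only the 1 GiB remainder survives the cast.
print(truncate_to_32_bits(5 * GIB) == 1 * GIB)  # True
```

Under the same assumption, a 4.5 GiB file would be truncated to 0.5 GiB.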

[issue16743] mmap accepts files > 1 GB, but processes only 1 GB

2012-12-24 Thread Terry J. Reedy
Terry J. Reedy added the comment: Windows memory-maps multi-gigabyte files just fine as long as one uses the proper build (64-bit), which we provide. Given that mmap produces a finite-length sequence object, as documented, slicing is working as it should. Slicing beyond the length returns an

[issue16743] mmap accepts files > 1 GB, but processes only 1 GB

2012-12-23 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: I have only a 32-bit OS and can't answer these questions. I'm surprised by the 1 GiB limit too. Marc, can you please check a 4.5 GiB file? What is the limit in this case, 1 GiB or 0.5 GiB? What about slicing a big bytes object or bytearray (if you have enough memory)? If mma

[issue16743] mmap accepts files > 1 GB, but processes only 1 GB

2012-12-23 Thread Terry J. Reedy
Terry J. Reedy added the comment: To me, Marc's title and penultimate sentence imply that he thinks that mmap should not accept such files. (But he should speak for himself.) As I said, not accepting such files could break working code. As for the alternative of 'fixing' methods: Is it only sl

[issue16743] mmap accepts files > 1 GB, but processes only 1 GB

2012-12-23 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: As I understand it, the issue is that mmap slicing returns an empty string for large (but less than the ssize_t limit) indices on 2.7. Maybe it relates to the 30-bit-digit long integer implementation? -- nosy: +serhiy.storchaka

[issue16743] mmap accepts files > 1 GB, but processes only 1 GB

2012-12-23 Thread Terry J. Reedy
Terry J. Reedy added the comment: It is a report of behavior that lacks a specific request for change (that I can see). The implied code-change request could break working code. We don't usually do that. What do you think should be done?

[issue16743] mmap accepts files > 1 GB, but processes only 1 GB

2012-12-23 Thread Antoine Pitrou
Antoine Pitrou added the comment: Terry, what makes you think this is a feature request? This is a bug, quite simply. -- nosy: +pitrou versions: +Python 2.7, Python 3.2, Python 3.3

[issue16743] mmap accepts files > 1 GB, but processes only 1 GB

2012-12-21 Thread Terry J. Reedy
Terry J. Reedy added the comment: The immediate fix is to use a 64 bit build. That aside, what change in behavior are you suggesting? (and for 32 bit builds only?) Should mmap.mmap warn if the file is longer than would be supported? This could be added to all current versions. Should it raise

[issue16743] mmap accepts files > 1 GB, but processes only 1 GB

2012-12-21 Thread Antoine Pitrou
Changes by Antoine Pitrou: -- components: +Library (Lib), Windows -None nosy: +brian.curtin, tim.golden versions: +Python 3.2, Python 3.3, Python 3.4

[issue16743] mmap accepts files > 1 GB, but processes only 1 GB

2012-12-21 Thread Marc Schlaich
New submission from Marc Schlaich:

Platform: Windows 7 64 bit
Interpreter: Python 2.7.2 (default, Jun 12 2011, 15:08:59) [MSC v.1500 32 bit (Intel)] on win32

Here are the steps to reproduce:

1. Create a big file (5 GB):

    with open('big', 'wb') as fobj:
        for _ in xrange(1024 * 1024 * 5):