[issue13171] Bug in tempfile module

2011-10-13 Thread Alexander Steppke

New submission from Alexander Steppke :

The tempfile module shows strange behavior under certain conditions. This might 
lead to data leaks or other problems. 

The test session looks as follows:

Python 2.7.2 (default, Jun 12 2011, 15:08:59) [MSC v.1500 32 bit (Intel)] on 
win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import tempfile
>>> tmp = tempfile.TemporaryFile()
>>> tmp.read()
''
>>> tmp.write('test')
>>> tmp.read()
'P\xf6D\x02\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\ [omitted]'

or similar behavior in text mode: 

Python 2.7.2 (default, Jun 12 2011, 15:08:59) [MSC v.1500 32 bit (Intel)] on 
win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import tempfile
>>> tmp = tempfile.TemporaryFile('w+t')
>>> tmp.read()
''
>>> tmp.write('test')
>>> tmp.read()
'\x00\xa5\x8b\x02int or long, hash(a) is used instead.\ni\x10 [omitted]'
>>> tmp.seek(0)
>>> tmp.readline()
'test\x00\xa5\x8b\x02int or long, hash(a) is used instead.\n'

This bug seems to be triggered by calling tmp.read() right after tmp.write(), 
before any tmp.seek(). I am running Python 2.7.2 on Windows 7 x64. Other people 
have reproduced the problem on Windows XP, but not under Linux or Cygwin (see 
also 
http://stackoverflow.com/questions/7757663/python-tempfile-broken-or-am-i-doing-it-wrong).
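
For comparison, here is a minimal sketch of the sequence that avoids the 
garbage output, assuming the trigger really is the missing seek() between 
write() and read(). The helper name write_then_read is made up for this 
illustration and is not part of tempfile.

```python
import tempfile


def write_then_read(text):
    """Write text to a temporary file, rewind, and read it back."""
    with tempfile.TemporaryFile('w+t') as tmp:
        tmp.write(text)
        # Reposition before switching from writing to reading.
        tmp.seek(0)
        return tmp.read()


print(repr(write_then_read('test')))  # prints 'test'
```

With the seek(0) in place, read() starts at the beginning of the file and 
returns only what was written.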

Thank you for looking into this.
Alexander

--
components: Library (Lib), Windows
messages: 145477
nosy: Alexander.Steppke
priority: normal
severity: normal
status: open
title: Bug in tempfile module
type: behavior
versions: Python 2.7

___
Python tracker 
<http://bugs.python.org/issue13171>
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue13171] Bug in tempfile module

2011-10-14 Thread Alexander Steppke

Alexander Steppke  added the comment:

Hi David,

I followed your suggestion and tried to reproduce the problem without the 
tempfile module. It turns out that there is indeed an underlying issue. I am 
not sure what the root cause is, but now this is an even bigger problem: read() 
returns information from some file/memory that it was never intended to access. 

The session looks similar to the tempfile session:

>>> tmp = open('tmp', 'w+t')
>>> tmp.read()
''
>>> tmp.write('test')
>>> tmp.read()
'hp\'\x02\xe4\xb9>7\x80\x88\x81\x02\x01\x00\x00\x00\x00\x00\x00\x00\x12\x00\x00\
x00\xe86(\x02p\x11\x8d\x02\x01\x00\x00\x00@\xfd)\x02\xe7Y\x9aN\x01\x00\x00\x00\x
00\x00\x00\x00\x14\x00\x00\x00\x087(\x02\x00\x00\x00\x00\xe9Y\x0b\xa2\x00\x93+\x
02\x01\x00\x00\x00\x00\x00\x00\x00\x00\x9b,\x02\x02\x00\x00\x00\xe06(\x02\xc0W5\

So far the bug has only been reproduced using CPython 2.7.1 on Windows XP and 
Windows 7. 

Alexander




[issue13171] Bug in file.read(), can access unknown data.

2011-10-14 Thread Alexander Steppke

Changes by Alexander Steppke :


--
components: +IO
title: Bug in tempfile module -> Bug in file.read(), can access unknown data.




[issue13171] Bug in file.read(), can access unknown data.

2011-10-14 Thread Alexander Steppke

Alexander Steppke  added the comment:

Additionally, after calling tmp.close(), the file 'tmp' contains the string 
'test' followed by about 4 kB of binary data similar to the previous output of 
tmp.read().
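
A rough way to check whether a given setup is affected is to repeat the 
write()/read() sequence and then look at the size of the file on disk. This is 
only a sketch: the helper name size_after_write_then_read is hypothetical, and 
it deliberately performs the read() directly after the write().

```python
import os
import tempfile


def size_after_write_then_read(data):
    """Write data, read without seeking, close, and return the file size."""
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, 'w+b') as f:
            f.write(data)
            f.read()  # the problematic call: no seek() or flush() in between
        return os.path.getsize(path)
    finally:
        os.remove(path)


print(size_after_write_then_read(b'test'))
```

On an unaffected setup the reported size should equal the number of bytes 
written (4 here); a much larger value would match the behavior described 
above.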




[issue13171] Bug in file.read(), can access unknown data.

2011-10-14 Thread Alexander Steppke

Alexander Steppke  added the comment:

Thank you for the update, Victor. It seems to me that this is exactly the same 
issue.

The current documentation says 
(http://docs.python.org/library/stdtypes.html#bltin-file-objects):

"Note: This function is simply a wrapper for the underlying fread() C function, 
and will behave the same in corner cases, such as whether the EOF value is 
cached."

This hints at the current behavior, but I would not expect from it that 
file.read() can return arbitrary data when used directly after file.write(). 
Maybe one could include a link to, or a snippet of, the C standard, which 
states that one shall not do this:

"When a file is opened with update mode ('+' as the second or third character 
in the above list of mode argument values), both input and output may be 
performed on the associated stream. However, output shall not be directly 
followed by input without an intervening call to the fflush function or to a 
file positioning function (fseek, fsetpos, or rewind), and input shall not be 
directly followed by output without an intervening call to a file positioning 
function, unless the input operation encounters end-of-file." 

(from http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1124.pdf, page 272)
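
To illustrate the rule, here is a small sketch of a wrapper that inserts the 
required positioning call whenever the caller switches between writing and 
reading. The class name UpdateModeFile and the demo() function are invented 
for this example; nothing like this exists in the standard library, and it 
covers only read(), write() and seek().

```python
import os
import tempfile


class UpdateModeFile(object):
    """Hypothetical wrapper that repositions the stream whenever the
    caller switches between writing and reading."""

    def __init__(self, fileobj):
        self._f = fileobj
        self._last_op = None  # 'read', 'write', or None after a seek

    def _switch_to(self, op):
        if self._last_op is not None and self._last_op != op:
            # Zero-offset relative seek: the position does not change, but
            # it serves as the intervening file positioning call.
            self._f.seek(0, os.SEEK_CUR)
        self._last_op = op

    def write(self, data):
        self._switch_to('write')
        return self._f.write(data)

    def read(self, size=-1):
        self._switch_to('read')
        return self._f.read(size)

    def seek(self, offset, whence=os.SEEK_SET):
        self._last_op = None
        return self._f.seek(offset, whence)


def demo():
    with tempfile.TemporaryFile('w+b') as raw:
        f = UpdateModeFile(raw)
        f.write(b'test')
        at_end = f.read()  # position is at end of file, so nothing to read
        f.seek(0)
        return at_end, f.read()
```

Under these assumptions, demo() should return an empty string for the read 
at end of file and b'test' after the explicit seek(0), instead of unrelated 
data.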
