New submission from Erik Bray:

I've come across a few difficulties of late with the io module's handling of 
files opened in append mode (any variation on 'a', 'ab', 'a+', 'ab+', etc.).

The biggest problem is that the io module does not in any way keep track of 
whether a file was opened in append mode, and it's essentially impossible to 
determine the original mode string that was provided by the user.  For example:

>>> f = open('test', mode='ab+', buffering=0)
>>> f
<_io.FileIO name='test' mode='rb+'>

The 'a' is gone.  That doesn't mean the file *isn't* in append mode.  If 
supported, in fileio_init this still causes the O_APPEND flag to be added to 
the open() call.  But the *only* way to find out after the fact that the file  
was actually opened in append mode is with fcntl:

>>> import fcntl, os
>>> fcntl.fcntl(f.fileno(), fcntl.F_GETFL) & os.O_APPEND
1024

but this is hardly easily accessible or portable.  So it's possible to have two 
files open in 'rb+' mode but that have wildly differing behaviors.
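
As a rough sketch of the fcntl check above wrapped up for reuse (POSIX only; 
the helper name opened_with_o_append and the demo function are made up for 
illustration and are not part of the io module):

```python
import fcntl
import os
import tempfile


def opened_with_o_append(f):
    """Return True if the descriptor behind f carries O_APPEND (POSIX only)."""
    flags = fcntl.fcntl(f.fileno(), fcntl.F_GETFL)
    return bool(flags & os.O_APPEND)


def demo():
    # Open the same file once in append mode and once in plain read/write
    # mode.  The 'ab+' open comes first because it creates the file.
    path = os.path.join(tempfile.mkdtemp(), 'test')
    with open(path, 'ab+', buffering=0) as appending, \
            open(path, 'rb+', buffering=0) as plain:
        return opened_with_o_append(appending), opened_with_o_append(plain)
```

Where O_APPEND is supported, only the first of the two handles should report 
the flag, whatever their mode strings say.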

The only other thing fileio_init does differently with append mode is that it 
seeks to the end of the file by default.  But that does not make the append 
behavior "portable".  If, on a system where O_APPEND is not supported, I seek 
to a different part of the file and then call write(), it will *not* append to 
the end of the file, whereas O_APPEND causes an automatic seek to the end 
before every write().
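
To make the difference concrete, here is a small sketch at the os level, 
bypassing the io module entirely.  The function name write_after_seek is 
hypothetical, and the results assume a platform that honours O_APPEND:

```python
import os
import tempfile


def write_after_seek(extra_flags):
    """Seek to offset 0, write one byte, and return the resulting contents."""
    path = os.path.join(tempfile.mkdtemp(), 'test')
    with open(path, 'wb') as f:
        f.write(b'testest')

    fd = os.open(path, os.O_RDWR | extra_flags)
    try:
        os.lseek(fd, 0, os.SEEK_SET)   # explicitly move away from the end
        os.write(fd, b'X')             # where this lands depends on the flags
    finally:
        os.close(fd)

    with open(path, 'rb') as f:
        return f.read()
```

Calling write_after_seek(os.O_APPEND) should leave the original bytes intact 
with the new byte at the end, while write_after_seek(0) overwrites the first 
byte.  An initial seek-to-end alone only gives the second behavior.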

The fact that no record of the request for 'append' mode is kept leads to 
further bugs, particularly in BufferedWriter.  It doesn't know the raw file was 
opened with O_APPEND so the writes it shows in the buffer differ from what will 
actually end up in the file.  For example:

>>> f = open('test', 'wb')
>>> f.write(b'testest')
7
>>> f.close()
>>> f = open('test', 'ab+')
>>> f.tell()
7
>>> f.write(b'A')
1
>>> f.seek(0)
0
>>> f.read()
b'testestA'
>>> f.seek(0)
0
>>> f.read(1)
b't'
>>> f.write(b'B')
1
>>> f.seek(0)
0
>>> f.read()
b'tBstestA'
>>> f.flush()
>>> f.seek(0)
0
>>> f.read()
b'testestAB'

In this example, I read 1 byte from the beginning of the file, then write one 
byte.  Because of O_APPEND, the effect of the write() call on the raw file is 
to append, regardless of where BufferedWriter seeks it to first.  But before 
the f.flush() call, f.read() just shows what's in the buffer, which is not what 
will actually be written to the file.  (Naturally, unbuffered io does not have 
this particular problem.)
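
For comparison, a sketch of the same read-then-write sequence with 
buffering=0, assuming O_APPEND is supported (the function name is only 
illustrative):

```python
import os
import tempfile


def unbuffered_append_session():
    """Read one byte from the start, write one byte, then read everything."""
    path = os.path.join(tempfile.mkdtemp(), 'test')
    with open(path, 'wb') as f:
        f.write(b'testest')

    # buffering=0 returns the raw FileIO object, so there is no buffer that
    # can disagree with the file on disk.
    with open(path, 'ab+', buffering=0) as f:
        f.seek(0)
        f.read(1)
        f.write(b'B')      # goes straight to the OS, which appends it
        f.seek(0)
        return f.read()
```

Here the read immediately after the write already shows the byte at the end 
of the file, with no flush() needed.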

So, I'm thinking maybe the fileio struct needs to grow an 'append' member.  
This could be used to provide a more accurate mode string, and could also be 
used, for example, in fileio_write to provide append-like support where it 
isn't natively supported (though perhaps without any guarantees as to 
atomicity).
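
A very rough Python-level sketch of that idea, purely for illustration: the 
actual change would be in the C fileio code, and the class name 
AppendEmulatingFileIO, its 'append' attribute, and the demo function are all 
hypothetical.  The seek followed by a write is two separate steps, so this 
gives no atomicity guarantee.

```python
import io
import os
import tempfile


class AppendEmulatingFileIO(io.FileIO):
    """FileIO subclass that remembers whether 'a' was requested."""

    def __init__(self, name, mode='r', *args, **kwargs):
        super().__init__(name, mode, *args, **kwargs)
        # Keep a record of the append request, which the mode string loses.
        self.append = 'a' in mode

    def write(self, data):
        if self.append:
            # Emulate O_APPEND: always move to the end before writing.
            # Not atomic: another writer could extend the file in between.
            self.seek(0, os.SEEK_END)
        return super().write(data)


def demo():
    path = os.path.join(tempfile.mkdtemp(), 'test')
    with open(path, 'wb') as f:
        f.write(b'testest')
    with AppendEmulatingFileIO(path, 'a+') as f:
        f.seek(0)
        f.write(b'X')      # still lands at the end of the file
        f.seek(0)
        return f.append, f.readall()
```

With a record like this available, a layer such as BufferedWriter could in 
principle also consult it when deciding what to show from its buffer.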

----------
components: IO
messages: 196464
nosy: erik.bray
priority: normal
severity: normal
status: open
title: Problems with files opened in append mode with io module
versions: Python 2.6, Python 2.7, Python 3.1, Python 3.2, Python 3.3

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue18876>
_______________________________________