[issue26158] File truncate() not defaulting to current position as documented

Eryk Sun Tue, 19 Jan 2016 15:35:49 -0800

Eryk Sun added the comment:

FYI, you can parse the cookie using struct or ctypes. For example:


    class Cookie(ctypes.Structure):
        _fields_ = (('start_pos',     ctypes.c_longlong),
                    ('dec_flags',     ctypes.c_int),
                    ('bytes_to_feed', ctypes.c_int),
                    ('chars_to_skip', ctypes.c_int),
                    ('need_eof',      ctypes.c_byte))

In the simple case only the buffer start_pos is non-zero, and the result of 
tell() is just the 64-bit file pointer. In Serhiy's UTF-7 example it needs to 
also convey the bytes_to_feed and chars_to_skip values:

    >>> f.tell()
    680564735109527527154978616360239628288
    >>> cookie_bytes = f.tell().to_bytes(ctypes.sizeof(Cookie), sys.byteorder)
    >>> state = Cookie.from_buffer_copy(cookie_bytes)
    >>> state.start_pos
    0
    >>> state.dec_flags
    0
    >>> state.bytes_to_feed
    16
    >>> state.chars_to_skip
    2
    >>> state.need_eof
    0

So a seek(0, SEEK_CUR) in this case has to seek the buffer to 0, read and 
decode 16 bytes, and skip 2 characters. 

Isn't this solvable at least for the case of truncating, Martin? It could do a 
tell(), seek to the start_pos, read and decode the bytes_to_feed, re-encode the 
chars_to_skip, seek back to the start_pos, write the encoded characters, and 
then truncate.

    >>> f = open('temp.txt', 'w+', encoding='utf-7')
    >>> f.write(b'+BDAEMQQyBDMENA-'.decode('utf-7'))
    5
    >>> _ = f.seek(0); f.read(2)
    'аб'
    >>> cookie_bytes = f.tell().to_bytes(sizeof(Cookie), byteorder)
    >>> state = Cookie.from_buffer_copy(cookie_bytes)
    >>> f.buffer.seek(state.start_pos)
    0
    >>> buf = f.buffer.read(state.bytes_to_feed)
    >>> s = buf.decode(f.encoding)[:state.chars_to_skip]
    >>> f.buffer.seek(state.start_pos)
    0
    >>> f.buffer.write(s.encode(f.encoding))
    8
    >>> f.buffer.truncate()
    8
    >>> f.close()
    >>> open('temp.txt', encoding='utf-7').read()
    'аб'

Rewriting the encoded bytes is necessary to properly terminate the UTF-7 
sequence, which makes me doubt whether this simple approach will work for all 
codecs. But something like this is possible, no?

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue26158>
_______________________________________
_______________________________________________
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com

[issue26158] File truncate() not defaulting to current position as documented

Reply via email to