New submission from Wolfgang Maier:

I thought I'd go back to work on a test patch for issue21560 today, but now I'm 
puzzled by the explicit handling of memoryviews in gzip.GzipFile.write.
The method is defined as:

    def write(self,data):
        self._check_closed()
        if self.mode != WRITE:
            import errno
            raise OSError(errno.EBADF, "write() on read-only GzipFile object")

        if self.fileobj is None:
            raise ValueError("write() on closed GzipFile object")

        # Convert data type if called by io.BufferedWriter.
        if isinstance(data, memoryview):
            data = data.tobytes()

        if len(data) > 0:
            self.size = self.size + len(data)
            self.crc = zlib.crc32(data, self.crc) & 0xffffffff
            self.fileobj.write( self.compress.compress(data) )
            self.offset += len(data)

        return len(data)

So for some reason, when it is passed data as a memoryview, it first copies 
the content to a bytes object, and I do not understand why.
zlib.crc32 and the compress method of zlib compression objects seem to be able 
to deal with memoryviews, so the only special-casing that seems required here 
is in determining the byte length of the data (len() of a memoryview counts 
items, not bytes), which I guess needs to use memoryview.nbytes. I've prepared a 
patch (overlapping the one for issue21560) that avoids copying the data and 
seems to work fine.
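
To illustrate the idea, here is a minimal sketch of the approach described above. It is not the patch itself: compress_chunk is a hypothetical helper name, and the use of array.array in the usage lines is only there to get a memoryview whose item size is larger than one byte.

```python
import array
import zlib


def compress_chunk(data, compressor, crc=0):
    """Hypothetical helper: compress data without an intermediate bytes copy."""
    if isinstance(data, (bytes, bytearray)):
        length = len(data)
    else:
        # Wrap any other buffer in a memoryview. nbytes is the size in bytes,
        # whereas len() is the number of items along the first dimension.
        data = memoryview(data)
        length = data.nbytes
    # Both calls read from the buffer directly; no tobytes() copy is made here.
    crc = zlib.crc32(data, crc) & 0xffffffff
    return compressor.compress(data), crc, length


# Usage: a view over unsigned ints, so len(view) and view.nbytes differ.
view = memoryview(array.array("I", [1, 2, 3, 4]))
comp = zlib.compressobj()
out, crc, length = compress_chunk(view, comp)
out += comp.flush()
```

With this sketch, length is the byte count that write() would need for its size and offset bookkeeping, while the data itself is never copied into a bytes object on the Python side.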

Did I miss something about the importance of the tobytes conversion?

----------
components: Library (Lib)
messages: 238294
nosy: wolma
priority: normal
severity: normal
status: open
title: unnecessary copying of memoryview in gzip.GzipFile.write ?
type: resource usage
versions: Python 3.5

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue23688>
_______________________________________
