STINNER Victor added the comment:

In Python 3, reading ahead is implemented by _io.BufferedReader. This object 
uses a lock to prevent race conditions: the lock is there not only to prevent 
crashes, but also to provide guarantees on how the file is read.

If thread A calls read() first, it gets the next bytes. If thread B calls 
read() while thread A is filling the internal file buffer ("readahead 
buffer"?), the second read is queued. The file position is controlled by only 
one thread at a time.
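
To illustrate the locking idea described above, here is a minimal sketch. It 
is not the _io.BufferedReader implementation: LockedReader, its "step" 
parameter and read_all_records() are hypothetical names made up for this 
example, and the file is replaced by an in-memory io.BytesIO. Each read(n) 
needs several small raw reads, and the lock keeps them together so every 
caller gets one contiguous record.

```python
import io
import threading


class LockedReader:
    """Hypothetical wrapper: serializes multi-step reads with a lock."""

    def __init__(self, raw, step=2):
        self._raw = raw
        self._step = step  # deliberately small: one read() needs several raw reads
        self._lock = threading.Lock()

    def read(self, n):
        # Only one thread at a time moves the file position.
        with self._lock:
            parts = []
            remaining = n
            while remaining > 0:
                chunk = self._raw.read(min(self._step, remaining))
                if not chunk:
                    break  # end of data
                parts.append(chunk)
                remaining -= len(chunk)
            return b"".join(parts)


def read_all_records(reader, record_size, out):
    # Keep reading fixed-size records until the data is exhausted.
    while True:
        record = reader.read(record_size)
        if not record:
            break
        out.append(record)


records = [b"%04d" % i for i in range(1000)]
reader = LockedReader(io.BytesIO(b"".join(records)))
seen = []
threads = [
    threading.Thread(target=read_all_records, args=(reader, 4, seen))
    for _ in range(4)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

With the lock held for the whole read(), each record is delivered whole and 
exactly once, although which thread gets which record is not defined.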

_PyOS_URandom() uses a strategy similar to Benjamin's proposed patch for the 
cached file descriptor of /dev/urandom:

    fd = _Py_open("/dev/urandom", O_RDONLY);
    if (fd < 0) {
        ...
        return -1;
    }
    if (urandom_cache.fd >= 0) {
        /* urandom_fd was initialized by another thread while we were
           not holding the GIL, keep it. */
        close(fd);
        fd = urandom_cache.fd;
    }
    else {
        ...
        urandom_cache.fd = fd;
    }

The difference is that opening /dev/urandom multiple times in parallel is safe, 
whereas reading from the same file descriptor in parallel... using the buffered 
fread()... is not safe. readahead() can require multiple fread() calls, and so 
multiple read() syscalls. Interlaced reads in parallel are likely to return 
scrambled data.
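
The scrambling can be shown deterministically, without real threads. This is 
only a simulation under stated assumptions: an io.BytesIO stands in for the 
file, each logical 4-byte read is split into two 2-byte reads (standing in 
for a readahead that needs several read() calls), and the interleaving 
between "thread A" and "thread B" is written out by hand.

```python
import io

raw = io.BytesIO(b"AAAABBBB")  # two 4-byte records: AAAA and BBBB

a_parts, b_parts = [], []
a_parts.append(raw.read(2))  # "thread A", first half of its read
b_parts.append(raw.read(2))  # "thread B" runs in between
a_parts.append(raw.read(2))  # "thread A", second half
b_parts.append(raw.read(2))  # "thread B", second half

record_a = b"".join(a_parts)
record_b = b"".join(b_parts)
print(record_a, record_b)  # b'AABB' b'AABB'
```

Neither reader gets an intact record: both end up with half of each.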

Adding a lock in Python 2.7.15 can hurt performance, even in single-threaded 
applications.

I'm not sure what matters more here: performance or correctness?

Note: Even the awesome Python 3 io module has similar flaws! 
https://bugs.python.org/issue12215 "TextIOWrapper: issues with interlaced 
read-write"

The question is more *who* reads from the same file object in parallel? Does it 
make sense? :-) Do you expect that file.read(n) is "atomic" in terms of 
parallelism?

Note 2: the io module is also available in Python 2.7, just not used by default 
by the builtin open() function ;-) io.open() must be used explicitly.
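
A small sketch of using io.open() explicitly. The temporary file and its 
content are made up for the example. On Python 3, io.open is the same 
function as the builtin open(), so the snippet behaves the same there.

```python
import io
import os
import tempfile

# Create a throwaway file to read back (hypothetical content).
fd, path = tempfile.mkstemp()
try:
    with os.fdopen(fd, "wb") as f:
        f.write(b"hello world")

    # io.open() in binary read mode returns a buffered reader from the io module.
    with io.open(path, "rb") as f:
        is_buffered_reader = isinstance(f, io.BufferedReader)
        data = f.read(5)
finally:
    os.remove(path)
```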

----------
nosy: +pitrou

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue31530>
_______________________________________