On Thu 11 Jan 2018 22:55, Mark H Weaver <m...@netris.org> writes:

> l...@gnu.org (Ludovic Courtès) writes:
>
>> Mark H Weaver <m...@netris.org> skribis:
>>
>>> l...@gnu.org (Ludovic Courtès) writes:
>>
>> [...]
>>
>>>> +  if (SCM_UNBUFFEREDP (port) && (avail < max_buffer_size))
>>>> +    {
>>>> +      /* PORT is unbuffered.  Read as much as possible from PORT.  */
>>>> +      size_t read;
>>>> +
>>>> +      bv = scm_c_make_bytevector (max_buffer_size);
>>>> +      scm_port_buffer_take (buf, (scm_t_uint8 *) SCM_BYTEVECTOR_CONTENTS (bv),
>>>> +                            avail, cur, avail);
>>>> +
>>>> +      read = scm_i_read_bytes (port, bv, avail,
>>>> +                               SCM_BYTEVECTOR_LENGTH (bv) - avail);
>>>
>>> Here's the R6RS specification for 'get-bytevector-some':
>>>
>>>   "Reads from BINARY-INPUT-PORT, blocking as necessary, until bytes are
>>>   available from BINARY-INPUT-PORT or until an end of file is reached.
>>>   If bytes become available, 'get-bytevector-some' returns a freshly
>>>   allocated bytevector containing the initial available bytes (at least
>>>   one), and it updates BINARY-INPUT-PORT to point just past these
>>>   bytes.  If no input bytes are seen before an end of file is reached,
>>>   the end-of-file object is returned."
>>>
>>> By my reading of this, we should block only if necessary to ensure that
>>> we return at least one byte (or EOF).  In other words, if we can return
>>> at least one byte (or EOF), then we must not block, which means that we
>>> must not initiate another 'read'.
>>
>> Indeed.  So perhaps the condition above should be changed to:
>>
>>   if (SCM_UNBUFFEREDP (port) && (avail == 0))
>>
>> ?
>
> That won't work, because the earlier call to 'scm_fill_input' will have
> already initiated a 'read' if the buffer was empty.  The read buffer
> size will determine the maximum number of bytes read, which will be 1 in
> the case of an unbuffered port.
> So, at the point of this condition,
> 'avail == 0' will occur only if EOF was encountered, in which case you
> must return EOF without attempting another 'read'.
>
> In order to avoid unnecessary blocking, there must be only one 'read'
> call, and it must be initiated only if the buffer was already empty.
>
> So, in order to accomplish your goal here, I don't see how you can use
> 'scm_fill_input', unless you temporarily increase the size of the read
> buffer beforehand.
>
> Instead, I think you need to first check if the read buffer contains any
> bytes.  If so, empty the buffer and return them.  If the buffer is
> empty, the next thing to check is 'scm_port_buffer_has_eof_p'.  If it's
> set, then clear that flag and return EOF.
>
> Otherwise, if the buffer is empty and 'scm_port_buffer_has_eof_p' is
> false, then you must do what 'scm_fill_input' would have done, except
> using your larger buffer instead of the port's internal read buffer.  In
> particular, you must first switch the port to "reading" mode, flushing
> the write buffer if 'rw_random' is set.
>
> Also, I'd prefer to move this code to ports.c in order to avoid adding
> more internal declarations to ports.h and changing more functions from
> 'static' to global functions.
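
As a rough illustration of the sequence Mark describes (return buffered
bytes first, then honor a pending EOF flag, and only otherwise issue a
single read into the larger buffer), here is a toy model in plain C.
Everything in it is hypothetical and is not libguile code: 'struct
toy_port', 'toy_read', 'toy_get_some' and the 'chunk' and 'read_calls'
fields are made up for the sketch.  The 'has_eof' field only imitates
what 'scm_port_buffer_has_eof_p' reports, and the switch to "reading"
mode is left as a comment.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a port; NOT libguile's real layout.  */
struct toy_port
{
  unsigned char buf[1];      /* internal read buffer; size 1 models "unbuffered" */
  size_t cur, end;           /* buffered bytes are buf[cur..end) */
  int has_eof;               /* imitates scm_port_buffer_has_eof_p */
  const unsigned char *src;  /* backing data standing in for the file descriptor */
  size_t src_len, src_pos;
  size_t chunk;              /* most bytes one simulated read may return */
  int read_calls;            /* number of simulated reads issued so far */
};

/* Simulated read: copies at most MAX bytes (and at most P->chunk),
   returning 0 once the backing data is exhausted.  */
static size_t
toy_read (struct toy_port *p, unsigned char *dst, size_t max)
{
  size_t left = p->src_len - p->src_pos;
  size_t n = left < max ? left : max;

  if (n > p->chunk)
    n = p->chunk;
  if (n > 0)
    memcpy (dst, p->src + p->src_pos, n);
  p->src_pos += n;
  p->read_calls++;
  return n;
}

/* Store at least one byte in OUT and return the count, or return 0 for
   EOF.  At most one read is issued, and only if the buffer was empty.  */
static size_t
toy_get_some (struct toy_port *p, unsigned char *out, size_t out_size)
{
  size_t avail = p->end - p->cur;

  assert (out_size > 0);

  if (avail > 0)
    {
      /* Bytes are already buffered: hand them over without reading.  */
      size_t n = avail < out_size ? avail : out_size;
      memcpy (out, p->buf + p->cur, n);
      p->cur += n;
      return n;
    }

  if (p->has_eof)
    {
      /* A previous read saw EOF: clear the flag and report EOF.  */
      p->has_eof = 0;
      return 0;
    }

  /* A real port would first be switched to "reading" mode here,
     flushing the write buffer if 'rw_random' is set.  */

  /* Single read, straight into the caller's larger buffer.  */
  return toy_read (p, out, out_size);
}
```

In this model, a port whose one-byte buffer already holds a byte returns
just that byte and leaves 'read_calls' untouched; only a call that finds
the buffer empty and no pending EOF issues a read, and it issues exactly
one.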

I agree with Mark here -- thanks for the close review.

>>> Out of curiosity, is there a reason why you're using an unbuffered port
>>> in your use case?
>>
>> It’s to implement redirect à la socat:
>>
>>   https://git.savannah.gnu.org/cgit/guix.git/commit/?id=17af5d51de7c40756a4a39d336f81681de2ba447
>
> Why is an unbuffered port being used here?  Can we change it to a
> buffered port?

This was also a question I had!  If you make it a buffered port at 4096
bytes (for example), then get-bytevector-some works exactly like you want
it to, no?

Andy