On 2025-01-14 Paul Eggert wrote:
> On 2025-01-14 02:43, Lasse Collin wrote:
> > Pedantic interpretation might be that EILSEQ isn't allowed in this
> > situation because EOVERFLOW already describes the error condition.  
> 
> No, because EOVERFLOW is intended for things like inode numbers don't 
> fit. It is not intended for things like invalid byte sequences.

I too associate EOVERFLOW with numbers, not strings. Your
interpretation is the most reasonable.

Don't take the following too seriously (feel free to ignore it). :-)
I noticed that a discussion similar to ours happened around glibc's
readdir_r in 2013. While it's a deprecated function, its errno usage
is very similar to readdir. The discussion was about

  - ENAMETOOLONG vs. EOVERFLOW (vs. ERANGE) and which is allowed
    by POSIX,

  - delaying the error until the end of the directory, and

  - if one can continue reading the directory after some errors or if
    it can result in an infinite loop.

It's not a long thread:

    https://sourceware.org/legacy-ml/libc-alpha/2013-05/msg00445.html

From the current glibc manual section '(libc)Reading/Closing Directory':

    To distinguish between an end-of-directory condition or an error,
    you must set ‘errno’ to zero before calling ‘readdir’.  To avoid
    entering an infinite loop, you should stop reading from the
    directory after the first error.

    ...

    • On some systems, ‘readdir_r’ cannot read directory entries
      with very long names.  If such a name is encountered, the GNU
      C Library implementation of ‘readdir_r’ returns with an error
      code of ‘ENAMETOOLONG’ after the final directory entry has
      been read.

Those paragraphs were added as part of the readdir_r fix:

    https://sourceware.org/git/?p=glibc.git;a=commitdiff;h=91ce40854d0b7f865cf5024ef95a8026b76096f3

The current readdir_r code:

    https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/unix/sysv/linux/readdir_r.c;hb=2642002380aafb71a1d3b569b6d7ebeab3284816

It doesn't count the number of problematic files. If my reading is
correct, at the end of the directory the code will keep returning NULL
with errno = ENAMETOOLONG. If so, one shouldn't continue after
ENAMETOOLONG with this implementation. It seems that continuing after an
error needs platform-specific knowledge in practice. (Maybe EOVERFLOW
is an exception.)

This delayed-error behavior isn't needed in glibc's readdir. It exists
only in readdir_r, which one hopefully won't use anyway.
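
For reference, the manual's advice quoted above (clear errno before each
readdir call and stop at the first error) could look roughly like this.
It is only a sketch: count_entries is a hypothetical helper name, and
counting entries is just a stand-in for whatever the caller does with
each entry.

```c
#include <dirent.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical helper: count the entries of a directory, stopping at
   the first readdir error.  Returns the count, or -1 with errno set. */
long count_entries(const char *path)
{
    DIR *dir = opendir(path);
    if (dir == NULL)
        return -1;

    long count = 0;
    for (;;) {
        errno = 0;  /* needed to tell end-of-directory from an error */
        struct dirent *ent = readdir(dir);
        if (ent == NULL)
            break;  /* end of directory or error: stop either way */
        count++;
    }

    int saved = errno;  /* closedir may overwrite errno */
    closedir(dir);
    if (saved != 0) {
        errno = saved;
        return -1;
    }
    return count;
}
```

The loop never calls readdir again after a NULL return, so it cannot
spin on an implementation that keeps reporting the same error.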

> > Other apps on Windows may trip on EILSEQ  
> 
> We can cross those bridges when we come to them.
> 
> There is no perfect solution here. But for 'ls', the only program
> where we know this matters, EILSEQ would be better than EOVERFLOW.

Now I think that it's good enough if readdir on Windows delays EILSEQ
(and ENAMETOOLONG) until the end of the directory and returns that
error once. That is, if readdir is called again, it returns NULL
without modifying errno to prevent infinite loops.

It means that apps cannot count the errors and only one error code is
remembered if two types of errors occur, but it's simpler code. Most
apps stop after the first error anyway. The upside is that then there
is no pressure to make GNU ls continue reading the directory after
EILSEQ. ;-)
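
A rough model of that idea, not actual Windows or Gnulib code: the
struct names, next_name, and the array-backed "directory" are all
hypothetical and only there to show the control flow. Keeping the first
error rather than the last is also an assumption; the point is just that
one error is remembered and reported once.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical model of a directory entry; not a real API. */
struct entry {
    const char *name;  /* NULL if the name could not be converted */
    int err;           /* errno value describing the failure, or 0 */
};

/* Hypothetical model of a directory stream. */
struct dirstream {
    const struct entry *entries;
    size_t count;
    size_t pos;
    int pending_err;   /* first error seen, reported once at the end */
};

/* Return the next readable name, skipping bad entries.  At the end,
   return NULL; errno is set only once if an error was recorded. */
const char *next_name(struct dirstream *d)
{
    while (d->pos < d->count) {
        const struct entry *e = &d->entries[d->pos++];
        if (e->err == 0)
            return e->name;
        if (d->pending_err == 0)
            d->pending_err = e->err;  /* remember only one error */
    }
    if (d->pending_err != 0) {
        errno = d->pending_err;
        d->pending_err = 0;  /* later calls leave errno untouched */
    }
    return NULL;
}
```

A caller that sets errno to zero before each call sees the error on the
first NULL return and a plain end-of-directory on any later call.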

-- 
Lasse Collin
