Crap alert. Seriously. My impression is, whereever you look in
the wide character functions in our libc, it's broken. At least,
i looked at two files of twenty lines each and found a bug in each
of them. I don't believe in luck. I expect more cheap targets.
When errno happens to be EILSEQ upon entry to fgetws(3), and when
the file ends without a terminating L'\n' character, fgetws(3)
discards any characters read and reports bogus EOF.
Consider this test program:
#include <err.h>
#include <locale.h>
#include <stdlib.h>
#include <stdio.h>
#include <wchar.h>
int
main(void)
{
wchar_t ws[16];
setlocale(LC_CTYPE, "en_US.UTF-8");
mbtowc(NULL, "\xff", MB_CUR_MAX);
warn("mbtowc");
if (fgetws(ws, sizeof(ws) / sizeof(*ws), stdin) == NULL)
err(1, "fgetws");
fputws(ws, stdout);
return 0;
}
Output before:
$ echo -n X | ./getws
getws: mbtowc: Illegal byte sequence
getws: fgetws: Illegal byte sequence
Output with the following patch applied:
$ echo -n X | ./getws
getws: mbtowc: Illegal byte sequence
X
Please speak after me:
Never inspect errno except right after an error occurred.
There are more horrors in this twenty line file. Notice how,
in the case above, "goto error;" does NOT result in an error
return, but in reporting EOF. The label name is really
misleading. And notice that in case of an error, fgetws(3)
neither leaves ws alone, nor NUL terminates it. Yesyes,
nobody should inspect it when fgetws(3) returns NULL, but
it's insecure style anyway. But one thing at a time, let's
first fix the outright bug.
OK?
Ingo
Index: fgetws.c
===================================================================
RCS file: /cvs/src/lib/libc/stdio/fgetws.c,v
retrieving revision 1.7
diff -u -p -r1.7 fgetws.c
--- fgetws.c 31 Aug 2015 02:53:57 -0000 1.7
+++ fgetws.c 24 Dec 2015 20:20:31 -0000
@@ -52,9 +52,9 @@ fgetws(wchar_t * __restrict ws, int n, F
wsp = ws;
while (n-- > 1) {
- if ((wc = __fgetwc_unlock(fp)) == WEOF && errno == EILSEQ) {
+ if ((wc = __fgetwc_unlock(fp)) == WEOF &&
+ ferror(fp) && errno == EILSEQ)
goto error;
- }
if (wc == WEOF) {
if (wsp == ws) {
/* EOF/error, no characters read yet. */