On Sat, Sep 05, 2015 at 01:26:18PM +0200, Stefan Sperling wrote:
>
> > +static u_int32_t
> > +decode_utf8(const char *in, const char **nextc, int *had_error)
> > +{
>
> Please make sure this function performs the same validation checks
> as src/lib/libc/citrus/citrus_utf8.c:_citrus_utf8_ctype_mbrtowc()
>
> And if that libc function is missing checks you're doing here we should
> discuss about aligning the two.
>
The libc function is missing a check.
RFC 3629 restricts UTF-8 to code points no greater than 0x10FFFF:
https://tools.ietf.org/html/rfc3629#page-10
Currently, when a C string containing "f7 bf bf bf" is passed to
mbrtowc(3) [with a UTF-8 locale], the function returns 4 and produces a
wchar_t above the 0x10FFFF limit.
With the patch at the end of this mail, the limit is checked and such
input is rejected as invalid (EILSEQ).
Comments? OK?
--
Sebastien Marie
Index: citrus_utf8.c
===================================================================
RCS file: /cvs/src/lib/libc/citrus/citrus_utf8.c,v
retrieving revision 1.8
diff -u -p -r1.8 citrus_utf8.c
--- citrus_utf8.c 16 Jan 2015 16:48:51 -0000 1.8
+++ citrus_utf8.c 5 Sep 2015 12:26:50 -0000
@@ -169,6 +169,13 @@ _citrus_utf8_ctype_mbrtowc(wchar_t * __r
 		errno = EILSEQ;
 		return ((size_t)-1);
 	}
+	if (wch > 0x10ffff) {
+		/*
+		 * Malformed input; invalid code points.
+		 */
+		errno = EILSEQ;
+		return ((size_t)-1);
+	}
 	if (pwc != NULL)
 		*pwc = wch;
 	us->want = 0;