On Jul 23 05:44, Thomas Wolff via Cygwin wrote:
> OK, suppose I'd consider to switch to mbs[[n]r]towcs, collecting bytes until
> the function gives me a result.
> This would work fine as long as I receive only valid sequences. But look at
> input string test case
> char nonbmp[] = {0xF8, 0x88, 0x8A, 0xAF, 0x2D, 0}; // an invalid sequence
> followed by a valid char
> The functions only return -1 and (in the case of mbsnrtowcs) do not advance
> the input pointer.
> So how am I supposed to recognize that the invalid sequence has ended and a
> valid character has arrived?

Yeah, I see the problem.  One of the slightly puzzling behaviours of
mbsnrtowcs is the fact that the src pointer stays at the start of the
invalid sequence.  I think the idea is to skip the invalid sequence
byte-wise until mbsnrtowcs reports a valid sequence again.

What bugs me is that we have the choice between a broken mbrtowc on one
side and a chance to generate broken filenames on the other side.

I think we should actually revert fa272e05bbd0 ("wcstombs: also call
__WCTOMB on terminating NUL if output buffer is NULL") and see if we can
fix the filename issue in the Cygwin functions for filename conversion
alone.

Any ideas appreciated.

Corinna

--
Problem reports:      https://cygwin.com/problems.html
FAQ:                  https://cygwin.com/faq/
Documentation:        https://cygwin.com/docs.html
Unsubscribe info:     https://cygwin.com/ml/#unsubscribe-simple
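
P.S.: To illustrate the byte-wise skipping idea, here is a minimal sketch
using mbrtowc.  It is not a proposed patch.  The helper names
use_utf8_locale and decode_skipping_invalid are made up for this example,
and it assumes a UTF-8 locale such as "C.UTF-8" is available.  On an
invalid sequence it resets the conversion state, drops exactly one input
byte, and tries again.

```c
#include <locale.h>
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: try to select a UTF-8 locale.  The locale names
   are assumptions; returns 1 on success, 0 if neither is available. */
static int use_utf8_locale(void)
{
    return setlocale(LC_ALL, "C.UTF-8") != NULL
        || setlocale(LC_ALL, "en_US.UTF-8") != NULL;
}

/* Hypothetical helper: convert up to len bytes from src into at most
   dstlen wide characters in dst.  Bytes that cannot start a valid
   sequence are skipped one at a time and counted in *skipped (if not
   NULL).  Returns the number of wide characters stored. */
static size_t decode_skipping_invalid(const char *src, size_t len,
                                      wchar_t *dst, size_t dstlen,
                                      size_t *skipped)
{
    mbstate_t st;
    size_t in = 0, out = 0, dropped = 0;

    memset(&st, 0, sizeof st);
    while (in < len && out < dstlen) {
        wchar_t wc;
        size_t n = mbrtowc(&wc, src + in, len - in, &st);

        if (n == (size_t)-1) {
            /* Invalid sequence: the state is unspecified now, so reset
               it, drop one byte and retry at the next byte. */
            memset(&st, 0, sizeof st);
            in++;
            dropped++;
            continue;
        }
        if (n == (size_t)-2) {
            /* Incomplete sequence at the end of the input: nothing more
               can be converted until further bytes arrive. */
            break;
        }
        if (n == 0)
            n = 1;          /* a NUL character consumed one byte */
        dst[out++] = wc;
        in += n;
    }
    if (skipped != NULL)
        *skipped = dropped;
    return out;
}
```

With the test string from the quoted message, and in a UTF-8 locale that
rejects 0xF8 as a lead byte, I would expect this to skip the four bytes
0xF8 0x88 0x8A 0xAF and then deliver the '-'.  The price of this approach
is that one broken multibyte sequence is reported as several skipped
bytes rather than as a single invalid character.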