On Wed, Jun 03, 2009 at 12:55:57PM -0400, Edward Lam wrote:
>Corinna Vinschen wrote:
>> On Jun 3 12:02, Christopher Faylor wrote:
>>> On Wed, Jun 03, 2009 at 04:27:55PM +0200, Corinna Vinschen wrote:
>>>> On Jun 3 09:18, Edward Lam wrote:
>>>>> Corinna Vinschen wrote:
>>>>>> The question is, what do you expect? [...]
>>>>> [...]
>>>>> Wikipedia has several suggestions on how to handle invalid UTF-8 byte
>>>>> sequences (http://en.wikipedia.org/wiki/UTF-8). Personally, I favor the
>>>>> rule that uses the replacement character.
>>>> Chris implemented using the invalid code point solution. The discussion
>>>> in http://www.mail-archive.com/linux-u...@nl.linux.org/msg00080.html
>>>> supports this solution. What's missing so far is the way back, from
>>>> an invalid single second half of a surrogate pair in the 0xDCxx range
>>>> back to the correct byte value. I'm just looking into that.
>>> The way back was not, AFAIK, needed for Cygwin programs. I don't think
>>> there is a valid way back for Windows programs.
>>
>> The way back is not needed for the argv handling in Cygwin, but it
>> gets necessary if you converted to UTF-16 in other circumstances.
>> It's not much of a problem since the way back is a no-brainer, in
>> contrast to the conversion to UTF-16.
>
>What is the current state of affairs in cygwin 1.7.0-48? Is the invalid
>code point solution currently being used when converting the command
>line to UTF-16 when spawning non-cygwin processes? What I'm trying to
>understand is where the command line truncation is taking place, in the
>parent or child process.
>
>If the truncation is happening in the child process because of the
>invalid code point, then perhaps we should consider using the
>replacement character solution when spawning non-cygwin child processes.
>IMHO, having a bad character is better than having a truncated command
>line. At least, the problem (invalid UTF-8) then becomes more obvious.

As Corinna said above: "Chris implemented using the invalid code point
solution"

That's what is in Cygwin's CVS and in the latest snapshot.

cgf

--
Unsubscribe info: http://cygwin.com/ml/#unsubscribe-simple
Problem reports: http://cygwin.com/problems.html
Documentation: http://cygwin.com/docs.html
FAQ: http://cygwin.com/faq/
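
The two strategies discussed in this thread can be illustrated with a minimal Python sketch. It uses Python's built-in `surrogateescape` and `replace` error handlers, which behave analogously to the "invalid code point" and "replacement character" approaches. It is not Cygwin's conversion code, and the sample byte string is made up for illustration.

```python
# Illustration only: Python's built-in codec error handlers, not Cygwin's code.

# Hypothetical sample input: the byte 0xE9 on its own is not valid UTF-8.
raw = b"caf\xe9 arg"

# "Invalid code point" approach: each undecodable byte 0xNN is mapped to the
# lone low surrogate U+DCNN, so no information is lost.
escaped = raw.decode("utf-8", errors="surrogateescape")

# The way back: every U+DCNN is turned into the original byte 0xNN again.
round_trip = escaped.encode("utf-8", errors="surrogateescape")

# "Replacement character" approach: each undecodable byte becomes U+FFFD.
# The result is valid Unicode, but the original byte cannot be recovered.
replaced = raw.decode("utf-8", errors="replace")

print(ascii(escaped))
print(round_trip == raw)
print(ascii(replaced))
```

The trade-off matches the one raised above: the surrogate mapping allows a lossless way back but hands consumers an unpaired surrogate, which some may reject or mishandle, while the replacement character is always valid but discards the original byte.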