strftime %b is broken on ja_JP locale

2010-05-12 Thread IWAMURO Motonori
Hi. strftime %b is broken in the ja_JP locale on cygwin-1.7.5-1. [monthtest.c] #include <locale.h> #include <stdio.h> #include <time.h> int main(void) { time_t now; struct tm *tm; char buffer[4096]; setlocale(LC_ALL, "ja_JP.UTF-8"); time(&now); tm = localtime(&now); strftime(buffer, sizeof(buffer), "[%B]

Re: The C locale

2009-09-29 Thread IWAMURO Motonori
2009/9/29 Corinna Vinschen : > I asked if the default charset for the japanese language should be set > to EUCJP rather than SJIS.  The actual implementation would have been > like this > >  if (lang="xx or lang="xx_XX" with x in [a-z] and X in [A-Z]?) >    set_charset_from_codepage() > >  set_char

Re: The C locale

2009-09-29 Thread IWAMURO Motonori
2009/9/29 Corinna Vinschen : > The downside is that a user, who needs to work under the default ANSI > codepage for some reason, has to know the name of the default ANSI > codepage. If the problem is a problem of 1.5->1.7 migration, how about building in the wizard which sets the locale environmen

Re: The C locale

2009-09-29 Thread IWAMURO Motonori
2009/9/29 : > Also the following be suitable if possible.. > LANG=ja -> iso-2022-jp > LANG=ja_JP -> iso-2022-jp Hmmm, I think that is unrealistic. -- IWAMURO Motonori -- Problem reports: http://cygwin.com/problems.html FAQ: http://cygwin.com/faq/

Re: The C locale

2009-09-28 Thread IWAMURO Motonori
2009/9/27 IWAMURO Motonori : >> LANG="ja" -> EUCJP >> LANG="ja_JP" -> EUCJP > > Hmmm, It is a difficult problem. > > I think selecting UTF-8 is good because eucJP is legacy. > > But, for interoperability with other UNIX-like system(*), I

Re: The C locale

2009-09-26 Thread IWAMURO Motonori
2009/9/24 Corinna Vinschen : > My question is this:  Is the S-JIS implementation on UNIX systems > also using a different implementation to avoid using characters > from the ASCII range?  If so, can't we change the __sjis_wctomb > and __sjis_mbtowc functions in the same manner as the __eucjp_wctomb

Re: The C locale

2009-09-26 Thread IWAMURO Motonori
Hi. > the default ANSI and OEM codepage on Japanese Windows systems is > 932/SJIS, right? Yes. > LANG="C" -> UTF-8 (snip) > LANG="ja_JP.SJIS" -> SJIS It's good. > LANG="ja" -> EUCJP > LANG="ja_JP" -> EUCJP Hmmm, It is a difficult problem. I think selecting UTF-8 is good because eucJP is lega

Re: The C locale

2009-09-24 Thread IWAMURO Motonori
2009/9/24 Corinna Vinschen : > On Sep 24 16:03, IWAMURO Motonori wrote: >> 2009/9/22 Andy Koppe : >> > Let's use the Windows "ANSI" codepage as the character set for the C >> > locale, for both the conversion functions and filenames. This means >>

Re: The C locale

2009-09-24 Thread IWAMURO Motonori
2009/9/22 Andy Koppe : > Let's use the Windows "ANSI" codepage as the character set for the C > locale, for both the conversion functions and filenames. This means > CP1252 on Western systems, CP1251 on Cyrillic ones, CP932 on Japanese > ones, and so on. I oppose the approach (the ANSI codepage is

Re: The C locale

2009-09-02 Thread IWAMURO Motonori
Hi. 2009/9/2 Andy Koppe : > I see two good solutions: > - Use the default Windows codepage for filenames, console, and > multibyte functions. This is what happens already if you specify a > locale with a language but no charset, e.g. "en". Maximum 1.5 > compatibility. > - Use UTF-8 throughout. Fu

Re: [PATCH] Add "@cjknarrow" modifier (was Re: [Fwd: [1.7] wcwidth failing configure tests])

2009-06-27 Thread IWAMURO Motonori
Hi. 2009/6/27 Andy Koppe : > And then there's the Linux compatibility angle, where ja_JP.UTF-8 > means ambiguous width 1 not 2. Please don't judge it based on the behavior of current Linux, because: - I don't think the behavior is correct. - I am currently creating a patch for the problem. --

Re: [PATCH] Add "@cjknarrow" modifier (was Re: [Fwd: [1.7] wcwidth failing configure tests])

2009-06-15 Thread IWAMURO Motonori
OK. I withdraw my proposal. 2009/6/16 Corinna Vinschen : > On Jun 15 23:35, IWAMURO Motonori wrote: >> 2009/6/15 Corinna Vinschen: >> > If everybody agrees to this suggestion, here's the patch. >> >> Is the name of modifier prefix "cjk-" good? It i

Re: [PATCH] Add "@cjknarrow" modifier (was Re: [Fwd: [1.7] wcwidth failing configure tests])

2009-06-15 Thread IWAMURO Motonori
2009/6/15 Corinna Vinschen : >> Yes, but the guideline exists. >> http://cygwin.com/ml/cygwin/2009-05/msg00444.html > > A single mail in a single mailing list of a single project.  That's rather > a suggestion than a guideline... Sorry, my writing was bad. My quotation is a part of Unicode Standar

Re: [Fwd: [1.7] wcwidth failing configure tests]

2009-06-14 Thread IWAMURO Motonori
2009/6/13 Corinna Vinschen : >> I'm not sure which "standard" you are referring to. > > The problem appears to be that there is no standard for the handling > of ambiguous characters. Yes, but the guideline exists. http://cygwin.com/ml/cygwin/2009-05/msg00444.html > 2) Unicode Standard Annex #11 >

Re: [Fwd: [1.7] wcwidth failing configure tests]

2009-06-14 Thread IWAMURO Motonori
2009/6/13 Thomas Wolff : > I have checked source data files in /usr/share/i18n/charmaps on my Linux > system, e.g. "UTF-8.gz". > character widths are the same for all locales with the same "charmap". It was reported as a bug, but it isn't fixed now...X-( http://sourceware.org/bugzilla/show_bug.

Re: [Fwd: [1.7] wcwidth failing configure tests]

2009-06-06 Thread IWAMURO Motonori
2009/6/6 Corinna Vinschen : > I vote for @cjkwide, regardless of Andy's objection.  People using CJK > will know the meaning and it has the additional advantage to be a rather > simple to memorize identifier. I oppose @cjkwide approach because I don't think that I need make special cases give prio

Re: [Fwd: [1.7] wcwidth failing configure tests]

2009-06-06 Thread IWAMURO Motonori
2009/6/6 Andy Koppe : > However, to make the locale setting more convenient for CJK users, > there could be modifiers for both widths. Without modifier, the CJK > locales would default to "Ambiguous Wide", while everything else would > default to "Ambiguous Narrow". It is acceptable for me. > Puz

Re: [Fwd: [1.7] wcwidth failing configure tests]

2009-06-06 Thread IWAMURO Motonori
# Continuation of discussion. # # I hope that all applications work correctly just by setting "LANG=ja_JP.UTF-8". # I don't want to give up using the binary packages or to keep applying many local patches. > I don't think that it is the good idea because: > > - It is "a cygwin-s

Re: [Fwd: [1.7] wcwidth failing configure tests]

2009-06-06 Thread IWAMURO Motonori
I oppose your proposal because I think that it is useless for us. 2009/6/6 Thomas Wolff : > the intention is that the "codepage" information should be the same > for all locales having the "UTF-8" (or any other) charmap. So you > cannot freely change width information among locales with the same

[1.7][BUG] MOJIBAKE title bar

2009-06-04 Thread IWAMURO Motonori
Hi. The title bar is MOJIBAKE when the following 'wintitle.sh' works on command prompt in the UTF-8 environment (for example: LANG=ja_JP.UTF-8). http://vmi.jp/tmp/wintitle.sh http://vmi.jp/tmp/01good-mintty.png is the good result on MinTTY. http://vmi.jp/tmp/02bad-cmd.png is the bad result on co

[1.7][BUG] winsup/cygwin/strfuncs.cc

2009-06-03 Thread IWAMURO Motonori
Hi. I found a trivial bug. *pmbs is unsigned char. '\x80' is -128 because it is a char literal (not unsigned char), so "*pmbs > '\x80'" is always true. # Shouldn't the comparison be ">= 0x80" rather than "> 0x80"? --- winsup/cygwin/strfuncs.cc 31 May 2009 03:59:38 - 1.30 +++ winsup/cygwin/strfuncs.cc 3 J

Re: 1.7.0-48: [BUG] Passing characters above 128 from bash command line

2009-06-03 Thread IWAMURO Motonori
And, I think that UTF-8 is the best solution when the setting of the LC_CTYPE category is C. 2009/6/4 IWAMURO Motonori : > I think that this problem is caused by missing setting the locale > environment variable. > Therefore, I think that the problem can be solved by compelling the >

Re: 1.7.0-48: [BUG] Passing characters above 128 from bash command line

2009-06-03 Thread IWAMURO Motonori
FU > > On Jun  4 00:03, IWAMURO Motonori wrote: >> 2009/6/3 Corinna Vinschen >> > What's left as questionable is the LANG=C default case.  Due to the >> > discussion from the last month we now use UTF-8 as default encoding, >> > because it's the only e

Re: 1.7.0-48: [BUG] Passing characters above 128 from bash command line

2009-06-03 Thread IWAMURO Motonori
Hi. How about the addition of the setting of the locale environment variable (like LANG) to the Cygwin installer? 2009/6/3 Corinna Vinschen : > On Jun  3 09:18, Edward Lam wrote: >> Corinna Vinschen wrote: >>> The question is, what do you expect?  [...] >> [...] >> Wikipedia has several suggestio

Re: 1.7.0-48: [BUG] Passing characters above 128 from bash command line

2009-05-29 Thread IWAMURO Motonori
I think that you should set "export LANG=en_US.ISO-8859-1" instead of "export LANG=LANG=en_US.ISO-8859-1". 2009/5/30 Edward Lam : > IWAMURO Motonori wrote: >> >> The encoding of C locale is ASCII, and not ISO-8859-1. >> I don't think ASCII is t

Re: 1.7.0-48: [BUG] Passing characters above 128 from bash command line

2009-05-29 Thread IWAMURO Motonori
Hi. The encoding of C locale is ASCII, and not ISO-8859-1. I don't think ASCII is the same as ISO-8859-1. Does it work on LANG=en_US.ISO-8859-1? 2009/5/29 Edward Lam : > Alexey Borzenkov wrote: >> >> On Thu, May 28, 2009 at 7:28 PM, Edward Lam wrote: >>> >>> PS. In case you haven't noticed, copy

Re: [1.7] wprintf is broken?

2009-05-27 Thread IWAMURO Motonori
Sorry, my report was incorrect: according to the specification, wide-character I/O and byte (multibyte-character) I/O must not be mixed on the same stream (see the manual for the fwide() function). 2009/5/17 Corinna Vinschen : > On May 16 23:56, IWAMURO Motonori wrote: >> Hi. >> >> wprintf is broken? >

Re: [Fwd: [1.7] wcwidth failing configure tests]

2009-05-26 Thread IWAMURO Motonori
I correct my proposal. 2009/5/15 IWAMURO Motonori : > I propose to use *_cjk() when the language part of LC_CTYPE > is 'ja', 'ko', 'vi' or 'zh'. LC_CTYPE is 'ja', 'ko', or 'zh'. I remove 'vi'. (advice from a Ne

Re: [Fwd: [1.7] wcwidth failing configure tests]

2009-05-20 Thread IWAMURO Motonori
2009/5/21 Thomas Wolff : >> > Therefore, I propose to use *_cjk() when the language part of LC_CTYPE >> > is 'ja', 'ko', 'vi' or 'zh'. > The problem with this is > 1. As you say, there is no standard. But, - I think that my proposal doesn't violate any specification. - I heard that there is an exi

Re: [1.7] Proposal: the filename encoding in C locale uses UTF-8 instead of SO/UTF-8

2009-05-16 Thread IWAMURO Motonori
2009/5/17 Lenik : > Thanks, but where can I get this patch? You can check it out from CVS HEAD. -- IWAMURO Motonori

[1.7] wprintf is broken?

2009-05-16 Thread IWAMURO Motonori
Hi. Is wprintf broken? I compiled and ran the following source: #include <locale.h> #include <stdio.h> #include <wchar.h> int main(void) { setlocale(LC_ALL, "en_US.UTF-8"); wprintf(L"%ls\n", L"Test\n"); printf("Test\n"); return 0; } Result text: http://vmi.jp/tmp/wprintf-is-broken.txt -- IWAMURO Motonori

Re: [1.7] Proposal: the filename encoding in C locale uses UTF-8 instead of SO/UTF-8

2009-05-15 Thread IWAMURO Motonori
2009/5/15 Corinna Vinschen : > I have just trouble with SJIS, but that's not something I can easily > test. Maybe you can look into that in the next couple of days? Maybe I can. Please explain the details of the trouble. -- IWAMURO Motonori

Re: [1.7] Proposal: the filename encoding in C locale uses UTF-8 instead of SO/UTF-8

2009-05-14 Thread IWAMURO Motonori
2009/5/15 Corinna Vinschen : > Here's one problem.  What if an application uses setenv("LANG", ...)? Oh. Hmmm, I think that nothing should happen in that case. > Do you want Cygwin to intercept all calls to setenv() to check for > setting $LC_ALL/LC_CTYPE/LANG? No. I think that only setlocale() has to do

Re: [Fwd: [1.7] wcwidth failing configure tests]

2009-05-14 Thread IWAMURO Motonori
2009/5/13 Corinna Vinschen : >> http://www.cl.cam.ac.uk/~mgk25/ucs/wcwidth.c > > This looks nice. Do you import Markus Kuhn's wcwidth implementation? >> Trouble is, there's the thorny issue of the "CJK Ambiguous Width" >> category of characters, which consists of things like Greek and >> Cyrillic

Re: [1.7] Proposal: the filename encoding in C locale uses UTF-8 instead of SO/UTF-8

2009-05-14 Thread IWAMURO Motonori
2009/5/14 Corinna Vinschen : > I see a couple of potential problems. What problems are those? > And have some time to discuss whether these are something the > user can or even should fix or workaround alone. I think that the application that use locale by the environment variable and the applic

Re: [1.7] Proposal: the filename encoding in C locale uses UTF-8 instead of SO/UTF-8

2009-05-14 Thread IWAMURO Motonori
2009/5/14 Corinna Vinschen : >> > Should the following part not be modified? >> > >> > winsup/cygwin/fhandler_console.cc: >> > > dev_state->con_mbtowc = __mbtowc; >> > > dev_state->con_wctomb = __wctomb; >> >> I'd rather not.  It only affects the console and if LANG=C I'd rather >> see the single b

Re: [1.7] Proposal: the filename encoding in C locale uses UTF-8 instead of SO/UTF-8

2009-05-13 Thread IWAMURO Motonori
2009/5/14 Corinna Vinschen : > I already wrote that patch, see > http://cygwin.com/ml/cygwin-cvs/2009-q2/msg00066.html > It seems to do what you are proposing. I read it and built cygwin1.dll. It seems to work correctly. Should the following part not be modified? winsup/cygwin/fhandler_console.c

Re: [1.7] Proposal: the filename encoding in C locale uses UTF-8 instead of SO/UTF-8

2009-05-13 Thread IWAMURO Motonori
2009/5/14 Corinna Vinschen : > That's basically how my patch works. Sorry, I can't parse this sentence because of my poor English parser... Are you writing the patch for this problem? > Btw., if you plan to write more and bigger patches for Cygwin, it would > be necessary to sign a copyright as

Re: [1.7] Proposal: the filename encoding in C locale uses UTF-8 instead of SO/UTF-8

2009-05-13 Thread IWAMURO Motonori
coding while processing, I think that the problem is a responsibility of the application. 2009/5/13 Corinna Vinschen : > On May 12 19:37, Corinna Vinschen wrote: >> On May 13 02:29, IWAMURO Motonori wrote: >> > I propose that the filename encoding in C locale uses UTF-8 instead of

[1.7] Proposal: the filename encoding in C locale uses UTF-8 instead of SO/UTF-8

2009-05-12 Thread IWAMURO Motonori
Hi. I propose that the filename encoding in C locale uses UTF-8 instead of SO/UTF-8. There are three reasons: 1. for the interoperability between Cygwin and various UNIX-like systems (Linux, *BSD, Solaris, and so on). UNIX-like systems treat the filename as 8bit byte array, and many applicati

Re: [1.7][python] File operation API to multibyte filenames fails.

2009-05-08 Thread IWAMURO Motonori
2009/5/9 Corinna Vinschen : > Cool. Thanks for the patch. This actually solves the problem. > I applied the patch with just a little tweak. Thanks. The following patch might be better. --- a/winsup/cygwin/strfuncs.cc Thu May 07 12:29:17 2009 +0900 +++ b/winsup/cygwin/strfuncs.cc Sat May 09 04:

Re: [1.7][python] File operation API to multibyte filenames fails.

2009-05-08 Thread IWAMURO Motonori
Sorry, the test code was wrong. - printf("%d\n", ent->d_name, errno); + printf("%d\n", errno); -- IWAMURO Motonori

Re: [1.7][python] File operation API to multibyte filenames fails.

2009-05-08 Thread IWAMURO Motonori
2009/5/9 Corinna Vinschen : > can't see a fault in Cygwin. Neither from strace, nor in a GDB session. > The readdir calls return the filenames using the SO sequences so that > a valid byte-stream is created which also works in the C locale. > However, for some reason there's a EILSEQ (138) errno ge

Re: [1.7][python] File operation API to multibyte filenames fails.

2009-05-08 Thread IWAMURO Motonori
Hi. 2009/5/8 Corinna Vinschen : > Your scripts.  Python correctly doesn't use setlocale because it's > the responsibility of the application to set the local if it uses > non-ASCII chars.  And Cygwin simply has no chance to convert UTF-8 > to UTF-16 if the application doesn't ask for UTF-8. Oh, i

[1.7][python] File operation API to multibyte filenames fails.

2009-05-08 Thread IWAMURO Motonori
Hi. File operation APIs fail on multibyte filenames with Python on Cygwin-1.7. Which should be fixed, Python or Cygwin-1.7? My environment: Windows XP SP3, Cygwin-1.7.0-46, and LANG=ja_JP.UTF-8 The following code fails on the directory which has multibyte filenames: >>> import os >>> os.listdir("

[1.7] cygstart with non-ASCII arguments and UTF-8 locale don't work.

2009-04-27 Thread IWAMURO Motonori
Hi. cygstart with non-ASCII arguments and a UTF-8 locale doesn't work on cygwin-1.7.0. > ls -l total 1 -rw-rw-r-- 1 iwa None 7 Apr 28 00:22 αβγ.txt > cygstart αβγ.txt Unable to start 'C:\cygwin-1.7\tmp\αβγ.txt': The specified file was not found. -- IWAMURO Motonori cygstart.patch D