Re: readdir() returns inaccessible name if file was created with invalid UTF-8

2024-09-19 Thread Brian Inglis via Cygwin
Wolff via Cygwin wrote: On 15.09.2024 at 20:15, Thomas Wolff via Cygwin wrote: On 15.09.2024 at 19:47, Christian Franke via Cygwin wrote: If a file name contains an invalid (truncated) UTF-8 sequence, open() does not refuse to create the file. Later readdir() returns a different name which could

Re: readdir() returns inaccessible name if file was created with invalid UTF-8

2024-09-19 Thread Christian Franke via Cygwin
Thomas Wolff via Cygwin: On 15.09.2024 at 19:47, Christian Franke via Cygwin wrote: If a file name contains an invalid (truncated) UTF-8 sequence, open() does not refuse to create the file. Later readdir() returns a different name which could not be used to access the file. Testcase with U+1F321

Re: readdir() returns inaccessible name if file was created with invalid UTF-8

2024-09-19 Thread Jeremy Drake via Cygwin
On Thu, 19 Sep 2024, Brian Inglis via Cygwin wrote: > On 2024-09-19 07:27, Christian Franke via Cygwin wrote: > > > > > > Yes, but Cygwin does not provide consistent forward/reverse UTF-8 <-> UTF-16 > > mappings. > > Surrogates halves are invalid for

Re: readdir() returns inaccessible name if file was created with invalid UTF-8

2024-09-19 Thread Cedric Blancher via Cygwin
ian Franke via Cygwin wrote: > >>>> Thomas Wolff via Cygwin wrote: > >>>>> On 15.09.2024 at 20:15, Thomas Wolff via Cygwin wrote: > >>>>>> On 15.09.2024 at 19:47, Christian Franke via Cygwin wrote: > >>>>>>> If a file

Re: readdir() returns inaccessible name if file was created with invalid UTF-8

2024-09-19 Thread Brian Inglis via Cygwin
15.09.2024 at 19:47, Christian Franke via Cygwin wrote: If a file name contains an invalid (truncated) UTF-8 sequence, open() does not refuse to create the file. Later readdir() returns a different name which could not be used to access the file. Testcase with U+1F321 (Thermometer): $ uname -r 3.5.4-1

Re: readdir() returns inaccessible name if file was created with invalid UTF-8

2024-09-19 Thread Christian Franke via Cygwin
file name contains an invalid (truncated) UTF-8 sequence, open() does not refuse to create the file. Later readdir() returns a different name which could not be used to access the file. Testcase with U+1F321 (Thermometer): $ uname -r 3.5.4-1.x86_64 $ printf $'\U0001F321' | od -A none -t

Re: readdir() returns inaccessible name if file was created with invalid UTF-8

2024-09-17 Thread Mark Liam Brown via Cygwin
n Franke via Cygwin: > >>>> If a file name contains an invalid (truncated) UTF-8 sequence, open() > >>>> does not refuse to create the file. Later readdir() returns a > >>>> different name which could not be used to access the file. > >>

Re: readdir() returns inaccessible name if file was created with invalid UTF-8

2024-09-16 Thread Christian Franke via Cygwin
Christian Franke via Cygwin wrote: Thomas Wolff via Cygwin wrote: On 15.09.2024 at 20:15, Thomas Wolff via Cygwin wrote: On 15.09.2024 at 19:47, Christian Franke via Cygwin wrote: If a file name contains an invalid (truncated) UTF-8 sequence, open() does not refuse to create the file

Re: readdir() returns inaccessible name if file was created with invalid UTF-8

2024-09-15 Thread Christian Franke via Cygwin
Thomas Wolff via Cygwin wrote: On 15.09.2024 at 20:15, Thomas Wolff via Cygwin wrote: On 15.09.2024 at 19:47, Christian Franke via Cygwin wrote: If a file name contains an invalid (truncated) UTF-8 sequence, open() does not refuse to create the file. Later readdir() returns a different name

Re: readdir() returns inaccessible name if file was created with invalid UTF-8

2024-09-15 Thread Thomas Wolff via Cygwin
On 15.09.2024 at 20:15, Thomas Wolff via Cygwin wrote: On 15.09.2024 at 19:47, Christian Franke via Cygwin wrote: If a file name contains an invalid (truncated) UTF-8 sequence, open() does not refuse to create the file. Later readdir() returns a different name which could not be used to

Re: readdir() returns inaccessible name if file was created with invalid UTF-8

2024-09-15 Thread Thomas Wolff via Cygwin
On 15.09.2024 at 19:47, Christian Franke via Cygwin wrote: If a file name contains an invalid (truncated) UTF-8 sequence, open() does not refuse to create the file. Later readdir() returns a different name which could not be used to access the file. Testcase with U+1F321 (Thermometer

readdir() returns inaccessible name if file was created with invalid UTF-8

2024-09-15 Thread Christian Franke via Cygwin
If a file name contains an invalid (truncated) UTF-8 sequence, open() does not refuse to create the file. Later readdir() returns a different name which could not be used to access the file. Testcase with U+1F321 (Thermometer): $ uname -r 3.5.4-1.x86_64 $ printf $'\U0001F321' | od
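
The truncated sequence described in this report can be reproduced at the byte level. The sketch below is only an illustration using Python's strict UTF-8 decoder; it does not model Cygwin's own file name conversion, and the helper name `is_valid_utf8` is made up for this example.

```python
# Illustration only: Python's strict decoder, not Cygwin's conversion code.
THERMOMETER = "\U0001F321"

encoded = THERMOMETER.encode("utf-8")  # complete 4-byte sequence
truncated = encoded[:-1]               # last continuation byte dropped


def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes as strict UTF-8 (hypothetical helper)."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


print(encoded.hex(" "))  # f0 9f 8c a1
print(is_valid_utf8(encoded), is_valid_utf8(truncated))
```

A strict decoder rejects the three-byte prefix outright, which is the behaviour the report suggests open() might be expected to have.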

Re: Unconsistent command-line parsing in case of UTF-8 quoted arguments

2020-10-18 Thread Kaz Kylheku (Cygwin) via Cygwin
On 2020-10-14 14:47, Jérôme Froissart wrote: The choice of GetCommandLineA was for illustration purposes; had I used GetCommandLineW I would not be able to printf using %ls under CMD.EXE, because of code page issues. However here is a modified version of the test program that uses GetCommandLineW

Re: UTF-8 quoted args passed to program include quotes when run from cmd

2020-10-14 Thread Brian Inglis
rôme" >>> Now, let's start a Windows shell (cmd.exe) >>> Note that I had to copy cygwin1.dll from my Cygwin installation >>> directory, otherwise binary.exe would not start. >>> I do not know whether there is a `locale` equivalent in Windows >>>

Re: Unconsistent command-line parsing in case of UTF-8 quoted arguments

2020-10-14 Thread Jérôme Froissart
On Wed, 14 Oct 2020 at 23:47, Jérôme Froissart wrote: > However, there is still a question that is puzzling me. I now > understand _why_ things happen that way, but I am still wondering > whether this is really what we _want_. I mean, keeping the double > quotes around an UTF-8 a

Re: Unconsistent command-line parsing in case of UTF-8 quoted arguments

2020-10-14 Thread Jérôme Froissart
really what we _want_. I mean, keeping the double quotes around an UTF-8 argument just because it is not run from Cygwin's bash sounds like a bug for me, doesn't it? (yet I definitely understand the reasons that explain this behaviour). Since I cannot run my program from bash, I have to

Re: Unconsistent command-line parsing in case of UTF-8 quoted arguments

2020-10-13 Thread Brian Inglis
On 2020-10-06 15:36, Jérôme Froissart wrote: > Here are the more detailed steps to reproduce the issue (along with > answers to your requests about `uname`, `locale`, etc.). > (I mostly reproduced what billziss-gh had done before, I do not take > all the credits :D) > > Here is an example C file >

Re: Unconsistent command-line parsing in case of UTF-8 quoted arguments

2020-10-13 Thread Kaz Kylheku (Cygwin) via Cygwin
On 2020-10-06 14:36, Jérôme Froissart wrote: Here is an example C file $ cat example.c #include <stdio.h> const char *GetCommandLineA(void); int main(int argc, char *argv[]) { const char *s = GetCommandLineA(); printf("C=%s\n", s); for (int i = 0; argc > i; i

Re: Unconsistent command-line parsing in case of UTF-8 quoted arguments

2020-10-11 Thread Andrey Repin via Cygwin
ilt from Cygwin, but it is then used > as a standalone executable, without any GUI. It is called by a Windows > component/driver (with a command line that contains quoted UTF-8 > arguments), invoked by some clicks and actions from the 'My computer' > window. What could I do so

Re: Unconsistent command-line parsing in case of UTF-8 quoted arguments

2020-10-07 Thread Brian Inglis
On 2020-10-07 18:59, Eliot Moss wrote: > I think what we mean is that, under Windows cmd, some things the shell does > for > you under Linux and Cygwin will not have been done.  For example, there is > "glob" expansion of filenames.  If I write *.txt under bash, it gets expanded > to > a space-se

Re: Unconsistent command-line parsing in case of UTF-8 quoted arguments

2020-10-07 Thread Eliot Moss
I think what we mean is that, under Windows cmd, some things the shell does for you under Linux and Cygwin will not have been done. For example, there is "glob" expansion of filenames. If I write *.txt under bash, it gets expanded to a space-separated list of names of files that match that pat

Re: Unconsistent command-line parsing in case of UTF-8 quoted arguments

2020-10-07 Thread Brian Inglis
On 2020-10-06 23:17, Thomas Wolff wrote: > > > On 06.10.2020 at 23:36, Jérôme Froissart wrote: >> Thanks for your replies. >> This issue only happens when a program is run from cmd.exe, not from a >> Cygwin bash shell. >> This is important for me, since I discovered this bug in a project >> that

Re: Unconsistent command-line parsing in case of UTF-8 quoted arguments

2020-10-07 Thread Jérôme Froissart
utable, without any GUI. It is called by a Windows component/driver (with a command line that contains quoted UTF-8 arguments), invoked by some clicks and actions from the 'My computer' window. What could I do so that this program correctly handles the command line? > 2. Then you are

Re: Unconsistent command-line parsing in case of UTF-8 quoted arguments

2020-10-06 Thread Thomas Wolff
On 06.10.2020 at 23:36, Jérôme Froissart wrote: Thanks for your replies. This issue only happens when a program is run from cmd.exe, not from a Cygwin bash shell. This is important for me, since I discovered this bug in a project that must be run from Windows graphical shell (i.e. there is no

Re: Unconsistent command-line parsing in case of UTF-8 quoted arguments

2020-10-06 Thread Brian Inglis
On 2020-10-06 15:36, Jérôme Froissart wrote: > Thanks for your replies. > This issue only happens when a program is run from cmd.exe, not from a > Cygwin bash shell. > This is important for me, since I discovered this bug in a project > that must be run from Windows graphical shell (i.e. there is n

Re: Unconsistent command-line parsing in case of UTF-8 quoted arguments

2020-10-06 Thread Andrey Repin via Cygwin
Greetings, Jérôme Froissart! > Now, let's start a Windows shell (cmd.exe) That explains it. > Note that I had to copy cygwin1.dll from my Cygwin installation > directory, otherwise binary.exe would not start. > I do not know whether there is a `locale` equivalent in Windows We've specifically a

Re: Unconsistent command-line parsing in case of UTF-8 quoted arguments

2020-10-06 Thread Jérôme Froissart
Thanks for your replies. This issue only happens when a program is run from cmd.exe, not from a Cygwin bash shell. This is important for me, since I discovered this bug in a project that must be run from Windows graphical shell (i.e. there is no sensible way to run it through Cygwin and Bash). > P

Re: Unconsistent command-line parsing in case of UTF-8 quoted arguments

2020-10-04 Thread Andrey Repin via Cygwin
ng command line > binary.exe --non-ascii "charaçtérs" --ascii "nothing-fancy-here" > as > argv = ["binary.exe", > "--non-ascii", > "chara\xXX\xXXt\xXX\xXXrs", > "--ascii", >

Re: Unconsistent command-line parsing in case of UTF-8 quoted arguments

2020-10-02 Thread Doug Henderson via Cygwin
On Fri, 2 Oct 2020 at 15:41, Jérôme Froissart wrote: > > By discussing a merge request on another project [1], I think > billziss-gh found a weirdness in the way Cygwin parses the command > line arguments when non-ASCII characters come into play. > > EXPECTED BEHAVIOUR: > cygwin should parse the

Unconsistent command-line parsing in case of UTF-8 quoted arguments

2020-10-02 Thread Jérôme Froissart
"charaçtérs" --ascii "nothing-fancy-here" as argv = ["binary.exe", "--non-ascii", "chara\xXX\xXXt\xXX\xXXrs", "--ascii", "nothing-fancy-here"] // \xXX\xXX being the UTF-8 encoding

Re: Mintty fails to render 3 byte UTF-8 on Windows 7

2018-10-12 Thread Andrey Repin
Greetings, Steven Penny! > Test B > -- > set font for mintty.exe to Consolas and test: > chcp.com 65001 Not related to Cygwin in the slightest. The best it can do it suggest native Windows console tools to use CP65001 (UTF-8), which very few programs react to. >

Re: Mintty fails to render 3 byte UTF-8 on Windows 7

2018-10-12 Thread L A Walsh
Referring to the below paragraph, I would not be against you providing a patch to fix it. I doubt any working on the tools or cygwin in their spare time would mind either and would probably appreciate the help. --- When do you plan to submit a fix? linda :-) p.s. - just another cygwin

Mintty fails to render 3 byte UTF-8 on Windows 7

2018-10-11 Thread Steven Penny
tested with Windows 7 and Windows 8.1. Test A -- install: usr/share/fonts/noto/NotoSansMyanmar-Regular.ttf from: noto-myanmar-fonts Link font (need relog after): set 'HKLM\SOFTWARE\Microsoft\Windows NT\CurrentVersion\FontLink\SystemLink' reg add "$1" /t REG_MULTI_SZ /v Consolas /d No

Re: UTF-8 character encoding

2018-06-26 Thread Lee
On 6/26/18, Michael Enright wrote: > On Mon, Jun 25, 2018 at 11:33 AM, Lee wrote: >> I'm still trying to figure utf-8 out, but it seems to me that 0x0 - >> 0xff is part of the utf-8 encoding. > > I don't see how you arrived at this. I screwed up trying to do hex i

Re: UTF-8 character encoding

2018-06-26 Thread Lee
On 6/26/18, Thomas Wolff wrote: > This encoding scheme is wrong; where did you get it from? Maybe it's the > obsolete UTF-8... http://www.cl.cam.ac.uk/~mgk25/ucs/utf-8-history.txt I thought I saw something about utf-8 being able to handle a 31 bit value.. is that also obsolete/

Re: UTF-8 character encoding

2018-06-26 Thread Michael Enright
On Mon, Jun 25, 2018 at 11:33 AM, Lee wrote: > I'm still trying to figure utf-8 out, but it seems to me that 0x0 - > 0xff is part of the utf-8 encoding. I don't see how you arrived at this. An initial byte of 0xFF is not the initial byte of any valid UTF-8 byte sequence. And it
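
The point about 0xFF can be checked against the lead-byte ranges of modern UTF-8 (RFC 3629). The following is a minimal sketch; the function name `utf8_sequence_length` is invented for this illustration and comes from nobody in the thread.

```python
from typing import Optional


def utf8_sequence_length(lead: int) -> Optional[int]:
    """Expected sequence length for a lead byte, or None if the byte
    cannot start a UTF-8 sequence under RFC 3629 (hypothetical helper)."""
    if lead <= 0x7F:
        return 1                # ASCII
    if 0xC2 <= lead <= 0xDF:
        return 2
    if 0xE0 <= lead <= 0xEF:
        return 3
    if 0xF0 <= lead <= 0xF4:
        return 4
    return None                 # 0x80-0xC1 and 0xF5-0xFF never start a sequence


for byte in (0x41, 0xC3, 0xE2, 0xF0, 0xFF):
    print(f"0x{byte:02X} -> {utf8_sequence_length(byte)}")
```

Under these ranges the bytes 0xF5 to 0xFF never appear in valid UTF-8 at all; longer forms reaching 31-bit values belong to the older, obsolete definition mentioned elsewhere in this thread.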

Re: UTF-8 character encoding

2018-06-26 Thread Thomas Wolff
On 25.06.2018 at 20:33, Lee wrote: On 6/24/18, L A Walsh wrote: Lee wrote: So... keep it simple, set LANG=en_US.UTF-8 and use vi or something else that comes with cygwin to create the file and I'll have a file with UTF-8 character encoding - correct? --- The first 127 chara

Re: UTF-8 character encoding

2018-06-25 Thread Lee
On 6/24/18, L A Walsh wrote: > Lee wrote: >> So... keep it simple, set >> LANG=en_US.UTF-8 >> and use vi or something else that comes with cygwin to create the file >> and I'll have a file with UTF-8 character encoding - correct? > --- > The first 12

Re: UTF-8 character encoding

2018-06-24 Thread L A Walsh
Lee wrote: So... keep it simple, set LANG=en_US.UTF-8 and use vi or something else that comes with cygwin to create the file and I'll have a file with UTF-8 character encoding - correct? --- The first 127 characters of UTF-8 are identical to the first 127 characters of ASCII
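
The ASCII compatibility mentioned here is easy to confirm. In this small Python sketch the range covers code points 0 through 127 (128 values in total), and "é" is just an arbitrary non-ASCII sample chosen for illustration.

```python
ASCII_RANGE = bytes(range(0x80))  # byte values 0x00 through 0x7F

# Decoding the same bytes as ASCII or as UTF-8 yields identical text.
same_text = ASCII_RANGE.decode("ascii") == ASCII_RANGE.decode("utf-8")

# Encoding that text back to UTF-8 reproduces the original bytes.
round_trip = ASCII_RANGE.decode("utf-8").encode("utf-8") == ASCII_RANGE

# Anything outside that range needs more than one byte in UTF-8.
e_acute = "é".encode("utf-8")

print(same_text, round_trip, len(e_acute))
```

So a file containing only ASCII characters is already a valid UTF-8 file, byte for byte.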

Re: UTF-8 character encoding

2018-06-22 Thread Andrey Repin
Greetings, Lee! > On 6/20/18, Andrey Repin wrote: >> Greetings, Lee! >> >>> I'm looking at >>> https://cygwin.com/packaging-hint-files.html#pvr.hint >>> and it starts off with >>> Use UTF-8 character encoding. >> >>

Re: UTF-8 character encoding

2018-06-21 Thread Lee
On 6/20/18, Andrey Repin wrote: > Greetings, Lee! > >> I'm looking at >> https://cygwin.com/packaging-hint-files.html#pvr.hint >> and it starts off with >> Use UTF-8 character encoding. > >> How do I do that and how do I check that I actually did use

Re: UTF-8 character encoding

2018-06-21 Thread Houder
On Thu, 21 Jun 2018 12:12:39, Houder wrote: > On Wed, 20 Jun 2018 14:09:59, Lee wrote: > > I'm looking at > > https://cygwin.com/packaging-hint-files.html#pvr.hint > > and it starts off with > > Use UTF-8 character encoding. > > > > How do I do tha

Re: UTF-8 character encoding

2018-06-21 Thread Houder
On Wed, 20 Jun 2018 14:09:59, Lee wrote: > I'm looking at > https://cygwin.com/packaging-hint-files.html#pvr.hint > and it starts off with > Use UTF-8 character encoding. > > How do I do that and how do I check that I actually did use UTF-8 > character encoding _

Re: UTF-8 character encoding

2018-06-20 Thread Andrey Repin
Greetings, Lee! > I'm looking at > https://cygwin.com/packaging-hint-files.html#pvr.hint > and it starts off with > Use UTF-8 character encoding. > How do I do that and how do I check that I actually did use UTF-8 > character encoding _without_ using file? https:/

Re: UTF-8 character encoding

2018-06-20 Thread Stefan Weil
On 20.06.2018 at 20:09, Lee wrote: > I'm looking at > https://cygwin.com/packaging-hint-files.html#pvr.hint > and it starts off with > Use UTF-8 character encoding. > > How do I do that and how do I check that I actually did use UTF-8 > character encoding _wit

UTF-8 character encoding

2018-06-20 Thread Lee
I'm looking at https://cygwin.com/packaging-hint-files.html#pvr.hint and it starts off with Use UTF-8 character encoding. How do I do that and how do I check that I actually did use UTF-8 character encoding _without_ using file? for whatever it's worth: $ file unicode.html unicode.
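
One possible way to check a file without `file` is to attempt a strict decode and report where it fails. This is a sketch in Python, not a method suggested by anyone in the thread; the function names are hypothetical.

```python
from pathlib import Path
from typing import Optional


def first_invalid_utf8_offset(data: bytes) -> Optional[int]:
    """Return the offset of the first byte that breaks strict UTF-8
    decoding, or None if the data is valid (hypothetical helper)."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as exc:
        return exc.start
    return None


def check_file(path: str) -> Optional[int]:
    """Apply the same check to the raw bytes of a file."""
    return first_invalid_utf8_offset(Path(path).read_bytes())


print(first_invalid_utf8_offset("±5°".encode("utf-8")))  # None: valid UTF-8
print(first_invalid_utf8_offset(b"abc\xb1"))              # 3: stray Latin-1 byte
```

A result of None only says the bytes are well-formed UTF-8; a pure ASCII file passes too, since ASCII is a subset of UTF-8.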

Re: Need help with multibyte UTF-8 characters

2017-12-15 Thread Thomas Wolff
On 15.12.2017 at 01:32, Brian Inglis wrote: On 2017-12-12 12:42, Thomas Taylor wrote: I believe that Cygwin displays certain UTF-8 characters incorrectly.  To see the problem, first save the attached "utf-8_test.sed" text file to your desktop. Then run "mintty," and set

Re: Need help with multibyte UTF-8 characters

2017-12-14 Thread Brian Inglis
On 2017-12-12 12:42, Thomas Taylor wrote: > I believe that Cygwin displays certain UTF-8 characters incorrectly.  To see > the > problem, first save the attached "utf-8_test.sed" text file to your desktop.  > Then run "mintty," and set its options by right clicki

Re: Need help with multibyte UTF-8 characters

2017-12-14 Thread Andrey Repin
Greetings, Thomas Taylor! > I believe that Cygwin displays certain UTF-8 characters incorrectly.  To > see the problem, first save the attached "utf-8_test.sed" text file to > your desktop.  First, your "NBSP" is actually http://www.fileformat.info/info/unicode/ch

Re: Need help with multibyte UTF-8 characters

2017-12-14 Thread Brian Inglis
On 2017-12-11 16:36, Thomas Taylor wrote: > Thank you for your advice on setting my locale to en_US.UTF-8.  Unfortunately, > Cygwin still seems to have trouble displaying some three-byte UTF-8 encoded > characters correctly.  For example, see the following snippet from a "sed"

Re: Need help with multibyte UTF-8 characters

2017-12-14 Thread Thomas Wolff
On 14.12.2017 at 17:21, cyg Simple wrote: On 12/14/2017 3:55 AM, Thomas Wolff wrote:> Mintty interfaces to Windows using the Unicode/UTF-16 API, so there is no dependency on the Windows system locale. I assume the original poster's problem is a font issue, unless a test case would demonstrate a

Re: Need help with multibyte UTF-8 characters

2017-12-14 Thread cyg Simple
On 12/14/2017 3:55 AM, Thomas Wolff wrote:> Mintty interfaces to Windows using the Unicode/UTF-16 API, so there is > no dependency on the Windows system locale. > I assume the original poster's problem is a font issue, unless a test > case would demonstrate anything else. > Thomas > I seem to rem

Re: Need help with multibyte UTF-8 characters

2017-12-14 Thread cyg Simple
Character Set should be set to match. >>> The profile commands below set Cygwin locale to your Windows Regional >>> settings >>> and charset to UTF-8, or Unix locale to your system locale. >>> Otherwise your system or mintty is going to be doing conversions on

Re: Need help with multibyte UTF-8 characters

2017-12-14 Thread Thomas Wolff
profile commands below set Cygwin locale to your Windows Regional settings and charset to UTF-8, or Unix locale to your system locale. Otherwise your system or mintty is going to be doing conversions on each character. I am not aware that mintty character display and Windows regional settings would

Re: Need help with multibyte UTF-8 characters

2017-12-13 Thread Brian Inglis
nds below set Cygwin locale to your Windows Regional >> settings >> and charset to UTF-8, or Unix locale to your system locale. >> Otherwise your system or mintty is going to be doing conversions on each >> character. > I am not aware that mintty character display and Win

Re: Need help with multibyte UTF-8 characters

2017-12-13 Thread cyg Simple
On 12/13/2017 2:50 AM, Thomas Wolff wrote: > Hi Brian, > > On 13.12.2017 at 06:21, Brian Inglis wrote: >> On 2017-12-04 18:23, Thomas Taylor wrote: >>> I want to use multibyte UTF-8 characters in 64-bit Cygwin under >>> Windows 7.  The >>> "vim"

Re: Need help with multibyte UTF-8 characters

2017-12-12 Thread Thomas Wolff
Hi Brian, On 13.12.2017 at 06:21, Brian Inglis wrote: On 2017-12-04 18:23, Thomas Taylor wrote: I want to use multibyte UTF-8 characters in 64-bit Cygwin under Windows 7.  The "vim" editor running in mintty displays the two-byte characters correctly, but not the three- (and I a

Re: Need help with multibyte UTF-8 characters

2017-12-12 Thread Brian Inglis
On 2017-12-04 18:23, Thomas Taylor wrote: > I want to use multibyte UTF-8 characters in 64-bit Cygwin under Windows 7.  > The > "vim" editor running in mintty displays the two-byte characters correctly, but > not the three- (and I assume four-) byte characters, wh

Re: Need help with multibyte UTF-8 characters

2017-12-12 Thread Thomas Wolff
On 12.12.2017 at 00:36, Thomas Taylor wrote: ... This file attempts to convert XML-encoded filenames to UTF-8.  ... How about a generic script, like: sed -e 's,%,\\x,g' -e "s,^,echo $'," -e "s,$,'," | sh

Re: Need help with multibyte UTF-8 characters

2017-12-12 Thread Thomas Taylor
I believe that Cygwin displays certain UTF-8 characters incorrectly.  To see the problem, first save the attached "utf-8_test.sed" text file to your desktop.  Then run "mintty," and set its options by right clicking in its title bar, selecting "Options" and then

Re: Need help with multibyte UTF-8 characters

2017-12-12 Thread Doug Henderson
On 11 December 2017 at 16:36, Thomas Taylor wrote: > Thank you for your advice on setting my locale to en_US.UTF-8. > Unfortunately, Cygwin still seems to have trouble displaying some three-byte > UTF-8 encoded characters correctly. For example, see the following snippet > from

Re: Need help with multibyte UTF-8 characters

2017-12-11 Thread Thomas Taylor
Thank you for your advice on setting my locale to en_US.UTF-8.  Unfortunately, Cygwin still seems to have trouble displaying some three-byte UTF-8 encoded characters correctly.  For example, see the following snippet from a "sed" file.  This file attempts to convert XML-encoded fi

Re: Need help with multibyte UTF-8 characters

2017-12-04 Thread Brian Inglis
On 2017-12-04 18:23, Thomas Taylor wrote: > I want to use multibyte UTF-8 characters in 64-bit Cygwin under Windows 7.  > The > "vim" editor running in mintty displays the two-byte characters correctly, but > not the three- (and I assume four-) byte characters, wh

Need help with multibyte UTF-8 characters

2017-12-04 Thread Thomas Taylor
I want to use multibyte UTF-8 characters in 64-bit Cygwin under Windows 7.  The "vim" editor running in mintty displays the two-byte characters correctly, but not the three- (and I assume four-) byte characters, which instead display as rectangular filled-in blocks.  The "less"

Re: UTF-8 compatibility between Windows and Cygwin

2017-05-25 Thread Andrey Repin
Greetings, Nellis, Kenneth! > I have (BOM-less) UTF-8 text files that I can read fine in > Cygwin, but not Windows. When I create text files in Windows > containing non-ASCII characters, I cannot read them in > Cygwin. I understand why, but wondering the best way to be > abl

UTF-8 compatibility between Windows and Cygwin

2017-05-25 Thread Nellis, Kenneth
I have (BOM-less) UTF-8 text files that I can read fine in Cygwin, but not Windows. When I create text files in Windows containing non-ASCII characters, I cannot read them in Cygwin. I understand why, but wondering the best way to be able to share text files across the two environments. I

Re: Cygwin 2.6.0: unreadable UTF-8 in Windows console

2016-10-20 Thread Ivan Vanyushkin
Hello Corinna, Wednesday, October 19, 2016, 2:45:16 PM, you wrote: > I applied a patch to fix this regression and uploaded a developer > snapshot with this change to https://cygwin.com/snapshots/ > Please test. Just exchanging the Cygwin DLL is sufficient, no need to > install the entire package

Re: Cygwin 2.6.0: unreadable UTF-8 in Windows console

2016-10-19 Thread Corinna Vinschen
On Oct 1 05:13, Ivan Vanyushkin wrote: > Something has changed in version 2.6.0, and now UTF-8 text can't be displayed > in Windows console (cmd). > > 1. Create a file "test.txt" with non-ASCII text in UTF-8 encoding. > 2. Run "cmd". >

Re: Cygwin 2.6.0: unreadable UTF-8 in Windows console

2016-10-01 Thread Ivan Vanyushkin
I want to share binary built under Cygwin 2.6.0 with other user, that has no LANG set. In previous version all binaries worked correctly with UTF-8 input text. But now this doesn't work as expected. Some more simple tests. // Run Windows console. cmd C:\Cygwin_2.6.0\bin\echo ±5° ▒▒5

Re: Cygwin 2.6.0: unreadable UTF-8 in Windows console

2016-10-01 Thread Bengt Larsson
Ivan Vanyushkin wrote: >Something has changed in version 2.6.0, and now UTF-8 text can't be displayed >in Windows console (cmd). > >1. Create a file "test.txt" with non-ASCII text in UTF-8 encoding. >2. Run "cmd"

Re: Cygwin 2.6.0: unreadable UTF-8 in Windows console

2016-10-01 Thread Ivan Vanyushkin
" - will not work for not-English Windows, because output in console will not be readable. Watch Windows log: tail -f C:\Windows\Logs\SomeLog.log - will be not readable if there are some non-English file names. I think locale should remain default UTF-8, as in Cygwin 2.5.2. This is expe

Re: Cygwin 2.6.0: unreadable UTF-8 in Windows console

2016-09-30 Thread Brian Inglis
On 2016-09-30 22:34, Brian Inglis wrote: On 2016-09-30 20:13, Ivan Vanyushkin wrote: Something has changed in version 2.6.0, and now UTF-8 text can't be displayed in Windows console (cmd). 1. Create a file "test.txt" with non-ASCII text in UTF-8 encoding. 2. Run "cmd"

Cygwin 2.6.0: unreadable UTF-8 in Windows console

2016-09-30 Thread Ivan Vanyushkin
Something has changed in version 2.6.0, and now UTF-8 text can't be displayed in Windows console (cmd). 1. Create a file "test.txt" with non-ASCII text in UTF-8 encoding. 2. Run "cmd". 3. Run: C:\Cygwin\bin\cat test.txt ▒▒

Re: [bug] mingw64-*-w64-win-iconv: Cannot open handle; convert to UTF-8

2016-03-19 Thread sdbenique
Just a follow-up to this issue. It appears my test program *was* invalid, but I discovered why SDL wouldn't load properly. As you can see in my initial bug report, SDL was attempting to convert a command line from UCS-2-INTERNAL to UTF-8 using win-iconv. "C" (as my test

Re: [bug] mingw64-*-w64-win-iconv: Cannot open handle; convert to UTF-8

2016-03-12 Thread Yaakov Selkowitz
in both 32-bit and 64-bit builds. [snip] iconv_t handle = iconv_open("C", "UTF-8"); Invalid code. "C" is a locale, not an encoding. -- Yaakov

[bug] mingw64-*-w64-win-iconv: Cannot open handle; convert to UTF-8

2016-03-12 Thread sdbenique
. Stepping into that function, the following executes: Breakpoint 2, SDL_iconv_string (tocode=0x4052bc <__dyn_tls_init_callback+684> "UTF-8", fromcode=0x4052ad <__dyn_tls_init_callback+669> "UCS-2-INTERNAL", inbuf=0x2e2dd2 "C", inbytesleft=116)

Re: With bad UTF-8, cygwin can create files it can't read

2015-04-01 Thread Corinna Vinschen
On Apr 1 10:01, Warren Young wrote: > On Apr 1, 2015, at 7:34 AM, Corinna Vinschen > wrote: > > > > As you probably know, Unicode values beyond the base plane (that is, > > everything > 0x in UTF-32 and > ef bf bf in UTF-8 notation) > > are represented
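
The surrogate-pair representation described above follows a fixed arithmetic rule. As a worked sketch (the function name `to_surrogate_pair` is invented here, and U+1F321 is borrowed from the 2024 readdir() report as a sample code point):

```python
from typing import Tuple


def to_surrogate_pair(code_point: int) -> Tuple[int, int]:
    """Split a code point beyond the base plane into its UTF-16
    high and low surrogate values (hypothetical helper)."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("code point is not beyond the base plane")
    offset = code_point - 0x10000        # 20 significant bits remain
    high = 0xD800 + (offset >> 10)       # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)      # bottom 10 bits
    return high, low


high, low = to_surrogate_pair(0x1F321)
print(f"{high:04X} {low:04X}")  # D83C DF21
```

Each half is meaningless on its own, which is why a lone surrogate has no valid UTF-8 form and why partial conversions between UTF-8 and UTF-16 can produce names that do not round-trip.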

Re: With bad UTF-8, cygwin can create files it can't read

2015-04-01 Thread Corinna Vinschen
On Apr 1 15:34, Corinna Vinschen wrote: > Hi Stuart, > > On Mar 30 13:04, Corinna Vinschen wrote: > > On Mar 25 14:34, Kyzer wrote: > > > Hello, > > > > > > I've found that if you use cygwin to create a file with badly-encoded > > > UTF

Re: With bad UTF-8, cygwin can create files it can't read

2015-04-01 Thread Warren Young
On Apr 1, 2015, at 7:34 AM, Corinna Vinschen wrote: > > As you probably know, Unicode values beyond the base plane (that is, > everything > 0x in UTF-32 and > ef bf bf in UTF-8 notation) > are represented as so-called surrogate pairs in UTF-16, two UTF-16 > values i

Re: With bad UTF-8, cygwin can create files it can't read

2015-04-01 Thread Corinna Vinschen
Hi Stuart, On Mar 30 13:04, Corinna Vinschen wrote: > On Mar 25 14:34, Kyzer wrote: > > Hello, > > > > I've found that if you use cygwin to create a file with badly-encoded > > UTF-8, readdir() gives out an entry with a name that cygwin won't > > su

Re: With bad UTF-8, cygwin can create files it can't read

2015-03-30 Thread Corinna Vinschen
On Mar 25 14:34, Kyzer wrote: > Hello, > > I've found that if you use cygwin to create a file with badly-encoded > UTF-8, readdir() gives out an entry with a name that cygwin won't > subsequently accept. > > * create a file using filename with hex bytes F4 8F

With bad UTF-8, cygwin can create files it can't read

2015-03-25 Thread Kyzer
Hello, I've found that if you use cygwin to create a file with badly-encoded UTF-8, readdir() gives out an entry with a name that cygwin won't subsequently accept. * create a file using filename with hex bytes F4 8F BF BF * readdir() reports the filename as hex bytes E2 8E B

Re: Font support of UTF-8 chars differ between w32 Emacs and Cygwin Emacs

2014-09-11 Thread Ken Brown
On 9/11/2014 6:21 AM, Sebastien Vauban wrote: Achim Gratz wrote: Both fonts you use as an example exist in multiple versions with differing UTF-8 support. If they don't have that glyph (which is likely, given the results you report), then Emacs would try to get it from another font wit

Re: Font support of UTF-8 chars differ between w32 Emacs and Cygwin Emacs

2014-09-11 Thread Andrey Repin
>> >> You seem to assume that those fonts define that particular glyph. > Yes, I was. >> Both fonts you use as an example exist in multiple versions with >> differing UTF-8 support. If they don't have that glyph (which is >> likely, given the results you report), then

Re: Font support of UTF-8 chars differ between w32 Emacs and Cygwin Emacs

2014-09-11 Thread Sebastien Vauban
seem to assume that those fonts define that particular glyph. Yes, I was. > Both fonts you use as an example exist in multiple versions with > differing UTF-8 support. If they don't have that glyph (which is > likely, given the results you report), then Emacs would try to get it >

Re: Font support of UTF-8 chars differ between w32 Emacs and Cygwin Emacs

2014-09-11 Thread Achim Gratz
r glyph. Both fonts you use as an example exist in multiple versions with differing UTF-8 support. If they don't have that glyph (which is likely, given the results you report), then Emacs would try to get it from another font with the same dimensions (I don't know if mintty does font su

Re: Font support of UTF-8 chars differ between w32 Emacs and Cygwin Emacs

2014-09-11 Thread Marco Atzeri
n Consolas while for example 25B8 (small black ) and 25BA (large black) are available. In other fonts (like Courier new) only 25BA is available and 25B8 is missing... I don't know enough about fonts and UTF-8 encoding to be able to shed any more light on this. Maybe someone else can help.

Re: Font support of UTF-8 chars differ between w32 Emacs and Cygwin Emacs

2014-09-11 Thread Sebastien Vauban
> - win32 Emacs always can display it, in all fonts, >> >> - Cygwin Emacs can't display it with Consolas, Courier New and Lucida >>(among others). >> >> MWE: >> >> --8<---cut here---start->8--- >> ;; t

Re: Font support of UTF-8 chars differ between w32 Emacs and Cygwin Emacs

2014-09-10 Thread Ken Brown
ygwin Emacs can't display it with Consolas, Courier New and Lucida (among others). MWE: --8<---cut here---start->8--- ;; these fonts only display (special?) UTF-8 chars (here: the white ;; right-pointing triangle) in win32 binary of Emacs (modify-all-f

Re: Font support of UTF-8 chars differ between w32 Emacs and Cygwin Emacs

2014-09-03 Thread Sebastien Vauban
> - win32 Emacs always can display it, in all fonts, >> >> - Cygwin Emacs can't display it with Consolas, Courier New and Lucida >>(among others). >> >> MWE: >> >> --8<---cut here---start->8--- >>

Re: Font support of UTF-8 chars differ between w32 Emacs and Cygwin Emacs

2014-09-03 Thread Ken Brown
ygwin Emacs can't display it with Consolas, Courier New and Lucida (among others). MWE: --8<---cut here---start->8--- ;; these fonts only display (special?) UTF-8 chars (here: the white ;; right-pointing triangle) in win32 binary of Emacs (m

Font support of UTF-8 chars differ between w32 Emacs and Cygwin Emacs

2014-09-03 Thread Sebastien Vauban
ourier New and Lucida (among others). MWE: --8<---cut here---start->8--- ;; these fonts only display (special?) UTF-8 chars (here: the white ;; right-pointing triangle) in win32 binary of Emacs (modify-all-frames-parameters '((font . "Consolas-

Cygwin needs a man-db port (was: How does Cygwin handle non-Latin1 man pages? (move to UTF-8?))

2014-03-14 Thread Erwin Waterlander
l show the French man page correctly. Latin-1 is converted to UTF-8. For the Russian translation of the vim manual I see two files: /usr/share/man/ru.UTF-8/man1/vim.1.gz /usr/share/man/ru.KOI8-R/man1/vim.1.gz When I type $ export LANG=ru_RU.UTF-8 $ man vim I get the English man page, instead of t

Re: cant access to files more than 128 utf-8 symbol long names

2013-12-11 Thread Christopher Faylor
On Wed, Dec 11, 2013 at 06:01:03PM +0100, Corinna Vinschen wrote: >>Perhaps this will require reiteration and reclarification on Thursday, >>feline-permitting. > >And it's not even my WJM week. Can we move that to Thursday next week? Sorry, no. I can't allow that. But, then, it's my week. cgf

Re: cant access to files more than 128 utf-8 symbol long names

2013-12-11 Thread Corinna Vinschen
> On Dec 11 19:02, Mikhail Usenko wrote: > >> > > I couldn't figure out how a POSIX filename passed to a Cygwin > >> > > application running on the Windows system may become longer than > >> > > NAME_MAX=1020 bytes if the maximum filename length

Re: cant access to files more than 128 utf-8 symbol long names

2013-12-11 Thread Christopher Faylor
't figure out how a POSIX filename passed to a Cygwin >> > > application running on the Windows system may become longer than >> > > NAME_MAX=1020 bytes if the maximum filename length in NTFS is 255 >> > > UTF-16 symbols (i.e. 1020 bytes for the

Re: cant access to files more than 128 utf-8 symbol long names

2013-12-11 Thread Corinna Vinschen
on the Windows system may become longer than > > > NAME_MAX=1020 bytes if the maximum filename length in NTFS is 255 > > > UTF-16 symbols (i.e. 1020 bytes for the biggest 4 byte UTF-8 code > > > unit)? > > > > Read my mail again. NAME_MAX is 255. > > >

Re: cant access to files more than 128 utf-8 symbol long names

2013-12-11 Thread Mikhail Usenko
s if the maximum filename length in NTFS is 255 > > UTF-16 symbols (i.e. 1020 bytes for the biggest 4 byte UTF-8 code > > unit)? > > Read my mail again. NAME_MAX is 255. > > > Corinna Corinna, why not 1020? -- -- Problem reports: http://cygwin.com

Re: cant access to files more than 128 utf-8 symbol long names

2013-12-11 Thread Corinna Vinschen
On Dec 11 19:02, Mikhail Usenko wrote: > I couldn't figure out how a POSIX filename passed to a Cygwin > application running on the Windows system may become longer than > NAME_MAX=1020 bytes if the maximum filename length in NTFS is 255 > UTF-16 symbols (i.e. 1020 bytes for the b

Re: cant access to files more than 128 utf-8 symbol long names

2013-12-11 Thread Mikhail Usenko
I couldn't figure out how a POSIX filename passed to a Cygwin application running on the Windows system may become longer than NAME_MAX=1020 bytes if the maximum filename length in NTFS is 255 UTF-16 symbols (i.e. 1020 bytes for the biggest 4 byte UTF-8 code unit)? What causes the ENAMETO
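
The byte arithmetic in this thread can be checked with a short sketch. It assumes only the figure quoted above (a limit of 255 UTF-16 code units per file name) and says nothing about how Cygwin itself computes NAME_MAX; the helper names and sample characters are illustrative. A code point that needs 4 bytes in UTF-8 lies outside the BMP and therefore occupies two UTF-16 code units, so 255 units cannot hold 255 such characters.

```python
def utf8_len(name: str) -> int:
    """Number of bytes the name occupies when encoded as UTF-8."""
    return len(name.encode("utf-8"))


def utf16_units(name: str) -> int:
    """Number of 16-bit code units the name occupies in UTF-16."""
    return len(name.encode("utf-16-le")) // 2


# U+20AC (euro sign) is in the BMP: 1 UTF-16 unit, 3 UTF-8 bytes.
bmp_name = "\u20ac" * 255

# U+1F321 (thermometer) is outside the BMP: 2 UTF-16 units, 4 UTF-8 bytes.
# Only 127 of them fit into 255 UTF-16 units.
astral_name = "\U0001F321" * 127

print(utf16_units(bmp_name), utf8_len(bmp_name))        # 255 765
print(utf16_units(astral_name), utf8_len(astral_name))  # 254 508
```

Under that assumption the longest UTF-8 form of a 255-unit name is 765 bytes (all 3-byte BMP characters), not 1020.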
