Re: [PATCH] CJK ambiguous width for non-Unicode charsets

2010-11-17 Thread Andy Koppe
Documentation change to go with the newlib patch at
http://www.cygwin.com/ml/newlib/2010/msg00604.html:

* setup2.sgml (setup-locale-ov): Document CJK ambiguous width change
for non-Unicode charsets.
* new-features.sgml (ov-new1.7.8): Mention CJK ambiguous width change.

(Btw, "Drop support for Windows NT4 prior to Service Pack 4" appears
twice in new-features.sgml).

Andy


ambiwidth2-doc.patch
Description: Binary data


locale initialization issue

2011-05-03 Thread Andy Koppe
Hi,

I stumbled across an issue with locale initialization when the "C"
locale is specified in the environment.

$ cat test.c
#include <stdio.h>
#include <stdlib.h>
#include <locale.h>
#include <langinfo.h>

int main(void) {
  char cs[8];
  puts(nl_langinfo(CODESET));
  printf("%i\n", wctomb(cs, 0x80));
  return 0;
}

The program doesn't call setlocale, so it should be using the "C"
locale with its ASCII charset, which means the wctomb() call with a
codepoint outside the ASCII range should fail. And that's exactly what
happens as long as the locale set in the environment is something
other than "C", e.g.:

$ LC_ALL=C.UTF-8 ./test
ANSI_X3.4-1968
-1

$ LC_ALL=en_GB.ISO-8859-15 ./test
ANSI_X3.4-1968
-1

However, if the environment locale is "C", the charset is still
reported as ASCII (aka ANSI_X3.4-1968), but the wctomb call suddenly
succeeds:

$ LC_ALL=C ./test
ANSI_X3.4-1968
2

That's due to a combination of three things: Cygwin newlib starts with
the __wctomb and __mbtowc function pointers set to the UTF-8 variants
(for conversions during early Cygwin initialization), yet the LC_CTYPE
locale is set to "C", and setlocale() does nothing if the requested
locale is the same as the previous one.

Hence, with the locale set to "C" in the environment, both the
setlocale call from initial_setlocale(), which asks for the
environment locale for filename conversion, and the setlocale() just
before main() that sets the "C" locale, end up doing nothing. Thus the
conversion functions remain set to the UTF-8 variants instead of being
set to the ASCII ones as intended for the "C" locale.

The attached small patch addresses this by starting with the LC_CTYPE
locale set to "C.UTF-8" and lc_ctype_charset set accordingly too.
This means that setting the "C" locale is recognised as a change and
that the conversion function pointers are updated accordingly. It also
has the happy side effect that the setlocale call from
initial_setlocale() will be short-circuited if the default "C.UTF-8"
locale has not been overridden in the environment.

Additionally, I think it's time to drop the "temporarily" #if 0'd code
for making UTF-8 the charset for the "C" locale.

It's a newlib patch, but it's entirely Cygwin-specific, so it seemed
more appropriate to send it here.

* libc/locale/locale.c [__CYGWIN__]
(current_categories, lc_ctype_charset): Start with the LC_CTYPE locale
set to "C.UTF-8", to match initial __wctomb and __mbtowc settings.
(lc_message_charset, loadlocale): Settle on ASCII as the "C" charset.

Andy


lc_ctype.patch
Description: Binary data


Add locale.exe option for querying Windows UI languages

2011-10-08 Thread Andy Koppe
The attached patch adds a --interface/-i option to locale.exe that
makes the --system/-s and --user/-u options print the respective
default UI language instead of the default locale.

* locale.cc: Add --interface option for printing Windows default UI
languages.

For background, here's what Windows' various default locales and languages do:

- LOCALE_USER_DEFAULT: This reflects the setting on the Formats tab of
the (Windows 7) Region&Language control panel, which affects the
format of times, dates, numbers, and currency.

- LOCALE_SYSTEM_DEFAULT: This reflects the "Language for non-Unicode
programs" on the Administrative tab of the Region&Language control panel,
which also determines the ANSI and OEM codepages.

- GetUserDefaultUILanguage(): This is the current user's Windows UI
language, also called display language. On Windows installs with
multiple UI languages, a setting for this appears on the "Keyboards
and Languages" tab of the Region&Language control panel.

- GetSystemDefaultUILanguage(): This is the system-wide UI language
used for things that aren't user-specific, e.g. the login screen. As
far as I know it's determined at Windows install time and can't be
changed.

(The latter two APIs are available from Windows 2000 onwards.)

Looking at those, and if we wanted to base the Cygwin locale settings
on the Windows ones, I think LC_NUMERIC, LC_TIME, and LC_MONETARY
should be determined by LOCALE_USER_DEFAULT, but LC_MESSAGES should be
determined by GetUserDefaultUILanguage(). Not sure about LC_CTYPE and
LC_COLLATE, but I suppose it would make sense for character
classification and sorting to match the UI language.

See also this blog post by MS's "Dr International" Michael Kaplan:
http://blogs.msdn.com/b/michkap/archive/2006/05/13/596971.aspx

Andy


ui_lang.patch
Description: Binary data


Re: Add locale.exe option for querying Windows UI languages

2011-10-09 Thread Andy Koppe
On 8 October 2011 16:03, Corinna Vinschen wrote:
>
> On Oct  8 10:24, Andy Koppe wrote:
>> The attached patch adds a --interface/-i option to locale.exe that
>> makes the --system/-s and --user/-u options print the respective
>> default UI language instead of the default locale.
>>
>>       * locale.cc: Add --interface option for printing Windows default UI
>>       languages.
>>
>> For background, here's what Windows' various default locales and languages do:
>>
>> - LOCALE_USER_DEFAULT: This reflects the setting on the Formats tab of
>> the (Windows 7) Region&Language control panel, which affects the
>> format of times, dates, numbers, and currency.
>>
>> - LOCALE_SYSTEM_DEFAULT: This reflects the "Language for non-Unicode
>> programs" on the Administrative tab of the Region&Language control panel,
>> which also determines the ANSI and OEM codepages.
>>
>> - GetUserDefaultUILanguage(): This is the current user's Windows UI
>> language, also called display language. On Windows installs with
>> multiple UI languages, a setting for this appears on the "Keyboards
>> and Languages" tab of the Region&Language control panel.
>>
>> - GetSystemDefaultUILanguage(): This is the system-wide UI language
>> used for things that aren't user-specific, e.g. the login screen. As
>> far as I know it's determined at Windows install time and can't be
>> changed.
>
> I like the idea of the patch, but I'm wondering if this is the right
> approach.  I wasn't aware of the difference between the LOCALE_FOO_DEFAULT
> values and what the GetFooDefaultUILanguage functions return, otherwise
> I would have probably used the GetFooDefaultUILanguage functions right from
> the start.
>
> What I mean is this.  The locale -u/-s functionality was supposed to be
> used to set the $LANG value preferredly.  Since LANG means language in
> the first place, the UI language is a much more natural choice for the
> default -s/-u functionality, isn't it?

Makes sense.

> Therefore, afaics, it would be better if we change locale to use the
> GetFooDefaultUILanguage functions by default, and we add a modifier
> (-r/--region?) to switch to LOCALE_FOO_DEFAULT.
>
> Either way, the usage output will have to be improved.  Maybe we should
> explicitly state that the values printed refer to the Windows values,
> and that one of them is the UI locale and the other is the... hmm...
> how to say it..., maybe the "region settings locale" or so.

How about having one option for each of the Windows settings, and
dividing the help output into groups, like so:

POSIX locale options:
  -a, --all-locales    List all available supported locales
  -c, --category-name  List information about given category NAME
  -k, --keyword-name   Print information about given keyword NAME
  -m, --charmaps       List all available character maps

Windows locale options:
  -u, --user-lang      Print user default UI language
  -s, --system-lang    Print system default UI language
  -f, --format         Print user format setting for times, numbers & currency
  -n, --non-unicode    Print system locale for non-Unicode programs
  -U, --utf            Attach ".UTF-8" to the result

Other options:
  -v, --verbose        More verbose output
  -h, --help           This text


>> Looking at those, and if we wanted to base the Cygwin locale settings
>> on the Windows ones, I think LC_NUMERIC, LC_TIME, and LC_MONETARY
>> should be determined by LOCALE_USER_DEFAULT, but LC_MESSAGES should be
>> determined by GetUserDefaultUILanguage(). Not sure about LC_CTYPE and
>> LC_COLLATE, but I suppose it would make sense for character
>> classification and sorting to match the UI language.
>
> The system should not set the LC_xxx values at all.  From my POV the
> system should only default to some $LANG, while setting the LC_xxx
> values is the job of the user if the $LANG value doesn't suffice.

Not sure about that, but it's not really worth discussing unless we do
decide to follow the Windows setting(s) by default.

Andy


Re: Add support for creating native windows symlinks

2011-12-04 Thread Andy Koppe
On 4 December 2011 07:07, Russell Davis wrote:
> This was discussed before here:
> http://cygwin.com/ml/cygwin/2008-03/msg00277.html
>
> These were the reasons given for not using native symlinks to create
> cygwin symlinks, along with my responses:
>
> - By default, only administrators have the right to create native
>   symlinks.  Admins running with restricted permissions under UAC don't
>   have this right.
>
> This is true, however the feature can be made optional through the
> CYGWIN environment variable (just like winsymlinks). For users that
> can add the permission or disable UAC, the use of native symlinks is a
> huge step towards making cygwin more unified with the rest of Windows.
>
> - When creating a native symlink, you have to define if this symlink
>   points to a file or a directory.  This makes no sense given that
>   symlinks often are created before the target they point to.
>
> Also true. However, the type only matters for Windows' usage of the
> symlink -- cygwin already treats both the types the same. For example,
> if a native symlink of type `file` actually points to a directory, it
> will still work fine inside cygwin. It won't work for Win32 programs
> that try to access it, but that's still no worse than the status quo
> -- Win32 programs already can't use cygwin symlinks.
>
> Since cygwin already supports reading of native symlinks, I was able
> to add support for this with a fairly small change. Some edge cases
> probably still need to be handled (disabling for older versions of
> windows and unsupported file systems), but I wanted to get this out
> there for review. The patch is attached.

Those aren't all the issues with using native symlinks as Cygwin
symlinks. POSIX symlinks of course are supposed to point to POSIX
paths, whereas native links point to Windows paths, with the following
consequences:

- Native links can't point to special Cygwin paths such as /proc and
/dev, although I guess that could be fudged.
- If the meaning of the POSIX path changes due to Cygwin mount point
changes, native symlinks won't reflect that and point to the wrong
thing.
- Native relative links can't cross drive boundaries, whereas relative
POSIX paths can reach the whole filesystem.

I think the better approach here is to have an ln-like utility that
creates Windows symlinks, as proposed by Daniel Colascione at
http://cygwin.com/ml/cygwin/2011-04/msg00059.html. Perhaps it could be
added to cygutils if it was knocked into appropriate shape. (The main
advantage over using Windows facilities, in particular cmd.exe's
mklink builtin, would be an ln-like interface and Cygwin charset
support.)

Andy


Re: console enhancements: mouse events

2009-11-07 Thread Andy Koppe
2009/11/7 Corinna Vinschen:
>> Mintty roughly does the following for Ctrl(+Shift)+symbol combinations:
>> - obtain the keymap using GetKeyboardState()
>> - set the state of the Ctrl key to released
>> - invoke ToUnicode() to get the character code according to the keyboard layout
>> - if the character code is one of [\]_^? send the corresponding control code
>> - otherwise, set the state of both Ctrl and Alt to pressed (this is
>> equivalent to AltGr), and try ToUnicode() again
>>
>> The last step means that e.g. Ctrl+9 on a German keyboard will send
>> ^]. The proper combination would be Ctrl+AltGr+9, but since
>> AltGr==Ctrl+Alt, that can't be distinguished from AltGr+9 without
>> Ctrl. (Well, not without somewhat dodgy trickery anyway.)
>
> How does that work for ^^?  The ^ key is a deadkey on the german keyboard
> layout, so the actual char value is only generated after pressing the key
> twice.  Just curious.

ToUnicode actually delivers the ^ character right away when pressing
the key, but with return value -1 to signify that it's a dead key and
that the next key will be modified accordingly. So for Ctrl+^, mintty
sends ^^ right away and then clears the dead key state using a trick
picked up from http://blogs.msdn.com/michkap/archive/2006/04/06/569632.aspx:
feed VK_DECIMALs into ToUnicode until it stops returning dead-key
characters.

(Yep, it's a terrible API for an unintuitive feature. See also
http://blogs.msdn.com/michkap/archive/2005/01/19/355870.aspx)

Andy


Re: console enhancements: mouse events

2009-11-08 Thread Andy Koppe
Thomas Wolff:
>>> Note: This works on my home PC (Windows XP Home) but it's not effective
>>> on my work PC (Windows XP Professional) where the mouse wheel scrolls the
>>> Windows console (which it doesn't on the other machine); I don't know how
>>> to disable or configure this.

I've come across a similar issue in mintty: on some machines, if the
scrollbar is enabled, mousewheel events never reach the window's event
loop. I never really got to the bottom of it, but I think it's to do
with mouse drivers: some appear to send mousewheel events straight to
a window's scrollbar if there is one, no matter where in the window
the mouse is positioned.

>> [Ctrl+AltGr+key stuff]

> Thanks Andy for pointing to the part of mintty code handling this. However,
> the whole function there looks too complex for a quick copy-paste-patch.

Nested functions, big switches, and even a couple of gotos, that
function has it all. ;)

> Maybe later... or Andy might like to factor out the mapping part in a way
> directly reusable for the cygwin console?

Erm, no. The crucial subfunction there is undead_keycode().

> It
> does not work however, even for ASCII characters, for characters produced
> with AltGr, e.g. Alt-AltGr-Q where AltGr-Q is @ (German keyboard). Andy got
> this to work in mintty (I think with some other subtle trick after I
> challenged him for it IIRC); it does not work in xterm either.

You can tell whether both Alt and AltGr are down by checking VK_LMENU
and VK_RMENU.

Andy


making scanf byte-clean(er)

2010-01-10 Thread Andy Koppe
Attached is a patch for making the scanf format string (more)
byte-transparent. It actually couldn't deal with non-ASCII chars at
all, even valid ones, due to comparing an 'unsigned char' with a
(signed) 'char'. And when encountering an invalid byte, it would go
backwards in the format string. Finally, it wrongly reset the
multibyte conversion state for every character and used the same state
object for the format string and %ls arguments.

I thought I'd send the patch here for review first before sending it
upstream. Hope that makes sense.

Andy


scanf.patch
Description: Binary data


minor doc corrections

2010-01-24 Thread Andy Koppe
Attached is a patch with some minor locale-related doc corrections.
Mostly just typos and removing stuff that no longer applies.

Andy


doc.patch
Description: Binary data


[PATCH] internal_setlocale tweak

2010-02-10 Thread Andy Koppe
winsup/cygwin/ChangeLog:
* nlsfuncs.cc (internal_setlocale, initial_setlocale):
Move check whether charset has changed to internal_setlocale,
to avoid unnecessary work when invoked via CW_INT_SETLOCALE.

Sufficiently trivial, I hope.

Andy


int_setlocale.patch
Description: Binary data


Re: console enhancements: mouse events etc

2010-03-30 Thread Andy Koppe
> How can I enforce printing garbage so I
> can test the reset command?

echo $'\e(0'


[PATCH] Cygwin: Correct /proc/*/stat for processes without ctty

2022-11-09 Thread Andy Koppe
Hi,

I had noticed that selecting or excluding processes without a
controlling terminal doesn't work in procps on Cygwin.

For example, the mintty process shouldn't appear in the following, as
the 'f' (for forest) argument triggers procps's "BSD personality",
where processes without a controlling terminal are supposed to be
excluded by default:

$ procps f
    PID TTY      STAT  STIME COMMAND
   1809 ?        Ss    19:49 /usr/bin/mintty -
   1810 pty0     Ss    19:49  \_ -zsh
   2075 pty0     R     21:14  \_ procps f

Similarly, this should list the processes without a terminal, but
comes up empty:

$ procps -t -

I tracked this down to a difference in the tty field of /proc/*/stat
(which is the 7th field). On Linux, processes without a terminal have
value 0 there, and that's what procps expects.

Cygwin 3.3 has -1 instead, whereas on master the bits of the tty field
were rearranged in commit 437d0a8f88, which turns the -1 into
268435455 (i.e. 0xFFFFFFF). Either way, procps treats such processes
as having terminals. (The ? in the TTY column output is generated by a
different code path in procps that uses /proc/*/ctty on Cygwin.)

Patches for the 3.3 branch and master attached.

Regards,
Andy


0001-Cygwin-proc-stat-ctty.patch
Description: Binary data


0001-Cygwin-proc-stat-ctty-3.3.patch
Description: Binary data