bug#21883: unnecessary bit shifting range limits

2018-10-15 Thread Stefan Israelsson Tampe
I think you got it. Sorry for the fuss. On 15 Oct 2018 at 12:19 AM, "Mark H Weaver" wrote: > Stefan Israelsson Tampe writes: > > How would this slow down the code? Just add the correction where you > > throw the exception, which should be in a branch outside the hot path. > > If you have a suggest

bug#33044: Invalid read access of chars of wide string in scm_seed_to_random_state

2018-10-15 Thread Tom de Vries
Hi, Consider min.c:
...
#include <locale.h>
#include <stdio.h>
#include "libguile.h"

static void *
foo (void *data)
{
  return NULL;
}

int
main (void)
{
  const char *msg = setlocale (LC_CTYPE, "ja_JP.sjis");
  printf ("msg: %s\n", msg);
  scm_with_guile (foo, NULL);
  return 0;
}
...
Compiled with guile-2.2.4:
...
$ gcc m

bug#33044: Reproduced using guile binary

2018-10-15 Thread Tom de Vries
Hi, Using a simple Scheme hello world:
...
$ cat hello.scm
(display "hello world")
(newline)
...
we're able to reproduce the problem using the guile binary:
$ LC_CTYPE=ja_JP.sjis /home/vries/guile/2.2/install/bin/guile -s hello.scm
Segmentation fault (core dumped)
...
[ Note: When using 2.0,

bug#33044: Analysis and proposed patch

2018-10-15 Thread Tom de Vries
Hi, I think there are two independent problems here. --- 1. scm_seed_to_random_state should be able to handle the case that the seed argument is a non-narrow string. 2. The *random-state* variable is documented like this: ... Note that the initial value of *random-state* is the same every

bug#33053: scm_i_mirror_backslashes assumes ASCII-compatible locale encoding

2018-10-15 Thread Mark H Weaver
The 'scm_i_mirror_backslashes' function in load.c operates on C strings in the locale encoding, and assumes that the locale encoding is ASCII-compatible. In the Shift_JIS encoding, used in the "ja_JP.sjis" locale, backslash '\' is mapped to a multibyte character, and the Yen sign '¥' is represented using c
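To illustrate the kind of assumption described here, the following is a minimal sketch in C. It is not Guile's actual 'scm_i_mirror_backslashes' code, and all three function names are hypothetical. The first routine replaces every 0x5C byte with '/', as a byte-wise routine written for ASCII would. The second skips over two-byte Shift_JIS characters, whose second byte may itself be 0x5C (for example katakana ソ, encoded as 0x83 0x5C). Even the second routine still treats a lone 0x5C byte as a path separator, which may be the wrong reading in a locale where that byte denotes the Yen sign.

```c
/* Hypothetical byte-wise routine: assumes every 0x5C byte is a backslash. */
static void
naive_mirror_backslashes (char *path)
{
  for (char *p = path; *p != '\0'; p++)
    if (*p == 0x5C)
      *p = '/';
}

/* True if C is the lead byte of a two-byte Shift_JIS character. */
static int
sjis_is_lead_byte (unsigned char c)
{
  return (c >= 0x81 && c <= 0x9F) || (c >= 0xE0 && c <= 0xFC);
}

/* Hypothetical Shift_JIS-aware variant: leaves the trail byte of a
   two-byte character alone, even when that trail byte is 0x5C.  */
static void
sjis_mirror_backslashes (char *path)
{
  unsigned char *p = (unsigned char *) path;

  while (*p != '\0')
    {
      if (sjis_is_lead_byte (*p) && p[1] != '\0')
        {
          p += 2;               /* skip lead byte and trail byte together */
          continue;
        }
      if (*p == 0x5C)           /* single-byte 0x5C only */
        *p = '/';
      p++;
    }
}
```

Given the bytes 'd', 0x5C, 0x83, 0x5C, the naive routine rewrites both 0x5C bytes and so corrupts the two-byte character, while the lead-byte-aware routine rewrites only the first one.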

bug#33053: scm_i_mirror_backslashes assumes ASCII-compatible locale encoding

2018-10-15 Thread Mark H Weaver
Mark H Weaver writes: > The 'scm_i_mirror_backslashes' in load.c operates on C strings in the > locale encoding, and assumes that the locale encoding is ASCII > compatible. In the Shift_JIS encoding, used in the "ja_JP.sjis" locale, > backslash '\' is mapped to a multibyte character, and the Yen

bug#33057: Guile's reader assumes the port encoding is ASCII-compatible

2018-10-15 Thread Mark H Weaver
In several places, Guile's reader assumes that the port encoding is ASCII-compatible. For example: * 'scm_token' reads raw bytes and passes them to the CHAR_IS_DELIMITER macro to check for delimiters. * 'scm_read_mixed_case_symbol' checks for the (optional) postfix keyword syntax by comparin

bug#33044: Guile misbehaves in the "ja_JP.sjis" locale

2018-10-15 Thread Mark H Weaver
retitle 33044 Guile misbehaves in the "ja_JP.sjis" locale
thanks

Hi Tom, Thanks for the report, analysis and patch. I agree with your analysis, and the patch looks good. However, there's also a much deeper problem here. You found and fixed one occurrence of Guile assuming that the locale encod

bug#33044: Guile misbehaves in the "ja_JP.sjis" locale

2018-10-15 Thread Mark H Weaver
Mark H Weaver writes: > Shift_JIS is _mostly_ ASCII-compatible, except that code points 0x5C and > 0x7E, which represent backslash (\) and tilde (~) in ASCII, are mapped > to the Yen sign (¥) and overline (‾) in Shift_JIS. Backslash (\) and > tilde (~) are multibyte characters in Shift_JIS. Alt
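The two exceptions named in the quoted text can be written down as a small lookup. This is a minimal sketch in C, assuming the mapping exactly as the message describes it (0x5C to U+00A5 YEN SIGN, 0x7E to U+203E OVERLINE) and covering only single bytes below 0x80. Both function names are hypothetical and are not part of Guile or of any iconv implementation.

```c
#include <stdint.h>

/* Hypothetical helper: map a single-byte Shift_JIS code below 0x80 to a
   Unicode code point, following the mapping described in the message.
   Callers are expected to pass only bytes below 0x80.  */
static uint32_t
sjis_low_byte_to_unicode (unsigned char byte)
{
  switch (byte)
    {
    case 0x5C: return 0x00A5;   /* YEN SIGN, not backslash */
    case 0x7E: return 0x203E;   /* OVERLINE, not tilde */
    default:   return byte;     /* otherwise identical to ASCII */
    }
}

/* True if BYTE denotes the same character in ASCII and in Shift_JIS. */
static int
sjis_byte_matches_ascii (unsigned char byte)
{
  return byte < 0x80 && sjis_low_byte_to_unicode (byte) == byte;
}
```

Code that compares raw bytes against '\\' or '~' is only safe for bytes where the second function returns true.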