* David Pashley [2006-06-30 18:47:41+0100]
> I wonder. I know there was a bug involving tr_TR locale regarding
> uppercasing/lowercasing of characters. 

Yep, I strongly suspect that the same upper/lowercasing + strcasecmp issues
cause this bug[1].

    [EMAIL PROTECTED]:~/tmp/irssi/irssi-0.8.10$ grep -R g_strcasecmp * | wc -l
    186

While I work on this issue, here are some test programs showing how
glib's g_strcasecmp behaves on the problematic i/ı/İ/I input strings,
for your convenience.

[1] For more info: http://www.i18nguy.com/unicode/turkish-i18n.html
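
For keywords that are plain ASCII, one way around the tr_TR mapping
(i <-> İ, ı <-> I) is a comparison that never consults the locale. The
following is only a minimal sketch in plain C, assuming NUL-terminated
byte strings; ascii_lower and ascii_casecmp are hypothetical names used
for illustration, not glib or irssi functions. It folds only the bytes
A-Z and compares everything else byte for byte.

```c
#include <assert.h>
#include <stddef.h>

/* Lowercase only the ASCII letters A-Z; every other byte passes through. */
static int ascii_lower(unsigned char c)
{
    return (c >= 'A' && c <= 'Z') ? c + ('a' - 'A') : c;
}

/* Hypothetical helper: ASCII-only, locale-independent case-insensitive
 * comparison. Returns 0 when equal, otherwise the difference of the
 * first pair of bytes that differ after folding. */
static int ascii_casecmp(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);
    while (*a && *b) {
        int ca = ascii_lower((unsigned char)*a);
        int cb = ascii_lower((unsigned char)*b);
        if (ca != cb)
            return ca - cb;
        a++;
        b++;
    }
    /* At least one string has ended; compare the remaining byte with NUL. */
    return (unsigned char)*a - (unsigned char)*b;
}
```

Because the result does not depend on setlocale(), "ascii" and "ASCII"
compare equal under any locale, while strings containing ı or İ are
treated as different from their ASCII look-alikes.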

-- 
roktas

Attachment: test-turkish.sh
Description: Bourne shell script

#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <strings.h>	/* strcasecmp */
#include <locale.h>
#include <glib.h>

/* Print both inputs, the text of the call, and its result as an int. */
#define DUMP(s1, s2, call) \
    (g_print("s1: %s\ts2: %s\t%s ==> %d\n",  (s1), (s2), #call, (int) (call)))

int
main (int argc, char **argv)
{
	gchar *s1, *s2;

	/* Check the argument count before touching argv[1] and argv[2]. */
	if (argc < 3) {
		fprintf (stderr, "Usage: %s <string1> <string2>\n", argv[0]);
		exit (EXIT_FAILURE);
	}
	s1 = argv[1];
	s2 = argv[2];

	/* Take the locale from the environment, e.g. LANG=tr_TR.UTF-8. */
	setlocale (LC_ALL, "");
	g_print ("Locale: %s\n", setlocale (LC_ALL, NULL));

	DUMP (s1, s2, strlen (s1));
	DUMP (s1, s2, g_utf8_strlen (s1, -1));
	DUMP (s1, s2, strcasecmp (s1, s2));
	DUMP (s1, s2, g_strcasecmp (s1, s2));
	DUMP (s1, s2, g_ascii_strcasecmp (s1, s2));
	/* -1 is the documented length for NUL-terminated strings. The
	 * converted copies are never freed, which is harmless in a
	 * one-shot test. */
	DUMP (s1, s2,
	    g_strcasecmp (g_utf8_strup (s1, 128), g_utf8_strup (s2, 128)));
	DUMP (s1, s2,
	    g_strcasecmp (g_utf8_strdown (s1, 128), g_utf8_strdown (s2, 128)));

	exit (EXIT_SUCCESS);
}
Locale: tr_TR.UTF-8
s1: ascii       s2: ASCII       strlen (s1) ==> 5
s1: ascii       s2: ASCII       g_utf8_strlen (s1, -1) ==> 5
s1: ascii       s2: ASCII       strcasecmp (s1, s2) ==> 32
s1: ascii       s2: ASCII       g_strcasecmp (s1, s2) ==> 32
s1: ascii       s2: ASCII       g_ascii_strcasecmp (s1, s2) ==> 0
s1: ascii       s2: ASCII       g_strcasecmp (g_utf8_strup (s1, 128), g_utf8_strup (s2, 128)) ==> 123
s1: ascii       s2: ASCII       g_strcasecmp (g_utf8_strdown (s1, 128), g_utf8_strdown (s2, 128)) ==> -91

Locale: tr_TR.UTF-8
s1: ascii       s2: ASCİİ       strlen (s1) ==> 5
s1: ascii       s2: ASCİİ       g_utf8_strlen (s1, -1) ==> 5
s1: ascii       s2: ASCİİ       strcasecmp (s1, s2) ==> -91
s1: ascii       s2: ASCİİ       g_strcasecmp (s1, s2) ==> -91
s1: ascii       s2: ASCİİ       g_ascii_strcasecmp (s1, s2) ==> -91
s1: ascii       s2: ASCİİ       g_strcasecmp (g_utf8_strup (s1, 128), g_utf8_strup (s2, 128)) ==> 0
s1: ascii       s2: ASCİİ       g_strcasecmp (g_utf8_strdown (s1, 128), g_utf8_strdown (s2, 128)) ==> -99

Locale: tr_TR.UTF-8
s1: ascıı       s2: ASCII       strlen (s1) ==> 7
s1: ascıı       s2: ASCII       g_utf8_strlen (s1, -1) ==> 5
s1: ascıı       s2: ASCII       strcasecmp (s1, s2) ==> 123
s1: ascıı       s2: ASCII       g_strcasecmp (s1, s2) ==> 123
s1: ascıı       s2: ASCII       g_ascii_strcasecmp (s1, s2) ==> 91
s1: ascıı       s2: ASCII       g_strcasecmp (g_utf8_strup (s1, 128), g_utf8_strup (s2, 128)) ==> 0
s1: ascıı       s2: ASCII       g_strcasecmp (g_utf8_strdown (s1, 128), g_utf8_strdown (s2, 128)) ==> 0

Locale: tr_TR.UTF-8
s1: ascıı       s2: ASCİİ       strlen (s1) ==> 7
s1: ascıı       s2: ASCİİ       g_utf8_strlen (s1, -1) ==> 5
s1: ascıı       s2: ASCİİ       strcasecmp (s1, s2) ==> 1
s1: ascıı       s2: ASCİİ       g_strcasecmp (s1, s2) ==> 1
s1: ascıı       s2: ASCİİ       g_ascii_strcasecmp (s1, s2) ==> 1
s1: ascıı       s2: ASCİİ       g_strcasecmp (g_utf8_strup (s1, 128), g_utf8_strup (s2, 128)) ==> -123
s1: ascıı       s2: ASCİİ       g_strcasecmp (g_utf8_strdown (s1, 128), g_utf8_strdown (s2, 128)) ==> 91
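
In the runs above, strlen reports 7 for "ascıı" while g_utf8_strlen
reports 5, because each ı occupies two bytes in UTF-8. The sketch below
is only an illustration of that difference in plain C; utf8_count is a
hypothetical helper, not glib's implementation, and it assumes valid,
NUL-terminated UTF-8 input. It counts code points by skipping
continuation bytes (those of the form 10xxxxxx).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: count the code points in a valid, NUL-terminated
 * UTF-8 string. Every byte that is not a continuation byte (10xxxxxx)
 * starts a new code point. */
static size_t utf8_count(const char *s)
{
    size_t n = 0;

    assert(s != NULL);
    for (; *s; s++) {
        if (((unsigned char)*s & 0xC0) != 0x80)
            n++;
    }
    return n;
}
```

Byte-wise comparison functions see the two bytes of ı or İ separately,
which is why their return values on such input are differences between
raw bytes rather than between characters.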

