[issue1609] test_re.py fails

2007-12-20 Thread Ismail Donmez
Ismail Donmez added the comment: I tried something like unicode("iii").encode("iso-8859-9").upper(), but it doesn't work. I'll ask on the Python users list. Thanks.

[issue1609] test_re.py fails

2007-12-20 Thread Martin v. Löwis
Martin v. Löwis added the comment:
> print "".encode("iso-8859-9").upper().decode("iso-8859-9")
> does not
Please get your types right. "" is a byte string (in Python 2.x).
encode: unicode -> string
decode: string -> unicode
That you still can apply .encode to the byte string is a bug/p
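The type distinction described above can be sketched as follows. This is only an illustration written for Python 3, where the two types are named str (text) and bytes (8-bit string), not the Python 2.x build discussed in the thread; the sample string and the variable names are assumptions made for the example.

```python
# Python 3 sketch (assumption: the thread itself is about Python 2.x).
text = "i\u0131"                  # 'i' followed by U+0131 LATIN SMALL LETTER DOTLESS I

# encode: text -> bytes
raw = text.encode("iso-8859-9")   # dotless i is byte 0xFD in ISO-8859-9

# decode: bytes -> text
round_trip = raw.decode("iso-8859-9")

# In Python 3, bytes.upper() only changes ASCII letters, so the 0xFD byte
# is left alone no matter which locale is active.
upper_raw = raw.upper()

# Python 3 removed the confusing case: bytes objects have no .encode() method.
bytes_has_encode = hasattr(raw, "encode")

print(repr(raw), repr(upper_raw), bytes_has_encode)
```

Calling .upper() on the decoded text rather than on the bytes is the variant that operates on characters instead of on individual byte values.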

[issue1609] test_re.py fails

2007-12-20 Thread Ismail Donmez
Ismail Donmez added the comment: I guess so, I will no longer spam this bug. Thanks for the suggestions.

[issue1609] test_re.py fails

2007-12-20 Thread Guido van Rossum
Guido van Rossum added the comment:
> Funnily,
>
> print "".encode("iso-8859-9").decode("iso-8859-9").upper()
>
> works, but
>
> print "".encode("iso-8859-9").upper().decode("iso-8859-9")
>
> not.
You'll have to debug this yourself.

[issue1609] test_re.py fails

2007-12-20 Thread Ismail Donmez
Ismail Donmez added the comment: Funnily,
print "".encode("iso-8859-9").decode("iso-8859-9").upper()
works, but
print "".encode("iso-8859-9").upper().decode("iso-8859-9")
does not.

[issue1609] test_re.py fails

2007-12-20 Thread Guido van Rossum
Guido van Rossum added the comment: Two easy ways to get the functionality using 8-bit strings, assuming you've already set your locale properly:
(1) If your data is already an 8-bit string (i.e. isinstance(data, str)), simply use data.upper() or data.lower()
(2) If your data is Unicode (i.e. i

[issue1609] test_re.py fails

2007-12-20 Thread Ismail Donmez
Ismail Donmez added the comment: Hi Martin, Actually, the only remaining question is how I can get wctype functionality with 8-bit strings; any example is appreciated. This bug itself is invalid because --with-wctype-functions is deprecated. But as I said I just hope removing that doesn't regress Turkish f

[issue1609] test_re.py fails

2007-12-19 Thread Martin v. Löwis
Martin v. Löwis added the comment: I think too many issues get mixed in this report. I would like to ignore all but one issue, but I don't understand what the one issue is that this report should deal with. cartman, when you compare Python 2.4 and 2.5, could it be that the 2.4 Python was compile

[issue1609] test_re.py fails

2007-12-19 Thread Guido van Rossum
Guido van Rossum added the comment:
> Ok then what is the suggested way to get back the Turkish way of doing
> upper/lower on i & I ?
That's a question for Martin von Loewis. I suppose you could use 8-bit strings exclusively. Or you could use .translate() with a custom dict.
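A minimal sketch of the .translate() suggestion, written for Python 3 rather than the Python 2.x discussed in the thread. The table names TR_UPPER and TR_LOWER and the helpers tr_upper and tr_lower are hypothetical names introduced for this example; they are not part of any library.

```python
# Hypothetical helpers illustrating .translate() with a custom dict.
# Only the four characters that differ in Turkish are remapped; everything
# else falls through to the default Unicode case mapping.
TR_UPPER = {
    ord("i"): "\u0130",       # i -> LATIN CAPITAL LETTER I WITH DOT ABOVE
    ord("\u0131"): "I",       # LATIN SMALL LETTER DOTLESS I -> I
}
TR_LOWER = {
    ord("I"): "\u0131",       # I -> LATIN SMALL LETTER DOTLESS I
    ord("\u0130"): "i",       # LATIN CAPITAL LETTER I WITH DOT ABOVE -> i
}


def tr_upper(text):
    """Uppercase text using Turkish rules for i and dotless i."""
    return text.translate(TR_UPPER).upper()


def tr_lower(text):
    """Lowercase text using Turkish rules for I and dotted capital I."""
    return text.translate(TR_LOWER).lower()


print(tr_upper("istanbul"))
print(tr_lower("ISPARTA"))
```

The special characters are translated before the generic upper() or lower() call so that the default mapping never sees a plain i or I that should have been treated the Turkish way.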

[issue1609] test_re.py fails

2007-12-19 Thread Ismail Donmez
Ismail Donmez added the comment: OK, then what is the suggested way to get back the Turkish way of doing upper/lower on i & I?

[issue1609] test_re.py fails

2007-12-19 Thread Guido van Rossum
Guido van Rossum added the comment:
> But it should be affected by locale, thats the point of locale.setlocale
> call. This is how libc's wc functions behave.
No, the locale should only affect 8-bit string operations, never unicode operations.
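As an illustration of the locale-independent behaviour described here, the sketch below prints the default Unicode case mappings for the two code points under discussion. It is written for Python 3 and deliberately does not call locale.setlocale(), since a Turkish locale may not be installed on a given machine; the helper name describe is hypothetical.

```python
import unicodedata


def describe(ch):
    """Return the code point and official Unicode name of one character."""
    return "U+%04X %s" % (ord(ch), unicodedata.name(ch))


# Default Unicode case mapping for Unicode strings.
small_i_upper = "\u0069".upper()
capital_i_lower = "\u0049".lower()

print(describe(small_i_upper))
print(describe(capital_i_lower))

# The Turkish-specific targets mentioned in the thread, for comparison.
print(describe("\u0130"))
print(describe("\u0131"))
```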

[issue1609] test_re.py fails

2007-12-19 Thread Ismail Donmez
Ismail Donmez added the comment: But it should be affected by locale; that's the point of the locale.setlocale call. This is how libc's wc functions behave.

[issue1609] test_re.py fails

2007-12-19 Thread Guido van Rossum
Guido van Rossum added the comment:
> print u"\u0069".upper()
>
> should give \u0130 (LATIN CAPITAL LETTER I WITH DOT ABOVE)
>
> print u"\u0049".lower()
>
> should give \u0131 (LATIN SMALL LETTER DOTLESS I)
>
> These transformations work fine with python2.5 when
> --with-wctype-functions is used.

[issue1609] test_re.py fails

2007-12-19 Thread Ismail Donmez
Ismail Donmez added the comment: OK, that was because we had modified the default encoding in Lib/site.py to be utf-8. Sorry! The only problem left is that the last two conversions in test.py give wrong results when wctypes is disabled, that is:
print u"\u0069".upper()
should give \u0130 (LATIN CAPITAL LETT

[issue1609] test_re.py fails

2007-12-19 Thread Guido van Rossum
Guido van Rossum added the comment:
> Replacing Turkish characters with hex versions in test2.py still results
> in UnicodeDecodeError and works with python 2.4.
I'm hoping Martin can confirm this, but I suspect that this is due to a tightening of the rules for converting from 8-bit strings to u

[issue1609] test_re.py fails

2007-12-19 Thread Ismail Donmez
Ismail Donmez added the comment: Replacing Turkish characters with hex versions in test2.py still results in UnicodeDecodeError and works with python 2.4.

[issue1609] test_re.py fails

2007-12-19 Thread Guido van Rossum
Guido van Rossum added the comment: Hm. The test2.py file, when I download it, contains the two bytes "\xc4\xb1" in the first unicode() call, and "\xc4\xb0" in the second one. This is *always* supposed to produce a UnicodeDecodeError, since it would use the default encoding which is ASCII. So
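The decoding behaviour described above can be reproduced with a short sketch. It is written for Python 3, where bytes.decode("ascii") plays the role that unicode() with the default ASCII encoding plays in Python 2.x; the variable names are assumptions made for the example.

```python
# The two byte pairs quoted in the comment above.
data_small = b"\xc4\xb1"
data_capital = b"\xc4\xb0"

# Decoding as ASCII fails, because both bytes are outside the 7-bit range.
try:
    data_small.decode("ascii")
    ascii_decodes = True
except UnicodeDecodeError:
    ascii_decodes = False

# Decoding with an explicit UTF-8 codec yields the two Turkish letters.
decoded_small = data_small.decode("utf-8")      # U+0131, dotless i
decoded_capital = data_capital.decode("utf-8")  # U+0130, dotted capital I

print(ascii_decodes, hex(ord(decoded_small)), hex(ord(decoded_capital)))
```

Naming the codec explicitly avoids any dependence on a site-wide default encoding.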

[issue1609] test_re.py fails

2007-12-19 Thread Ismail Donmez
Ismail Donmez added the comment: So in conclusion,
- Enabling wctypes makes Turkish support work with \u syntax, breaks unicode()
- Disabling wctypes breaks Turkish support with \u and/or unicode()
Attached test.py tests Turkish corner cases of lower()/upper() . Correct output is which python 2

[issue1609] test_re.py fails

2007-12-19 Thread Ismail Donmez
Ismail Donmez added the comment: Test works fine when using the \u syntax. You have to use the unicode() with Turkish characters to get the error. See attached test2.py With python 2.4 : [~]> python test2.py Following should print I I Following should print i i With python 2.5 SVN : [~/pytho

[issue1609] test_re.py fails

2007-12-19 Thread Guido van Rossum
Guido van Rossum added the comment: Martin, can you have a look at this? Cartman, can you produce a unittest for the correct behavior that only uses ASCII input (using \u instead of just typing Turkish characters)?
-- assignee: -> loewis nosy: +loewis

[issue1609] test_re.py fails

2007-12-19 Thread Ismail Donmez
Ismail Donmez added the comment: The situation is even more complicated. The following calls behave _correctly_ when wctypes is enabled:
>>> print unicode("i").upper()
İ
>>> print unicode("").lower()
The following doesn't work even if wctypes is enabled:
>>> print unicode("").up

[issue1609] test_re.py fails

2007-12-19 Thread Ismail Donmez
Ismail Donmez added the comment: Indeed, there seem to be regressions:
Python 2.4 :
[~]> python
Python 2.4.4 (#1, Oct 23 2007, 11:25:50)
[GCC 3.4.6] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import locale
>>> locale.setlocale(locale.LC_ALL,"")
'tr_TR.U

[issue1609] test_re.py fails

2007-12-19 Thread Ismail Donmez
Ismail Donmez added the comment: The Python README says --with-wctype-functions is deprecated and will be removed in Python 2.6, so I don't think it's worth fixing now. Also, test failures with --with-wctype-functions seem to be known, according to Google. What I wonder is if removing --with-wctype-f

[issue1609] test_re.py fails

2007-12-17 Thread Guido van Rossum
Guido van Rossum added the comment: Focus on how using --with-wctype-functions changes things and how this could affect the regex implementation. (I wouldn't be surprised if the other failing tests were due to the regex bugs.)

[issue1609] test_re.py fails

2007-12-14 Thread Ismail Donmez
Ismail Donmez added the comment: Any ideas/comments on how to move forward with this? Thanks, ismail

[issue1609] test_re.py fails

2007-12-13 Thread Ismail Donmez
Ismail Donmez added the comment: Remove test_ucn from the list; it still fails, but that's for another bug report.

[issue1609] test_re.py fails

2007-12-13 Thread Ismail Donmez
Ismail Donmez added the comment: Removing --with-wctype-functions fixes, in total, the following regression tests: test_codecs test_re test_ucn test_unicodedata

[issue1609] test_re.py fails

2007-12-13 Thread Ismail Donmez
Ismail Donmez added the comment:
> Not quite yet, gcc 4.3 had a big inlining bug that was just corrected
> two weeks ago:
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33434
> You may have encountered this bug, or another similar one...
Two weeks ago is too old for me, I am using SVN snapshot fr

[issue1609] test_re.py fails

2007-12-13 Thread Amaury Forgeot d'Arc
Amaury Forgeot d'Arc added the comment:
> > Is GCC 4.3 released yet?
>
> Not yet but soon, its less buggy compared to 4.1 and 4.2
> at the moment.
Not quite yet, gcc 4.3 had a big inlining bug that was just corrected two weeks ago:
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33434
You may have

[issue1609] test_re.py fails

2007-12-13 Thread Ismail Donmez
Ismail Donmez added the comment:
> What system libraries?
libpython2.5.so.1.0 , this is a shared lib build after all.
> Does it make a difference if you don't specify either of
>
> --enable-unicode=ucs4 \
> --with-wctype-functions
Removing --with-wctype-functions fixes the issue.
> Is GCC 4.3

[issue1609] test_re.py fails

2007-12-13 Thread Guido van Rossum
Guido van Rossum added the comment:
> Without LD_LIBRARY_PATH it would use the system libraries and not the
> compiled ones which anyway is not wanted.
What system libraries? Does it make a difference if you don't specify either of
--enable-unicode=ucs4 \
--with-wctype-functions
? Is GCC 4.

[issue1609] test_re.py fails

2007-12-13 Thread Ismail Donmez
Ismail Donmez added the comment: gcc 4.3, Linux 2.6.18, 32bit. Without LD_LIBRARY_PATH it would use the system libraries and not the compiled ones, which is not wanted anyway. Configure line used is (damn I forgot to specify this before, sorry)
--with-fpectl \
--enable-shared \
--enable-ipv6

[issue1609] test_re.py fails

2007-12-13 Thread Guido van Rossum
Guido van Rossum added the comment: Can't reproduce. Like before, what platform, compiler etc.? Does using ./configure --with-pydebug make a difference? What's the LD_LIBRARY_PATH for?
-- nosy: +gvanrossum

[issue1609] test_re.py fails

2007-12-13 Thread Ismail Donmez
Changes by Ismail Donmez:
-- type: -> behavior

[issue1609] test_re.py fails

2007-12-13 Thread Ismail Donmez
New submission from Ismail Donmez: Using python 2.5 revision 59479 from release25-maint branch,
[~/python-2.5]> LD_LIBRARY_PATH=/home/cartman/python-2.5: ./python ./Lib/test/test_re.py
test_anyall (__main__.ReTests) ... ok
test_basic_re_sub (__main__.ReTests) ... ok
test_bigcharset (__main__.Re