[ python-Bugs-1293741 ] doctest runner cannot handle non-ascii characters

SourceForge.net Wed, 09 May 2007 01:19:29 -0700

Bugs item #1293741, was opened at 2005-09-17 14:41
Message generated for change (Comment added) made by akaihola
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1293741&group_id=5470


Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: Extension Modules
Group: Python 2.4
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: GRISEL (ogrisel)
Assigned to: Nobody/Anonymous (nobody)
Summary: doctest runner cannot handle non-ascii characters 

Initial Comment:
The doctest module fails when the expected result
string has non-ascii characters even if the # -*-
coding: XXX -*- line is properly set.

The enclosed code sample produces the following error:

Traceback (most recent call last):
  File "test_iso-8859-15.py", line 41, in ?
    _test()
  File "test_iso-8859-15.py", line 26, in _test
    tried, failed = runner.run(t)
  File "/usr/lib/python2.4/doctest.py", line 1376, in run
    return self.__run(test, compileflags, out)
  File "/usr/lib/python2.4/doctest.py", line 1259, in __run
    if check(example.want, got, self.optionflags):
  File "/usr/lib/python2.4/doctest.py", line 1475, in check_output
    if got == want:
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe9 in position 8: ordinal not in range(128)
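
For anyone trying to reproduce this: the traceback is consistent with
Python 2.4 implicitly decoding a byte string with the default 'ascii'
codec when "got == want" compares a byte string against a unicode string.
The following is only a minimal sketch of that mechanism, written for
Python 3, where the coercion has to be spelled out explicitly. The helper
compare_like_py2 and the sample bytes are hypothetical illustrations and
are not taken from the attached script.

```python
def compare_like_py2(got, want, default_encoding="ascii"):
    """Emulate a bytes-vs-text comparison that decodes the bytes first.

    Hypothetical helper: it mimics the implicit coercion assumed to
    happen inside "got == want", using an explicit decode() call.
    """
    if isinstance(got, bytes) and isinstance(want, str):
        got = got.decode(default_encoding)
    elif isinstance(want, bytes) and isinstance(got, str):
        want = want.decode(default_encoding)
    return got == want


# Sample byte string in the reporter's source encoding (assumed data).
sample = "café".encode("iso-8859-15")  # b'caf\xe9'

try:
    compare_like_py2(sample, "café")
    error = None
except UnicodeDecodeError as exc:
    # The 'ascii' codec cannot decode byte 0xe9, as in the traceback.
    error = exc

# Decoding with the real source encoding makes the comparison succeed.
matches = compare_like_py2(sample, "café", default_encoding="iso-8859-15")
```

If this reading is right, the comparison only works when both sides are
already the same string type, or when the bytes are decoded with the
encoding the source file was actually written in.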



----------------------------------------------------------------------

Comment By: akaihola (akaihola)
Date: 2007-05-09 11:19

Message:
Logged In: YES 
user_id=1432932
Originator: NO

I ran some tests with Python 2.5 on an Ubuntu Edgy system with a UTF-8
terminal. Here's the basic test, which works correctly:

# -*- encoding: utf-8 -*-
__doc__ = u"""
>>> print u'ä'
ä
""" ; import doctest ; doctest.testmod()

If I start to vary the "ä" (a with umlaut) characters in "print u'ä'"
(the test) and the "ä" below it (expected result), I get a
UnicodeEncodeError whenever doctest tries to print a message about
non-matching test output.

Here's a summary of my results in the format of
test | expected result | success/failure
Note that \u00e4 is the Unicode escape for the "ä" character.

ä      | ä      | success
\u00e4 | ä      | success
ä      | \u00e4 | success
\u00e4 | \u00e4 | success
ä      | x      | fails to display message
x      | ä      | fails to display message
\u00e4 | x      | fails to display message
x      | \u00e4 | fails to display message

Conclusion: test running and output checking do work correctly, but
there's a problem displaying messages about non-matching output whenever
either the expected output or the output produced by the test contains any
non-ASCII characters.

The doctest documentation doesn't give any hint on work-arounds.
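
The failure mode above can be sketched independently of doctest. This is
a minimal sketch, assuming the mismatch message ends up being encoded with
an ASCII codec somewhere on its way to the terminal (I have not confirmed
where that happens inside doctest). The helper render_report is
hypothetical and is not a doctest function; the sketch is written for
Python 3.

```python
def render_report(want, got, encoding="ascii", errors="strict"):
    """Build a mismatch message and encode it as an output stream would.

    Hypothetical helper: the message layout only loosely imitates a
    doctest failure report.
    """
    message = "Expected:\n    %s\nGot:\n    %s\n" % (want, got)
    return message.encode(encoding, errors)


try:
    # Expected "x", got "ä": building the message is fine, but
    # encoding it strictly as ASCII is not.
    render_report("x", "\u00e4")
    strict_error = None
except UnicodeEncodeError as exc:
    strict_error = exc

# A lenient error handler escapes the character instead of raising.
escaped = render_report("x", "\u00e4", errors="backslashreplace")
```

Under that assumption, a possible work-around is to make sure the report
is written with an encoding that can represent the characters, or with a
lenient error handler such as "backslashreplace".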

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2006-04-24 04:21

Message:
Logged In: YES 
user_id=31435

Unassigned myself -- don't know enough about encodings.

----------------------------------------------------------------------

Comment By: Bjorn Tillenius (bjoti)
Date: 2006-02-16 13:41

Message:
Logged In: YES 
user_id=1032069

I'm quite sure that you can use non-ASCII characters in 
your doctest, given that it's a unicode string. So if you 
make your docstring a unicode string, it should work. That 
is:

u"""Docstring containing non-ASCII characters.
...
"""


----------------------------------------------------------------------

Comment By: GRISEL (ogrisel)
Date: 2005-09-18 13:25

Message:
Logged In: YES 
user_id=795041

Unfortunately that patch does not fix my problem. The patch
at bug #1080727 fixes the problem for doctests written in
external reST files (testfile and DocFileTest functions). My
problem is related to internal docstring encoding (testmod
for instance). However, Bjorn Tillenius says:
"""
If one writes doctests within documentation strings of classes and
functions, it's possible to use non-ASCII characters since one can
specify the encoding used in the source file.
"""
So according to him, docstrings' doctests with non-ascii
characters should work by default. So maybe my system setup
is somewhat broken. Could somebody please confirm or refute
this by running the attached sample script on their system?

My system config:
[EMAIL PROTECTED] (on linux)
python 2.4.1 with:  sys.getdefaultencoding() == 'ascii' 
and locale.getpreferredencoding() == 'ISO-8859-15'
$ file test_iso-8859-15.py
test_iso-8859-15.py: ISO-8859 English text
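
For anyone comparing setups, the same settings can be collected with a
few lines. This is only a sketch; encoding_report is a hypothetical helper
name, and the stream encoding entry is an extra that was not part of the
configuration listed above.

```python
import locale
import sys


def encoding_report(stream=None):
    """Collect the encoding settings relevant to this report."""
    stream = sys.stdout if stream is None else stream
    return {
        "default_encoding": sys.getdefaultencoding(),
        # Pass False so that the call does not change the process locale.
        "preferred_encoding": locale.getpreferredencoding(False),
        # Streams without an encoding attribute are reported as None.
        "stream_encoding": getattr(stream, "encoding", None),
    }


report = encoding_report()
for name in sorted(report):
    print("%s: %s" % (name, report[name]))
```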


----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2005-09-17 20:42

Message:
Logged In: YES 
user_id=31435

Please try the patch at

http://www.python.org/sf/1080727

and report back on whether it solves your problem (attaching 
comments to the patch report would be most useful).

----------------------------------------------------------------------
