On Thu, Oct 1, 2009 at 8:11 AM, gentlestone <tibor.b...@hotmail.com> wrote:
> [snip]
> My question is. Can anybody explain, what does it mean? How should I
> rewrite my doctests in above way? How this piece of code should be?
>
> def slugify(name):
>     u"""
>     >>> slugify(u'Žabovitá zmiešaná kaša s.r.o')
>     u'zabovita-zmiesana-kasa-sro'
>     """
>     for key, value in _MAP.iteritems():
>         name = name.replace(key, value)
>     return defaultfilters.slugify(name)
>

The doctest runner has an open problem with unicode literal docstrings: if they contain non-ASCII characters, attempting to output a failure message runs into trouble. So instead of getting a message saying this was expected but that was received, you get a message saying that the AssertionError object is unprintable. So you know the test failed, but you have no idea why.

This problem is logged in the Python issue tracker:

http://bugs.python.org/issue1293741

There's a patch on that issue that fixes the problem, at least for environments where stdout has an encoding that is capable of representing the characters that need to be output.
With the last patch attached to that Python bug applied to Django's copy of _doctest.py, your test above (modified to report a failure by changing the expected output) successfully reports the failure on my Linux box:

----------------------------------------------------------------------
File "/home/kmt/software/web/playground/ttt/models.py", line 133, in ttt.models.slugify
Failed example:
    slugify(u'Žabovitá zmiešaná kaša s.r.o')
Expected:
    u'!!!zabovita-zmiesana-kasa-sro'
Got:
    u'zabovita-zmiesana-kasa-sro'
----------------------------------------------------------------------

However, using exactly the same code on a Windows box you still can't see the failure reported properly, because the Windows box uses a different stdout encoding that is unable to represent the characters in the unicode literal docstring:

======================================================================
ERROR: Doctest: ttt.models.slugify
----------------------------------------------------------------------
Traceback (most recent call last):
  File "d:\u\kmt\django\trunk\django\test\_doctest.py", line 2187, in runTest
    clear_globs=False)
  File "d:\u\kmt\django\trunk\django\test\_doctest.py", line 1409, in run
    return self.__run(test, compileflags, out)
  File "d:\u\kmt\django\trunk\django\test\_doctest.py", line 1316, in __run
    self.report_failure(out, test, example, got)
  File "d:\u\kmt\django\trunk\django\test\_doctest.py", line 1184, in report_failure
    self._checker.output_difference(example, got, self.optionflags))
  File "d:\u\kmt\django\trunk\django\test\_doctest.py", line 2186, in <lambda>
    test, out=lambda x: new.write(x.encode(output_encoding)),
  File "d:\bin\Python2.5.2\lib\encodings\cp437.py", line 12, in encode
    return codecs.charmap_encode(input,errors,encoding_map)
UnicodeEncodeError: 'charmap' codec can't encode character u'\u017d' in position 182: character maps to <undefined>

So, that fix doesn't make it possible to write doctests that will correctly report failures cross-platform.
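
The encoding failure in the traceback above can be reproduced outside doctest. The following is only a minimal sketch, written for a current Python 3 interpreter rather than the Python 2.5 used in this thread; `try_encode` and `message` are hypothetical names introduced for illustration and are not part of doctest or Django.

```python
# Sketch: why the failure report prints on a UTF-8 terminal but not on cp437.
# try_encode is a hypothetical helper, not a doctest or Django function.
message = u"Failed example: slugify(u'Žabovitá zmiešaná kaša s.r.o')"


def try_encode(text, encoding):
    """Return (True, encoded bytes) on success, else (False, error reason)."""
    try:
        return True, text.encode(encoding)
    except UnicodeEncodeError as exc:
        return False, exc.reason


utf8_ok, _ = try_encode(message, "utf-8")
cp437_ok, cp437_reason = try_encode(message, "cp437")

# An error handler such as "backslashreplace" keeps the text printable by
# escaping whatever the target encoding cannot represent.
fallback = message.encode("cp437", "backslashreplace")

print(utf8_ok)
print(cp437_ok, cp437_reason)
print(fallback)
```

The cp437 attempt fails on the first character, U+017D, which is the same character the Windows traceback complains about. The escaping fallback is shown only to illustrate the cause; it is not what the patch on the Python issue does.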
(It also possibly causes other problems that I've seen while experimenting with it, but I don't have time to track them down at the moment. I'm unconvinced that fix alone will cure all problems with docstrings and non-ASCII chars.)

What you can do to avoid the problem is not use unicode literal docstrings. So get rid of the u in front of the docstring literal for slugify. Then doctest won't run into trouble attempting to auto-convert from unicode to bytestring for output. But then you'll have another problem, because the embedded unicode literal in the docstring won't be built using the proper encoding, causing a failure on what should be success:

----------------------------------------------------------------------
File "/home/kmt/software/web/playground/ttt/models.py", line 133, in ttt.models.slugify
Failed example:
    slugify(u'Žabovitá zmiešaná kaša s.r.o')
Expected:
    u'zabovita-zmiesana-kasa-sro'
Got:
    u'a12abovita-zmieaana-kaaa-sro'

To fix that, remove the dependence on an embedded unicode literal in the docstring. That is, create a unicode object by explicitly decoding a bytestring using the proper codec:

    """
    >>> slugify('Žabovitá zmiešaná kaša s.r.o'.decode('utf-8'))
    u'zabovita-zmiesana-kasa-sro'
    """

A bit ugly, but that version will pass when it is supposed to, and will be able to report a descriptive failure message across different platforms.

Karen
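
For reference, the wrong result in the second failure above (u'a12abovita-zmieaana-kaaa-sro') is consistent with the UTF-8 bytes of the name being interpreted as Latin-1 before slugifying. That interpretation is an assumption based on the output, not something confirmed in this thread. The sketch below is written for a current Python 3 interpreter; `simple_slugify` is a hypothetical stand-in that only approximates Django's slugify filter, and it leaves out the _MAP replacement step from the original function.

```python
import re
import unicodedata


def simple_slugify(value):
    """Hypothetical approximation of a slugify filter, for illustration only."""
    # Decompose accented characters, then drop anything that is not ASCII.
    ascii_text = (
        unicodedata.normalize("NFKD", value)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    # Remove punctuation, lowercase, and collapse whitespace into hyphens.
    cleaned = re.sub(r"[^\w\s-]", "", ascii_text).strip().lower()
    return re.sub(r"[-\s]+", "-", cleaned)


name = u"Žabovitá zmiešaná kaša s.r.o"
utf8_bytes = name.encode("utf-8")

right = utf8_bytes.decode("utf-8")    # decoded with the matching codec
wrong = utf8_bytes.decode("latin-1")  # assumed mis-decoding, one char per byte

print(simple_slugify(right))  # zabovita-zmiesana-kasa-sro
print(simple_slugify(wrong))  # a12abovita-zmieaana-kaaa-sro
```

Under that assumption, the "12" comes from the second byte of Ž (0xBD), which reads as ½ in Latin-1 and decomposes to "1" and "2" under NFKD normalization. Decoding explicitly with the proper codec avoids the ambiguity.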