Control: tags -1 pending

On 2014-02-12 08:56:03, Felix Dreissig wrote:
> Package: monkeysign
> Version: 2.x
> Severity: normal
>
> I wanted to build the manpage only for Monkeysign’s CLI version, so I removed 
> `monkeyscan:monkeysign.gtkui:MonkeysignScanUi.parser` from 'setup.cfg' and 
> ran `setup.py build_manpage`.
> That failed with:
>
>> UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 55: ordinal not in range(128)
>
> An encoding problem didn’t make any sense to me, so I tried to track the 
> issue down. Turns out it doesn’t occur when PyGTK is imported into the build 
> process, either directly through 'gtkui.py' or via 'msg_exception.py'.
> The explanation for this behaviour is that PyGTK sets Python’s default 
> encoding to UTF-8. This is GNOME bug 132040 from back in 2004: 
> https://bugzilla.gnome.org/show_bug.cgi?id=132040
>
> So what exactly causes the above error?
> It is the accent in your surname, anarcat, that causes manpage writing to 
> fail with ASCII encoding ;-). The best way to fix this would, in my opinion, 
> be to use a unicode string for `author` in 'setup.py', but Distutils seems 
> not to respect that.
damn french. ;)

i agree that author should be unicode, no idea why distutils is
dropping that on the floor. oh well.
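
for reference, the failure can be reproduced outside the build. this is
only a minimal sketch: it is written so that it also runs on python 3,
where the ASCII decode has to be spelled out (on python 2 it happens
implicitly as soon as a byte string is mixed with a unicode one). the
name and the `try_ascii` helper are made-up placeholders, not anything
from monkeysign:

```python
# Hypothetical author field as UTF-8 bytes; the name is a placeholder.
author_bytes = u'Ren\u00e9 Example'.encode('utf-8')


def try_ascii(raw):
    """Decode raw bytes as ASCII, returning the error instead of raising."""
    try:
        return raw.decode('ascii')
    except UnicodeDecodeError as exc:
        return exc


result = try_ascii(author_bytes)
# The accented letter is encoded as two bytes, the first being 0xc3,
# which is the byte the ASCII codec refuses to decode.
print(result)
```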

> I used the following patch, which works:
>
>> --- a/monkeysign/documentation.py
>> +++ b/monkeysign/documentation.py
>> @@ -84,7 +84,7 @@ class build_manpage(Command):
>>      def _write_footer(self, parser):
>>          ret = []
>>          appname = self.distribution.get_name()
>> -        author = '%s <%s>' % (self.distribution.get_author(),
>> +        author = '%s <%s>' % (self.distribution.get_author().decode('utf-8'),
>>                                self.distribution.get_author_email())
>>          ret.append(('.SH AUTHORS\n.B %s\nwas written by %s.\n'
>>                      % (self._markup(appname), self._markup(author))))
>> @@ -109,7 +109,7 @@ class build_manpage(Command):
>>              path = os.path.join(self.output, parser.prog + '.1')
>>              self.announce('writing man page to %s' % path, 2)
>>              stream = open(path, 'w')
>> -            stream.write(''.join(manpage))
>> +            stream.write(''.join(manpage).encode('utf-8'))
>>              stream.close()

I used a slight variation: i decode in the ret.append() call, so that the
email can contain accents too. that may be illegal, but I don't care:
i'm not here to enforce standards, i just want to avoid crashes at
build time. :)
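
roughly, the idea looks like the sketch below. this is not the code that
went into documentation.py: `to_text` and `footer_author` are
hypothetical helpers, and the name and address are placeholders. it
assumes both fields are UTF-8 when they arrive as byte strings.

```python
def to_text(value, encoding='utf-8'):
    """Return value as text, decoding it first if it is a byte string."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


def footer_author(name, email):
    # Decode both fields, so an accent in either one cannot trigger an
    # implicit ASCII decode when the string is formatted.
    return u'%s <%s>' % (to_text(name), to_text(email))


# Placeholder values standing in for get_author() / get_author_email().
line = footer_author(u'Ren\u00e9 Example'.encode('utf-8'), b'rene@example.org')

# Encode explicitly before writing, as the second hunk of the patch does.
data = line.encode('utf-8')
```

decoding everything on the way in and encoding once on the way out keeps
the formatting step working on text only, which is what avoids the
crash.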

> It might, however, not be the most comprehensive way to deal with the issue: 
> The whole process of generating manpages uses a mixture of ordinary and 
> unicode strings and might need some review with respect to encoding issues.

true. this was messy in the first place, although I am not sure i want
to pursue this much further. :P

thanks for all the patches and help!

a.

-- 
That's the kind of society I want to build. I want a guarantee - with
physics and mathematics, not with laws - that we can give ourselves
real privacy of personal communications.
                         - John Gilmore
